Other Event

CSE Research Seminar: Large-scale mining: Graphs, streams, and beyond

Spyridon Papadimitriou
IBM Research Staff Member
IBM

Dr. Papadimitriou holds a PhD from Carnegie Mellon University and is currently a Research Staff Member at the IBM T.J. Watson Research Center in Hawthorne, NY.

Data are becoming available in unprecedented volumes. This difference in
scale is a difference in kind, presenting new opportunities
and challenges. For the most part, this talk focuses on graph mining.
We first present a method that can discover which groups or
communities of nodes are associated with each other, as well as the number
of these groups. Next, we show how to detect changes in the community
structure as the graph evolves over time. Our method scales
to large graphs and is suitable for incremental estimation in a streaming
setting. We demonstrate its effectiveness in a wide range of application
domains, including social networks and network traffic analysis, where it
produces meaningful patterns that agree with human intuition.
Next, the talk switches gears and presents our experiences and the important
lessons learned while adapting and deploying this method on a large-scale
distributed processing infrastructure (Hadoop's HDFS and MapReduce), with
attention to the entire end-to-end mining process. Finally, we conclude
with some thoughts about general future directions in data mining.

Spiros Papadimitriou is a research staff member at IBM T.J. Watson. His
main interests are data mining for graphs and streaming data, clustering,
time series, and systems for large-scale data processing. He has published
more than thirty papers on these topics in refereed conferences and
journals. He has three invited journal publications in best paper issues and
several book chapters, and he has filed multiple patents. He was a Siebel
scholarship recipient in 2005 and received the best paper award at SDM 2008.
He has also been invited to give keynote talks on graph and social network
analysis (WAAMD 2008 and ADN 2009) and tutorials on time series stream
mining (University of Maine Summer School, 2008). He obtained his BSc in
Computer Science from the University of Crete, Heraclion, and his MSc and
PhD degrees from Carnegie Mellon University.

Sponsored by

EECS - CSE Division