Date: Friday, March 19, 2010, 11:30am - 12:30pm, SB 111
Title: Spatio-Temporal Data Mining: Discovering Flow Anomalies
Speaker: James M. Kang
Dept. of Computer Science
University of Minnesota
Spatio-temporal data mining (STDM) is concerned with modeling and mining interesting, useful, and non-trivial patterns from massive geographic datasets. STDM is becoming increasingly important in societal application areas such as transportation engineering, climate-change science, and water quality control.
This talk explores the spatio-temporal pattern family of anomalies, specifically the problem of detecting anomalies in flow networks. The flow anomaly problem can be defined as follows: given a pair of sensors on a flow network, where each sensor records a continuous series of measurements of some variable, flow anomaly discovery aims to identify periods of time in which the two series are significantly mismatched with respect to some threshold. Applications of flow anomaly discovery include localizing contaminant sources in river networks and traffic congestion sources in road networks. However, identifying flow anomalies is computationally hard because a single flow anomaly pattern may contain subsets that exhibit normal behavior (e.g., gaps within an oil spill). This violates optimal substructure, one of the main properties that dynamic programming requires. Thus, we have proposed a novel SWEET (Smart Window Enumeration and Evaluation of persistent-Thresholds) approach that exploits ideas such as the algebraic nature of a statistical interest measure. Analytical and experimental analyses on both synthetic and real datasets show that the proposed approach outperforms naïve alternatives.
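The problem definition above can be illustrated with a minimal Python sketch of a naive window-enumeration baseline. This is not the SWEET approach, whose details the abstract does not give. The sketch assumes the two sensor series are already aligned for travel time between the sensors, that a time step counts as mismatched when the readings differ by more than a hypothetical threshold `delta`, and that a window is anomalous when the fraction of mismatched steps is at least a hypothetical `persistence` ratio. The function name, parameters, and sample data are all illustrative.

```python
from itertools import accumulate


def naive_flow_anomalies(upstream, downstream, delta, persistence, min_len=2):
    """Enumerate every time window and keep those that look anomalous.

    Assumptions (not taken from the talk): readings are aligned per time
    step, a step is mismatched when |upstream - downstream| > delta, and a
    window qualifies when mismatched steps / window length >= persistence.
    """
    if len(upstream) != len(downstream):
        raise ValueError("sensor series must have the same length")

    # 1 marks a mismatched time step, 0 a normal one.
    mismatched = [1 if abs(u - d) > delta else 0
                  for u, d in zip(upstream, downstream)]

    # Prefix sums give the mismatch count of any window in O(1).
    prefix = [0] + list(accumulate(mismatched))

    n = len(mismatched)
    windows = []
    for start in range(n):
        for end in range(start + min_len - 1, n):
            length = end - start + 1
            count = prefix[end + 1] - prefix[start]
            if count / length >= persistence:
                windows.append((start, end))
    return windows


# Made-up readings: the downstream sensor deviates at steps 1, 3 and 4.
upstream = [5, 5, 5, 5, 5, 5]
downstream = [5, 9, 5, 9, 9, 5]
result = naive_flow_anomalies(upstream, downstream,
                              delta=2, persistence=0.75, min_len=3)
print(result)  # [(1, 4)]
```

In this made-up example the only qualifying window, steps 1 to 4, contains a normal reading at step 2, and its sub-window from step 1 to step 3 does not qualify on its own. That is the kind of gap that prevents building an anomalous window from anomalous sub-windows, and the baseline pays for it by checking all O(n^2) windows.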
Biography: James M. Kang's research interests are in the areas of spatio-temporal data mining and databases, with interdisciplinary applications in environmental sciences. In spatio-temporal databases, James has explored continuous reverse nearest neighbor queries and spatio-temporal sensor graphs. In spatio-temporal data mining, he has investigated flow anomalies (abstract above), teleconnected patterns, multi-scale multi-granular classification, and zonal co-locations. His novel research on flow anomalies and teleconnections grew out of interdisciplinary collaborations with environmental scientists at the University of Minnesota. In addition, James has helped organize the Minnesota Futures Symposium on Geoinformatics and has been an external reviewer for numerous conferences (e.g., IEEE ICDM, IEEE ICDE, ACM SIGKDD) and journals (e.g., IEEE TKDE, GeoInformatica).
James is currently a Ph.D. candidate in Computer Science at the University of Minnesota. He served as a visiting scientist at a USDOD research facility and as a B2B lead at Eastman Kodak. James received his B.S. degree in Computer Science from Purdue University and his M.S. degree in Computer Science from the Rochester Institute of Technology.