$20 Million Grant Drives IDEAL Research in Data Science



By Casey Moffitt

A Chicago-based research coalition that includes researchers from Illinois Institute of Technology has been awarded a share of a five-year, $20 million grant from the National Science Foundation to accelerate innovations in data science. 

The NSF announced the award winners of its Transdisciplinary Research in Principles of Data Science (TRIPODS) Phase II program, which brings together scientists and engineers from different research communities to further the theoretical foundations of data science through integrated research and training activities. Jinqiao “Jeffrey” Duan, professor of applied mathematics at Illinois Tech, and Binghui Wang, assistant professor of computer science at Illinois Tech, are part of the coalition that received a TRIPODS grant.

TRIPODS is one initiative of NSF’s Harnessing the Data Revolution Big Idea, which is designed to stimulate discovery and innovation in data science algorithms, data infrastructure, and education and workforce development. 

“This institute provides a platform for conducting research and training in the mathematical foundation of data science, with expertise, motivation, and inspiration from people of diverse backgrounds,” Duan says. 

Duan and Wang will conduct data science research with the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL), a consortium of more than 50 Chicago-area researchers from Northwestern University, the University of Chicago, the University of Illinois Chicago, and Toyota Technological Institute. IDEAL focuses its research on key aspects of data science foundations across computer science, electrical engineering, mathematics, and statistics in fields such as economics, operations research, and law. 

IDEAL researchers promise the work conducted with the TRIPODS funding will lead to new theoretical frameworks, models, mathematical tools, and algorithms for analyzing high-dimensional data, inference, and learning. The goal is to gain a better understanding of the foundations of data science and machine learning in emerging concerns such as reliability, fairness, privacy, and interpretability as data science interacts with society. 

Duan says his work will revolve around data-driven prediction of stochastic dynamical systems. 

“We may also like to add some relevant faculty members in this institute,” he says. “Moreover, all faculty members and students are welcome to participate in the activities of this data science institute.” 

IDEAL’s proposal also includes a strong public outreach component designed to impact research and educational infrastructures and to engage a diverse population from underrepresented communities in data science. This includes public lectures and exhibits through a partnership with the Museum of Science and Industry, as well as workshops with local high school teachers through a partnership with Math Circles of Chicago. Direct workshops with undergraduate and high school students are also part of the strategy.

“What we develop in this institute will be beneficial to public users of data,” Duan says. “For this reason, the grant also has an educational component—to provide training in data science for undergraduate and graduate students. In particular, a graduate student in one university can attend relevant data science courses (to be developed within this grant) at the other four universities and institutes.” 

The institute also plans direct engagement with industry through activities involving Google, other industry partners in the broader Chicago area, and applied data science institutes. 

“The NSF TRIPODS institutes will bring advances in data science theory that improve health care, manufacturing, and many other applications and industries that use data for decision making,” says Shekhar Bhansali, NSF division director for electrical, communications, and cyber systems. 

Phase II of the TRIPODS program continues to support the development of collaborative institutes to delve into foundational issues in data science, such as designing algorithms to analyze large, complex, noisy, and changing data sets. Some of these sets include historical biases and elements influenced by self-interested and possibly malicious parties, which creates a need for fair, ethical, and understandable results from complex data-driven decision-making processes. 

“The new 2022 TRIPODS awards address foundational challenges in data science at the core of data-driven discovery and decision making,” says Dilma Da Silva, NSF division director for Computing and Communication Foundations (CCF). “CCF is pleased to be able to support these impactful projects.”