Each poster speaker can deliver a 2 min highlight.
In addition, a 10 min video is expected to be uploaded to the agenda by Oct 14.
Data-driven exploration has revolutionized science and led to the establishment of Data Science as a new discipline that integrates approaches from computer science -- including data management, visualization, machine learning -- statistics, applied mathematics, and many application domains. I will give my perspective of how the field emerged and evolved over the past decade, and the virtuous...
Chair: Prof. Adam Smith
- Prof. Eric Toberer, Colorado School of Mines, HDR Institute: Institute for Data Driven Dynamical Design
- Prof. Tanya Berger-Wolf, Ohio State University, HDR Institute: Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning
- Prof. Hamed Hassani, University of Pennsylvania, TRIPODS Phase II: EnCORE: Institute for...
MLCommons Research is described as a community to collaborate with and as a model for similar communities. Its working groups cover Algorithms, Datasets, Platforms, Storage, and Science and Medical applications. MLCommons involves 62 companies, 6 DOE laboratories, and 11 universities, with a flagship benchmark set, MLPerf, and the mission of “Accelerating machine learning innovation to benefit...
Core Cyberinfrastructure (CI) Capabilities and Services is one of the six focus areas in I-GUIDE, an NSF HDR Institute for Geospatial Understanding through an Integrative Discovery Environment. Its primary mission is to bridge a wide range of distributed, heterogeneous, and rapidly growing geospatial datasets with convergence research to achieve greater societal resilience and...
The A3D3 Institute (Accelerated Artificial Intelligence Algorithms for Data-Driven Discovery) combines next-generation AI algorithms with next-generation processor technology to develop algorithms fast enough to solve real-time scientific problems in three domains: High Energy Physics, Multi-Messenger Astronomy, and Neuroscience. We will present Hardware-Algorithm...
An interdisciplinary team from the University of Florida and Florida Agricultural and Mechanical University is leading a project to enhance the diversity, access, and impact of a strong AI curriculum. Artificial intelligence is poised to make unprecedented impacts across all aspects of our society. Developing technical expertise in AI or relegating AI education to the computer and data science...
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection cost. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes in a disease process. This work builds on the generalized $\alpha$-investing framework that enables control of the false discovery rate in a sequential testing setting. We...
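For readers unfamiliar with $\alpha$-investing, the following is a minimal sketch of the basic (non-generalized) rule, in which each test stakes part of a running "alpha-wealth" and a rejection earns a payout. It is not the cost-aware procedure this abstract proposes; the function name, the `spend_fraction` rule, and the default values are assumptions chosen for illustration.

```python
def alpha_investing(p_values, initial_wealth=0.05, payout=0.05, spend_fraction=0.5):
    """Basic alpha-investing over a stream of p-values (illustrative sketch).

    Before each test, a stake (a fixed fraction of current wealth, an assumed
    spending rule) is put at risk. The test level alpha_j is chosen so that
    alpha_j / (1 - alpha_j) equals the stake.
    """
    wealth = initial_wealth
    rejected = []
    for j, p in enumerate(p_values):
        stake = spend_fraction * wealth      # lost if the test fails to reject
        alpha_j = stake / (1.0 + stake)      # solves alpha_j / (1 - alpha_j) == stake
        if p <= alpha_j:
            rejected.append(j)               # discovery: earn the payout
            wealth += payout
        else:
            wealth -= stake                  # no discovery: pay the stake
    return rejected, wealth


# Hypothetical p-values arriving one at a time.
rejected, final_wealth = alpha_investing([0.001, 0.8, 0.0001, 0.6])
print(rejected, final_wealth)
```

Because the stake is always a fraction of the current wealth, the wealth never reaches zero in this sketch; other spending rules trade early power against the ability to keep testing.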
Data-driven discoveries are permeating critical fabrics of society. However, unreliable discoveries lead to decisions that can have far-reaching and catastrophic consequences for society, defense, and individuals. This makes the dependability of data-science lifecycles producing discoveries and decisions a critical issue that requires a new holistic view and formal foundations. Furthermore,...
The Delaware And MiD-Atlantic Data Science Corps (PI Bianco) is an NSF HRD-sponsored, regional partnership between the University of Delaware (UD), Lincoln University (LU), and Delaware State University (DSU) aimed at creating an equitable, accessible program for undergraduate data science education that: (1) is accessible to students of any background with a focus on STEM preparation level;...
Group testing is the study of pooling strategies that allow the identification of a small set of $k$ defective items among a population of $n$ using a small number of pooled tests. State-of-the-art testing schemes have shown that $\Theta(k \log n)$ tests are both necessary and sufficient for this purpose, which provides large gains when $k$ is small (sublinear in $n$). However, these schemes are not...
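To make the pooling idea concrete, here is a small sketch of non-adaptive group testing with a random design and the standard COMP decoder (every item that appears in a negative pool is cleared). This is a textbook baseline, not the scheme proposed in the abstract; the inclusion probability `1/k` and the constant `4` in the test count are illustrative assumptions.

```python
import math
import random


def comp_group_testing(n, defectives, num_tests, rng):
    """Return the items that COMP cannot rule out as defective."""
    k = max(1, len(defectives))
    p = 1.0 / k  # assumed per-test inclusion probability for each item
    # Each pool is a random subset of the population.
    pools = [[i for i in range(n) if rng.random() < p] for _ in range(num_tests)]
    # A pooled test is positive if and only if it contains a defective item.
    outcomes = [any(i in defectives for i in pool) for pool in pools]
    candidates = set(range(n))
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates.difference_update(pool)  # negative pool clears its items
    return candidates


rng = random.Random(0)
n = 1000
defectives = {3, 141, 592}
num_tests = math.ceil(4 * len(defectives) * math.log(n))  # on the order of k log n
estimate = comp_group_testing(n, defectives, num_tests, rng)
print(num_tests, sorted(estimate))
```

COMP never misses a true defective, since a defective item can only appear in positive pools; any errors are false positives, which become rarer as the number of tests grows.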
Led by an interdisciplinary team from the University of Tennessee at Chattanooga, Howard University, and Chattanooga State Community College, the proposed Anthropocentric Data Analytics for Community Enrichment (ADACE) program will develop a sustainable education and research platform for human-centric data science, where humans are either considered as the research subjects or regarded as a...
The goal of this project is to develop a team-based data science corps program for undergraduate students from Computer Science, Information Systems, and Business, integrating both academic training and hands-on experience through real-world data science projects. This project is a collaborative effort with the University of Maryland Baltimore County as the coordinating as well as an...
Introducing the new NSF HDR DIRSE Institute Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning. The institute aims to establish a new field of science, imageomics: from images to biological traits using biology-structured machine learning.
Images are the most abundant, readily available source for documenting life on the planet. Ranging in...
The iTREDS program trains undergraduate students in data science through a lens of social responsibility and community engagement, including rigor and responsibility, ethics, society, and policy. The students also develop superskills in the areas of teamwork, working with stakeholders, ethics, communication, and entrepreneurship. The goal of this 15-credit program is to develop scholars with an...
The field of water and wastewater treatment (W/WWT) is brimming with data analysis opportunities, but many working in the field lack the skills needed to navigate and extract knowledge from this data. This project began in 2019 with the development of a prerequisite-free course in data science and a five-week summer undergraduate research program. Both were populated with real problems and...
T-TRIPODS, the Tufts University Phase I TRIPODS institute, supports interdisciplinary research and learning in the foundations of data science, fostering collaboration among researchers in the computer science, mathematics, and electrical and computer engineering departments at Tufts, as well as connecting to scientists and scholars in a wide range of application domains.
The three focus areas...
Experience with data science education and ecosystem building as a center/institute leader within a large university, and forward-looking advice on building a successful nationwide HDR ecosystem.
Each poster speaker can deliver a 2 min highlight.
In addition, a 10 min video is expected to be uploaded to the agenda by Oct 14.
A3D3 aims to be a nexus for exchanging new ideas, algorithms, and tools between scientific domains, AI communities, and industry partners for AI-Hardware co-design. In this presentation, we will show efforts that build on the strong foundation of the Fast Machine Learning (FastML) community. Our ongoing programs on Postbaccalaureate Fellowships, Training, Education, and strong connection with...
The workforce demand for data analysts and data scientists exceeds the current capacity for higher education to produce this skilled workforce. Our overall goal is to develop scalable, portable data science education that can be readily incorporated into existing programs concentrating on STEM with ecology, biodiversity, and conservation. We will do this by creating multiple curricular data...
The Convergence Curriculum for Geospatial Data Science is an integrative curriculum to prepare students, scholars, and professionals to build the necessary knowledge, skills, and competencies to solve convergent problems without having to go through a series of multi-week regular courses. This multi-tiered curriculum starts with 5 Foundational Knowledge Threads to establish a common basis for...
This work proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial...
The Data Science Career Pathways in the Inland Empire (DS-PATH) is a partnership that brings together 4-year and 2-year Universities and Colleges with a common goal of creating flexible pathways that will equip underrepresented students to become skilled and knowledgeable professionals in Data Science (DS). The partnership consists of six Hispanic Serving Institutions and covers all three...
We study a Wiener process that is conditioned to pass through a finite set of points and consider the dynamics generated by iterating a sample path from this process. Using topological techniques, we are able to characterize the global dynamics and deduce the existence, structure, and approximate location of invariant sets. Most importantly, we compute the probability that this...
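As a rough illustration of the object being studied, a Wiener process conditioned to pass through given points behaves as an independent Brownian bridge between each pair of consecutive points, so a sample path can be drawn segment by segment. The sketch below assumes a standard (unit-variance) Wiener process and a uniform time grid per segment; the function name and parameters are hypothetical and are not taken from the authors' work.

```python
import math
import random


def conditioned_wiener_path(points, steps_per_segment, rng):
    """Sample a path pinned at each (t, x) in `points` (sorted by t)."""
    path = [points[0]]
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        dt = (t1 - t0) / steps_per_segment
        t, x = t0, x0
        for step in range(1, steps_per_segment):
            remaining = t1 - t
            # Brownian-bridge transition from (t, x) toward the pin (t1, x1).
            mean = x + (dt / remaining) * (x1 - x)
            var = dt * (remaining - dt) / remaining
            x = rng.gauss(mean, math.sqrt(var))
            t = t0 + step * dt
            path.append((t, x))
        path.append((t1, x1))  # the pinned point is hit exactly
    return path


rng = random.Random(42)
pins = [(0.0, 0.0), (1.0, 0.5), (2.0, -0.3)]
path = conditioned_wiener_path(pins, steps_per_segment=50, rng=rng)
print(len(path), path[50], path[100])
```

Viewing the sampled path as a map from the time axis to values, and iterating that map, gives the kind of dynamics the abstract refers to; the topological analysis itself is beyond this sketch.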
Distinguishing between global or macro patterns and local or micro fluctuations helps summarize the evolution of complex non-stationary dynamic systems. Herein, we focus on making distinctions between drift and shifts. Drift describes the micro-level evolution of a process. This may appear as variation about gradual trends. In contrast, shifts refer to discontinuities, rapid changes, or major...
The ultimate goal of our program is to provide interdisciplinary education and research opportunities in data and decision science for undergraduate students who are experts in a core discipline of engineering or biology, but who are also proficient in the alternate discipline. We are training students with complementary disciplinary expertise who can address problems at the...
We established the Metropolitan Chicago Data science Corps (MCDC) in the Fall of 2021. MCDC is a partnership between five Chicago-area universities and local not-for-profit organizations. It serves data science needs of the organizations and provides real world data science questions, data sets and experience for data science students. Goals of MCDC are to advance data-driven decision making,...
The goal of this project is to develop a curricular framework for data science education and workforce development that is transferable between diverse institutions, so STEM-related programs can “plug and play” data science lessons with existing curricula without much overhead. These lessons will be created in conjunction with community stakeholders and industry partners to ensure a focus on...
The NSF Institute for Data-Driven Dynamical Design (ID4) aims to transform how scientists and engineers harness data when designing materials and structures. From chemistry to civil engineering, we seek to create platforms that accelerate the discovery of new mechanisms and dynamics through the complementary union of human and machine intelligence. Cross-cutting these efforts are efforts to...
In today’s interconnected world, disasters such as floods and droughts are rarely isolated events, and their cascading effects are often felt far beyond their locations of origin. The Institute for Geospatial Understanding through an Integrative Discovery Environment (I-GUIDE) creates an open platform for harnessing geospatial data to better understand interconnected interactions across...
We have generalized the multiscale basis dictionaries (e.g., the Haar-Walsh wavelet packet dictionary and local cosine dictionary), which were developed for digital signals and images sampled on regular lattices and have a proven track record of success (e.g., audio/image compression, feature extraction, etc.), to the graph setting. Our previous such basis dictionaries (e.g., Generalized...
The Earth & Environmental Sciences (EES) produce vast amounts of data at a pace and on a scale that precipitate a need for EES researchers who are equipped with the technical data analytic skills required to work with large EES data sets. There are currently limited opportunities to learn these critical earth and environmental data science (EDS) skills leading to a gap between the demand for...
The I-GUIDE platform is designed to harness the vast, diverse, and distributed geospatial data at different spatial and temporal scales and make such data broadly accessible and usable to convergence research and education enabled by cutting-edge cyberGIS and cyberinfrastructure. The platform comprises composable and interoperable tools and cyberinfrastructure capabilities integrated through...
The National Data Mine Network launched in August 2022. Our students work on data-driven projects with our Corporate Partners and with faculty members. The Corporate Partners working with NDMN students this year include Bayer (2 projects), Convo, John Deere (2 projects), Indiana Family and Social Services Administration, Inogen, Lockheed Martin, Merck, Raytheon (2 projects), Sandia National...
Our institute is a multi-institution and transdisciplinary collaborative Phase II Institute for Data, Econometrics, Algorithms, and Learning (IDEAL), which focuses on key aspects of the foundations of data science. IDEAL will consolidate and amplify research devoted to the foundations of data science across all the major research-focused educational institutions in the greater Chicago area:...