Speaker
Description
Data-driven discoveries are permeating the critical fabric of society. However, unreliable discoveries lead to decisions that can have far-reaching and catastrophic consequences for society, defense, and individuals. This makes the dependability of the data-science lifecycles that produce these discoveries and decisions a critical issue, one that requires a new holistic view and formal foundations. Furthermore, while the notion of dependability is well studied in the computer-systems literature, the challenges of data science push beyond the boundary of existing knowledge. This project, the Dependable Data-Driven Discovery (D4) Institute at Iowa State University, is advancing foundational research on ensuring that data-driven discoveries are of high quality. The D4 Institute advances the theoretical foundations of data science by fostering foundational research on three fronts: understanding the risks to the dependability of data-science lifecycles, formalizing a rigorous mathematical basis for measures of that dependability, and identifying mechanisms to create dependable data-science lifecycles. The institute is also facilitating transdisciplinary training of a diverse cadre of data scientists through activities such as the Midwest Big Data Summer School and the TADS Lunch-n-Learn. Phase I focuses on a subset of the data-science lifecycle and on four risks: complexity, uncertainty, resource constraints, and freshness.