Today’s scientific and business processes depend heavily on fast and accurate data analysis. Data scientists are routinely overwhelmed by the effort needed to manage the volumes of data produced. Because general-purpose data management software is often inefficient, hard to manage, or too generic to serve today’s applications, businesses increasingly turn to specialised data management software that can handle only one data format, and then resort to data integration solutions. With the exponential growth of dataset size and complexity, however, format-specific solutions no longer scale for efficient analysis, which slows down the cycle of analysing and understanding the data and making decisions. I will illustrate the different kinds of problems we face when managing heterogeneous datasets, and how these translate into fundamental challenges for the data management community. I will then introduce new technologies inspired by these challenges, which overturn long-standing assumptions, enable meaningful and timely results, and advance discovery.