Python is the lingua franca for data analytics and machine learning. Its superior productivity makes it the preferred tool for prototyping. However, traditional Python packages are not necessarily designed to provide high performance and scalability for large datasets.
In this tutorial we start with a short introduction on how to get close to native performance with Intel-optimized packages like NumPy, SciPy and scikit-learn. The tutorial then focuses on getting high performance and scalability, from multiple cores on a single machine to large clusters of workstations. It will demonstrate that it is possible to achieve performance and scalability similar to hand-tuned C++/MPI codes while retaining the well-known productivity of Python:
- The High Performance Analytics Toolkit (HPAT) compiles and scales analytics codes written with pandas and Python to bare-metal cluster performance. It compiles a subset of Python (pandas/NumPy) to efficient parallel binaries with MPI, requiring only minimal code changes. HPAT is orders of magnitude faster than alternatives like Apache Spark.
- daal4py is a convenient Python API to Intel® DAAL (Intel® Data Analytics Acceleration Library). While its interface is scikit-learn-like, its MPI-based engine makes it possible to scale machine learning algorithms to bare-metal cluster performance with only minimal code changes.
- The tutorial will use HPAT and daal4py together to build an end-to-end analytics pipeline which scales to clusters.
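
To make the "minimal code changes" idea concrete, here is a small sketch of what an HPAT-style program can look like: an ordinary NumPy function with a single JIT decorator added. It is an illustration under assumptions, not material from the tutorial itself. The function name `calc_pi` and the Monte Carlo workload are chosen only for this example, and the fallback decorator is a hypothetical convenience so the file also runs as plain Python when HPAT is not installed.

```python
import numpy as np

try:
    # Assumption: HPAT is installed and exposes its decorator as hpat.jit.
    import hpat
    jit = hpat.jit
except ImportError:
    # Hypothetical fallback for this sketch only: without HPAT the
    # function simply runs as ordinary, sequential NumPy code.
    def jit(func):
        return func


@jit
def calc_pi(n):
    """Monte Carlo estimate of pi from n random points in [-1, 1]^2."""
    x = 2 * np.random.random(n) - 1
    y = 2 * np.random.random(n) - 1
    # Fraction of points inside the unit circle, scaled by the
    # area of the enclosing square (4).
    return 4 * np.sum(x ** 2 + y ** 2 < 1) / n


if __name__ == "__main__":
    print(calc_pi(200_000))
```

With HPAT available, a script like this would typically be launched through MPI, for example `mpiexec -n 4 python pi.py` (the file name is a placeholder), so that the array operations are distributed across processes. Apart from the decorator, the function body is unchanged from the sequential version, which is the property the bullets above describe.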