IML Machine Learning Working Group - Parallelized/Distributed Machine Learning
Friday, 24 February 2017 - 15:00
News and group updates
Michele Floris (CERN)
Steven Randolph Schramm (Universite de Geneve (CH))
Lorenzo Moneta (CERN)
Paul Seyfert (Universita & INFN, Milano-Bicocca (IT))
Sergei Gleyzer (University of Florida (US))
15:00 - 15:10
Room: 40/S2-C01 - Salle Curie
15:10
Internally-Parallelized Boosted Decision Trees
Andrew Mathew Carnes (University of Florida (US))
15:10 - 15:30
Room: 40/S2-C01 - Salle Curie
15:30
Rapid development platforms for machine learning
Andrew Lowe (Hungarian Academy of Sciences (HU))
15:30 - 15:50
Room: 40/S2-C01 - Salle Curie
15:50
Distributed Deep Learning using Apache Spark and Keras (see materials)
Joeri Hermans (Maastricht University (NL))
15:50 - 15:55
Room: 40/S2-C01 - Salle Curie
Data parallelism is an inherently different approach to optimizing model parameters. The general idea is to reduce training time by having n workers optimize a central model while processing n different shards (partitions) of the dataset in parallel: n model replicas are distributed over n processing nodes, i.e., every node (or process) holds one replica, and each worker trains its local replica on its assigned data shard. The workers can nevertheless be coordinated so that, together, they optimize a single objective during training and thereby reduce the wall-clock training time. There are several approaches to achieving this, and they are discussed in greater detail in the materials below.
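The coordination described above can be illustrated with a minimal, single-process sketch of synchronous data parallelism: each simulated worker holds one shard of the data, computes the gradient of the shared objective on its shard, and the averaged gradients update one central model. (All names and the linear-regression setup here are illustrative assumptions, not taken from the talk materials.)

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers = 4

# Synthetic linear-regression data, split into one shard per worker.
X = rng.normal(size=(400, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=400)
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

w = np.zeros(3)  # the central model's parameters
lr = 0.1

for step in range(200):
    # Each "worker" computes the MSE gradient on its own data shard.
    grads = []
    for Xs, ys in shards:
        residual = Xs @ w - ys
        grads.append(2.0 * Xs.T @ residual / len(ys))
    # Synchronous step: average the workers' gradients, update the
    # central model once, so all workers optimize a single objective.
    w -= lr * np.mean(grads, axis=0)

print(w)  # converges toward the true weights
```

In a real distributed setting the gradient exchange happens over the network (e.g., through a parameter server or an all-reduce), and asynchronous variants relax the per-step synchronization; this sketch only shows the averaging logic itself.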
15:55
Parallelization in Machine Learning with Multiple Processes
Omar Andres Zapata Mesa (University of Antioquia & Metropolitan Institute of Technology)
Gerardo Gutierrez (ITM)
15:55 - 16:25
Room: 40/S2-C01 - Salle Curie
16:25
Minutes
16:25 - 16:26
Room: 40/S2-C01 - Salle Curie