Lorenzo Moneta
(CERN),
Michele Floris
(CERN),
Paul Seyfert
(Universita & INFN, Milano-Bicocca (IT)),
Dr
Sergei Gleyzer
(University of Florida (US)),
Steven Randolph Schramm
(Universite de Geneve (CH))
24/02/2017, 15:00
Andrew Mathew Carnes
(University of Florida (US))
24/02/2017, 15:10
Dr
Andrew Lowe
(Hungarian Academy of Sciences (HU))
24/02/2017, 15:30
Joeri Hermans
(Maastricht University (NL))
24/02/2017, 15:50
Data parallelism is an inherently different methodology for optimizing parameters. The general idea is to reduce training time by having n workers optimize a central model while processing n different shards (partitions) of the dataset in parallel. In this setting, n model replicas are distributed over n processing nodes, i.e., every node (or process) holds one model replica. Then, the workers...
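The setup described in this abstract (n shards, n replicas, one central model) can be illustrated with a minimal sketch. The abstract is truncated before it says how the workers communicate with the central model, so the sketch assumes synchronous gradient averaging, which is one common choice and not necessarily the scheme presented in the talk. The function names, the 1-D linear model, and the toy dataset are all hypothetical, and the workers are simulated sequentially rather than run on separate nodes.

```python
# Hypothetical illustration of data parallelism on a 1-D linear model y ~ w * x.
# Assumption: workers synchronize by averaging gradients every step.
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def make_shards(data: List[Sample], n: int) -> List[List[Sample]]:
    """Partition the dataset into n shards (round-robin)."""
    return [data[i::n] for i in range(n)]


def shard_gradient(w: float, shard: List[Sample]) -> float:
    """Gradient of the mean squared error on one shard, evaluated at w."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def train(data: List[Sample], n_workers: int,
          steps: int = 100, lr: float = 0.01) -> float:
    shards = make_shards(data, n_workers)
    central_w = 0.0  # the central model's single parameter
    for _ in range(steps):
        # Every worker holds one replica of the current central model.
        replicas = [central_w] * n_workers
        # Each worker computes a gradient on its own shard only.
        # (Simulated sequentially here; real workers would run in parallel.)
        grads = [shard_gradient(w, s) for w, s in zip(replicas, shards)]
        # Assumed update rule: average the worker gradients, then step.
        central_w -= lr * sum(grads) / n_workers
    return central_w


# Toy dataset generated from y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data, n_workers=4)
```

With four workers, each shard holds two of the eight samples, and the central parameter `w` converges toward the slope used to generate the toy data.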
Gerardo Gutierrez
(ITM),
Omar Andres Zapata Mesa
(University of Antioquia & Metropolitan Institute of Technology)
24/02/2017, 15:55