21-25 May 2012
New York City, NY, USA
US/Eastern timezone

Big data log mining: the key to efficiency

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)

Poster Distributed Processing and Analysis on Grids and Clouds (track 3) Poster Session


Paul Rossman (Fermi National Accelerator Laboratory (FNAL))


In addition to the physics data generated each day by the CMS detector, the experiment also produces vast quantities of supplementary log data. If properly stored, aggregated and analyzed, this data, which ranges from reprocessing logs to transfer logs, could shed light on operational issues and help reduce inefficiencies and eliminate errors. The term "big data" has recently taken the spotlight, with organizations worldwide using tools such as CouchDB, Hadoop and Hive. In this paper we present an approach to evaluating the capture and storage of log data from various experiment components, with the aim of providing analytics and visualization in near real time.

Primary author

Paul Rossman (Fermi National Accelerator Laboratory (FNAL))
