20th International Conference on Computing in High Energy and Nuclear Physics (CHEP2013)

Name: 20th International Conference on Computing in High Energy and Nuclear Physics (CHEP2013)
Start: 2013-10-14T09:00:00+02:00
End: 2013-10-18T13:00:00+02:00
Location: Amsterdam, Beurs van Berlage

Europe/Amsterdam timezone

CHEP2013 Logistics Management

info@chep2013.org

Solving Small Files Problem in Enstore

14 Oct 2013, 15:00

45m

Grote zaal (Amsterdam, Beurs van Berlage)

Poster presentation

Data Stores, Data Bases, and Storage Systems

Poster presentations

Dr Alexander Moibenko (Fermi National Accelerator Laboratory)

Enstore is a tape-based Mass Storage System originally designed for the Run II Tevatron experiments at FNAL (CDF, D0). Over the years it has proven to be a reliable and scalable data archival and delivery solution that meets the diverse requirements of a variety of applications, including US CMS Tier 1, High Performance Computing, and Intensity Frontier experiments, as well as data backups. Data-intensive experiments like CDF, D0 and US CMS Tier 1 generally produce huge amounts of data stored in files with an average size of a few gigabytes, which is optimal for writing data to and reading data from tape. In contrast, much of the data produced by Intensity Frontier experiments, Lattice QCD and Cosmology is sparse, resulting in the accumulation of large numbers of small files.

Reliably storing small files on tape is inefficient because writing file marks takes a significant fraction of the overall file writing time (a few seconds). There are several ways of improving data write rates, but some of them are unreliable, and others are specific to the type of tape drive and still do not provide transfer rates adequate to the rates offered by tape drives (20% of the drive's potential rate).

To provide good rates for small files in a transparent and consistent manner, the Small File Aggregation (SFA) feature has been developed. It aggregates files into containers, which are subsequently written to tape. File aggregation uses a reliable internal Enstore disk buffer. File grouping is based on policies using file metadata and other user-defined steering parameters. If a small file that is part of a container is requested for reading, the whole container is staged into the internal Enstore read cache, thus providing a read-ahead mechanism in anticipation of future read requests for files from the same container. SFA is provided as a service that implements file aggregation and staging transparently to the user. SFA has been used successfully since April 2012 by several experiments. Currently we are preparing to scale up the SFA write/read cache. This paper describes the Enstore Small File Aggregation feature and discusses how it can be scaled in size and transfer rates.
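
The aggregation and read-ahead behaviour described in the abstract can be illustrated with a minimal Python sketch. This is not Enstore code: the names `SmallFile`, `Container`, `aggregate`, `ReadCache`, the `group` metadata field, and the `max_container_size` parameter are all hypothetical, and the policy shown (group by one metadata key, close a container when the next file would exceed a size limit) is only one assumed example of a grouping policy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SmallFile:
    name: str
    size: int    # size in bytes
    group: str   # hypothetical metadata field used as the grouping key


@dataclass
class Container:
    group: str
    members: list = field(default_factory=list)

    @property
    def size(self):
        return sum(f.size for f in self.members)


def aggregate(files, max_container_size):
    """Pack files into per-group containers.

    A container is closed when adding the next file of its group would
    push it over max_container_size. A single file larger than the limit
    still gets a container of its own.
    """
    open_containers = {}   # group -> container still accepting files
    closed = []
    for f in files:
        c = open_containers.get(f.group)
        if c is not None and c.size + f.size > max_container_size:
            closed.append(c)
            c = None
        if c is None:
            c = Container(f.group)
            open_containers[f.group] = c
        c.members.append(f)
    closed.extend(open_containers.values())
    return closed


class ReadCache:
    """Stages a whole container on the first read of any of its members."""

    def __init__(self, containers):
        self._index = {f.name: c for c in containers for f in c.members}
        self._staged = set()   # names of files currently in the cache
        self.tape_reads = 0    # number of container stage operations

    def read(self, name):
        if name not in self._staged:
            container = self._index[name]
            self.tape_reads += 1
            # Read-ahead: every file in the container becomes available.
            self._staged.update(f.name for f in container.members)
        return name


if __name__ == "__main__":
    files = [
        SmallFile("a", 40, "expA"),
        SmallFile("b", 40, "expA"),
        SmallFile("c", 40, "expA"),
        SmallFile("d", 10, "expB"),
    ]
    containers = aggregate(files, max_container_size=100)
    cache = ReadCache(containers)
    cache.read("a")
    cache.read("b")   # served from the cache, no second stage operation
    print(len(containers), cache.tape_reads)
```

In this sketch, reading `a` stages the container that also holds `b`, so the later request for `b` does not trigger another stage operation. That is the read-ahead effect the abstract describes.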

Dr Alexander Moibenko (Fermi National Accelerator Laboratory)

Mr Alexander Kulyavtsev (FNAL), Dmitry Litvintsev (FNAL), Dr Gene Oleynik (Fermilab), John Hendry (FNAL), Stan Naymola (FNAL)

