21-25 May 2012
New York City, NY, USA
US/Eastern timezone

Using Hadoop File System and MapReduce in a small/medium Grid site

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)

Poster | Computer Facilities, Production Grids and Networking (Track 4) | Poster Session


Hassen Riahi (Universita e INFN (IT))


Data storage and access are key to both CPU-intensive and data-intensive high-performance Grid computing. Hadoop is an open-source data processing framework that includes a fault-tolerant, scalable, distributed data processing model and execution environment, named MapReduce, and a distributed file system, named the Hadoop Distributed File System (HDFS). HDFS was deployed and tested within the Open Science Grid (OSG) middleware stack, and efforts have been made to integrate HDFS with the gLite middleware. We have tested the file system thoroughly in order to understand its scalability and fault tolerance under the constraints of a small/medium site environment. To benefit fully from this file system, we run it in conjunction with the Hadoop job scheduler to optimize the execution of local physics analysis workflows. The performance of the analysis jobs run on this architecture is promising, making it a useful direction to follow up in the future.
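The MapReduce model referenced above can be illustrated with a minimal, self-contained sketch. This is plain Python with no Hadoop dependency, and all function names here are illustrative (real Hadoop jobs implement Mapper and Reducer classes in Java instead); it only shows the three phases a Hadoop job goes through: map, shuffle (grouping intermediate pairs by key), and reduce.

```python
from collections import defaultdict
from itertools import chain

# Illustrative word-count job expressed as the three MapReduce phases.
# Hadoop distributes these phases across cluster nodes and reads/writes
# HDFS; here everything runs in one process for clarity.

def map_phase(record):
    # Emit one (word, 1) pair for every word in an input record.
    return [(word, 1) for word in record.split()]

def shuffle(pairs):
    # Group intermediate values by key, as Hadoop's shuffle/sort does.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Combine all values for one key; here, sum the counts.
    return key, sum(values)

def run_job(records):
    pairs = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = run_job(["hdfs stores data", "mapreduce processes data"])
# "data" appears in both records, so counts["data"] == 2
```

In a real deployment of the kind the abstract describes, the input records would be blocks of files stored in HDFS, and the Hadoop job scheduler would place map tasks on the nodes holding those blocks to exploit data locality.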

Primary author

Hassen Riahi (Universita e INFN (IT))

Presentation Materials