DPF 2011

Name: DPF 2011
Start: 2011-08-08T18:00:00-04:00
End: 2011-08-13T16:00:00-04:00
Location: Rhode Island Convention Center

US/Eastern timezone

Indico Support

dpf2011@brown.edu

ATLAS Analysis Data Distribution and Panda PD2P

12 Aug 2011, 11:10

20m

552 B (Rhode Island Convention Center)

Parallel contribution, Computing in HEP

Dr Alden Stradling (UT Arlington)

The PanDA Distributed Analysis system has been used in the ATLAS collaboration and beyond as a resilient and scalable distributed processing and analysis system. Using a central pull and distributed push (pilot job) model for task definition and job tracking, it integrates with many kinds of local batch systems, data management software, and security models. One of the principal challenges in making user jobs responsive comes from data location: because jobs go to the data, popular datasets held at only a few locations attract more user activity than those sites' resources can handle. The data are too large, however, to pre-position at all sites. PanDA has pioneered an approach to data management integration called PD2P, which automates data distribution to user analysis sites based on the usage and popularity of particular datasets. By tuning the parameters that trigger these data replications, we optimize the balance between data replication and user concentration. The strengths and tradeoffs of both the PanDA pilot and the PD2P model will be discussed, and we will examine throughput and efficiency, security versus flexibility, and the ongoing process of tuning the system to be more responsive and intelligent.
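The popularity-triggered replication described in the abstract can be illustrated with a minimal sketch. It is not the PanDA implementation: the class names, the `min_requests` and `max_replicas` parameters, and the "least-loaded site" selection rule are all assumptions chosen only to show how tunable triggers trade replication against user concentration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ReplicationPolicy:
    """Hypothetical tuning knobs; names and defaults are illustrative only."""
    min_requests: int = 3   # job requests needed per additional replica
    max_replicas: int = 2   # cap on extra copies made per dataset


@dataclass
class Replicator:
    """Toy model: count dataset usage and trigger replicas past a threshold."""
    policy: ReplicationPolicy
    sites: list                                    # candidate analysis sites
    requests: Counter = field(default_factory=Counter)
    replicas: dict = field(default_factory=dict)   # dataset -> sites with extra copies

    def record_job(self, dataset, load):
        """Count one user job on `dataset`.

        `load` maps site name -> queued jobs (assumed to be known).
        Returns the site chosen for a new replica, or None if no
        replication is triggered.
        """
        self.requests[dataset] += 1
        held = self.replicas.setdefault(dataset, [])
        # Each further replica requires another `min_requests` jobs.
        if self.requests[dataset] < self.policy.min_requests * (len(held) + 1):
            return None
        if len(held) >= self.policy.max_replicas:
            return None
        candidates = [s for s in self.sites if s not in held]
        if not candidates:
            return None
        # Assumed selection rule: send the copy to the least-loaded site.
        target = min(candidates, key=lambda s: load.get(s, 0))
        held.append(target)
        return target


if __name__ == "__main__":
    rep = Replicator(ReplicationPolicy(min_requests=2, max_replicas=1), ["A", "B"])
    load = {"A": 5, "B": 1}
    for _ in range(4):
        print(rep.record_job("dataset_x", load))
```

Lowering `min_requests` in this sketch replicates data sooner and spreads users out at the cost of more transfers; raising it does the opposite.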

Slides

20110812_PD2P_Brown.pdf
