CHEP 07

Name: CHEP 07
Start: 2007-09-02T08:00:00+02:00
End: 2007-09-09T12:00:00+02:00
Location: Victoria, Canada

2–9 Sept 2007

Europe/Zurich timezone

Please book accommodation as soon as possible.

Support

chep07-support@triumf.ca

End-to-End Network/Application Performance Troubleshooting Methodology

3 Sept 2007, 08:00

10h 10m

Victoria, Canada

Board: 89

Poster; Computer facilities, production grids and networking; Poster 1

Speaker: Dr Wenji Wu (FERMILAB)

The computing models for LHC experiments are globally distributed and grid-based. In such a computing model, the experiments’ data must be reliably and efficiently transferred from CERN to Tier-1 regional centers, processed, and distributed to other centers around the world. Obstacles to good network performance arise from many causes and can be a major impediment to the success of this complex, multi-tiered data grid. Factors that affect overall network/application performance exist on the network end systems themselves (application software, operating system, hardware), in the local area networks that support the end systems, and within the wide area networks. Because the computer and network systems are globally distributed, it can be very difficult to locate and identify the factors that are hurting application performance. In this paper, we present an end-to-end network/application performance troubleshooting methodology developed and in use at Fermilab. The core of our approach is to narrow down the problem scope with a divide-and-conquer strategy. The overall complex problem is split into two distinct sub-problems: network end system diagnosis and tuning, and network path analysis. After satisfactorily evaluating, and if necessary resolving, each sub-problem, we conduct end-to-end performance analysis and diagnosis. The paper will discuss the tools we use as part of the methodology. The long-term objective of the effort is to enable end users to conduct much of the troubleshooting themselves, before (or instead of) calling upon network and end system “wizards,” who are always in short supply.
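The ordering described in the abstract (end system first, network path second, end-to-end analysis only once both sub-problems look clean) can be sketched roughly as follows. This is an illustrative sketch only, not the Fermilab tooling: the `Finding` record, the `troubleshoot` function, and the idea of passing checks in as callables are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical type: a check is any callable returning (ok, note).
Check = Callable[[], Tuple[bool, str]]


@dataclass
class Finding:
    stage: str   # "end system", "network path", or "end-to-end"
    ok: bool     # True if the check found no problem
    note: str    # free-form detail from the check


def troubleshoot(end_system_checks: List[Check],
                 path_checks: List[Check],
                 end_to_end_checks: List[Check]) -> List[Finding]:
    """Run the two sub-problems first, then end-to-end analysis.

    End-to-end checks are skipped while any sub-problem check still
    fails, mirroring "after satisfactorily evaluating ... each
    sub-problem" in the abstract.
    """
    findings: List[Finding] = []
    stages = [("end system", end_system_checks),
              ("network path", path_checks)]
    for stage, checks in stages:
        for check in checks:
            ok, note = check()
            findings.append(Finding(stage, ok, note))

    # Gate: only proceed when both sub-problems are clean.
    if all(f.ok for f in findings):
        for check in end_to_end_checks:
            ok, note = check()
            findings.append(Finding("end-to-end", ok, note))
    return findings
```

In practice each callable would wrap whatever diagnostic tool a site uses; the sketch fixes only the order of the stages and the gate before the end-to-end step.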

Author: Dr Wenji Wu (FERMILAB)

Co-authors: Mr Andrey Bobyshev (FERMILAB), Mr Don Petravick (FERMILAB), Mr Mark Bowden (FERMILAB), Dr Matt Crawford (FERMILAB), Mr Maxim Grigoriev (FERMILAB), Mr Philip DeMar (FERMILAB), Mr Vyto Grigaliunas (FERMILAB)

Paper

CHEP07_247_Performance.pdf
