IT-protoDUNE coordination (Single Phase and Double Phase)

Europe/Zurich
513/R-070 - Openlab Space (CERN)

Xavier Espinal Curull (CERN)
Description
Coordination: JIRA, Minutes, etc. https://twiki.cern.ch/twiki/bin/viewauth/ProDUNEIT/WebHome
Videoconference room: IT-protoDUNE_coordination__Single_Phase_and_Double_Phase_ (extension 10612962, owner: Ignacio Coterillo Coz)

 Room: Elisabetta, Xavi, Ignacio
 Remote: Denis, Steve, Geoff

## Round table
 
 3 meetings during the last couple of months
    - data management
    - compute model
    - collaboration meeting

Compute model: Top-level discussions regarding datacenters, power targets,
   workflows, data reduction, etc. We got the data models.

 Interested in EOS token testing.
 X: Different storage providers are trying to adapt to this (e.g. dCache).
 According to the EOS developers, this should be testable in ~1 month.
 From the WLCG PoV we need to ensure that all the components understand
 the same type of tokens.

 X: Another important point is the LHC1 - Fermilab trusted network, which is
 being set up.

 X: Is there something I missed over the last 2 months in DUNE governance?
 - Singularity work is going smoothly.
 - All Single-Phase items are going correctly.
 
 EOS timeouts, or EOS running very slowly, during the last week.
 X: A faulty card in a router affected a big chunk of the CCC.

### NP02

  Elisabetta: The JSON file issue detected yesterday was identified and fixed.
  The root cause was the modification of the objects to include the
  number of events.

  Yesterday there was a power cut in Prevessin which affected all DAQ machines
  -> Several hours to recover (no UPS available). Agreed with Filippo that this situation
  is not acceptable and needs to be addressed. (The error was restricted to Prevessin.)

#### NP02 Slides

X: Regarding the ~1% failure rate in the transfers via the EHN1 -> IT link, it would
be interesting to see whether the errors show some pattern/trend in the long term, to
locate the root cause.
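
One possible way to look for such a pattern is to bucket the transfer records by day or by hour of day and compare the failure rates per bucket. The following is a minimal sketch only: the record layout (ISO timestamp plus a success flag), the function name `failure_rate_by_bucket` and the sample values are all assumptions made for illustration and do not come from the actual transfer logs.

```python
from collections import defaultdict
from datetime import datetime


def failure_rate_by_bucket(records, bucket_format="%Y-%m-%d"):
    """Return {bucket: (failures, total, rate)} for transfer records.

    records: iterable of (iso_timestamp, succeeded) pairs (hypothetical layout).
    bucket_format: strftime pattern that defines the bucket, e.g. "%Y-%m-%d"
    for per-day statistics or "%H" for hour-of-day statistics.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [failures, total]
    for stamp, succeeded in records:
        bucket = datetime.fromisoformat(stamp).strftime(bucket_format)
        counts[bucket][1] += 1
        if not succeeded:
            counts[bucket][0] += 1
    return {b: (f, t, f / t) for b, (f, t) in sorted(counts.items())}


# Made-up sample records, for illustration only.
sample = [
    ("2000-01-01T02:15:00", True),
    ("2000-01-01T02:40:00", False),
    ("2000-01-01T14:05:00", True),
    ("2000-01-02T02:30:00", False),
    ("2000-01-02T09:00:00", True),
    ("2000-01-02T15:45:00", True),
]

per_day = failure_rate_by_bucket(sample)         # trend over days
per_hour = failure_rate_by_bucket(sample, "%H")  # time-of-day pattern

for bucket, (failures, total, rate) in per_hour.items():
    print(f"hour {bucket}: {failures}/{total} failed ({rate:.0%})")
```

A bucket whose rate stands well above the ~1% baseline (a particular hour, day, or, with an extra field, a particular host) would be a natural place to start looking for the root cause.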

Grafana Dashboards: Running locally, exposed to Lyon via SSH tunnel

(Ignacio: Add note to the MONIT config and dashboards)

Steve: When do you think you would be ready for compressing this data?
 Not in the short term, but when we compress, the size will stay the same; we'll just
 increase the number of events.
 For processing, it should be ready in a couple of weeks.

Is this the data in np02/reco? Do you want to copy it to Fermilab?
S: In the next couple of weeks it will be added to Rucio; I'll contact you regarding that.

X: Once the data is on tape, it can be safely deleted from EOS (as we are now recommending
to use EOS as a staging area).

S: What would the timeline be for CTA?
X: CTA is moving the big 4 experiments before the beginning of Run 3. Probably the public
EOS instance will still be running on Castor until LS3.
For experiments relying on FTS and Rucio, the change is just a matter of endpoint name; it
doesn't really affect you.

S: How close are you to the operational voltage?
E: We are close to 3 kV, and are testing different stable setups to solve the problem of the
bubbles mentioned in the slides, so that we can move away from the High Pressure cycles: they
can't be maintained for long and require lots of preparation/downtime to recover normal pressure.

#### Geoff

- Plan to run for two weeks with the cosmic ray tagger on (detector add-on). Possibly in November
- Will top up the LAr with Xe and study the effects
- Converting readout to FELIX (ATLAS platform, modified to be used for DUNE)

#### Ignacio

New Linux Contact: Alejandro Iribarren

Tentative date: 7 November

    • 15:00 15:05
      Coordination update 5m
      Speaker: Xavier Espinal Curull (CERN)
    • 15:20 16:00
      Round-table 40m
      Speaker: All