ATLAS UK Cloud Support

Europe/London
Zoom

Tim Adye (Science and Technology Facilities Council STFC (GB)), James William Walder (Science and Technology Facilities Council STFC (GB))
Description

https://cern.zoom.us/j/98434450232

Password protected (same password as the (new) Ops meeting)

Videoconference
Zoom Meeting ID: 98434450232
Host: James William Walder

● Outstanding tickets

    • 153550 | TEAM | atlas | UKI-SOUTHGRID-RALPP | less urgent | NGI_UK | on hold | 2021-09-01 15:56:00 | Transfer failure at UKI-SOUTHGRID-RALPP with “Failed to select pool: All pools are full\n” error | WLCG

      • On hold to collect stats / data
    • 153414 | TEAM | atlas | UKI-SCOTGRID-ECDF | less urgent | NGI_UK | in progress | 2021-09-03 14:18:00 | UKI-SCOTGRID-ECDF: Low transfer efficiency due to TRANSFER ERROR: Copy failed with mode 3rd pull, wi… | WLCG

      • Closed
    • 153367 | TEAM | atlas | RAL-LCG2 | urgent | NGI_UK | in progress | 2021-08-04 11:55:00 | HTTPS on RAL CTA | WLCG

      • Needs a comment
    • 153277 | TEAM | atlas | UKI-SCOTGRID-GLASGOW | less urgent | NGI_UK | in progress | 2021-09-03 14:25:00 | UKI-SCOTGRID-GLASGOW_CEPH job stage-in failures | WLCG

      • Will close once PanDA is back (hopefully)

● CPU

    • RAL

      • Variations in running slots, caused (generally?) by ATLAS submissions
    • Northgrid

      • LANCS running sim; but not enough slots
        • Attempting to change some of the Associated Params in CRIC to see if more work is possible
        • New SEs and networking being discussed
        • Currently using Rocky for test work
    • London

      • A black-hole WN caused errors
    • SouthGrid

      • Oxford: XCache switching set the site offline for a while; also some power work on a portion of the nodes.
        • Running fine now
    • Scotgrid

      • A couple of dips, likely ATLAS related
● Other new issues / tasks

  • BHAM - ATLAS and BHAM working on improved setup for VAC communications

 

 


● Ongoing Items

  • CentOS7 - Sussex

    • Post-Mtg ‘hackathon’ (see AOB)
  • TPC with http

    • WebDAV enabled at RAL; ATLAS running in passive mode
      • OK, but with low load
  • Storageless Site test (Oxford)

    • XCache disabled for ATLAS due to an issue in XRootD 5.3

 

 


● News round-table

  • Dan
    • NTR
  • Gerard
    • NTR
  • JW
    • NTR
  • Matt
    • NTR
  • Patrick
    • NTR
  • Peter
    • Interested in the brokerage for jobs at LANCS
  • Sam
    • NTR
  • Vip
    • NTR

 

 


● AOB

AOB

  • Some discussion of CERN’s status regarding future Linux versions

SUSSEX Hackathon

  • Reviewed and updated user namespace settings
    • Re(?)-applied the correct settings
    • Need to check that something else doesn’t clobber them again …
  • Looked at the Parallel options: openmp might be the closest match to the correct setup.
  • Waiting for the site to be set back online to review progress

 

 

    • 10:00 10:20
      Status 20m
      • Outstanding tickets 10m
      • CPU 5m

        New link for the site-oriented dashboard

      • Other new issues / tasks 5m

        BHAM - Fixed; Site and ATLAS discussing a 'vobox' like system for the future.

    • 10:20 10:40
      Ongoing Items 20m
    • 10:40 10:50
      News round-table 10m
    • 10:50 11:00
      AOB 10m
