EOS DevOps Meeting

Europe/Zurich
513/R-068 (CERN)

Jan Iven (CERN)
Description
Weekly meeting to discuss progress on EOS rollout

● production instances

EOSALICE

Updated to 4.2.0 this morning

  • EOS-2017 is still here (FSCK crashes the MGM; fixed, but needs a release)

EOSCMS

Preparation for EOY closure:

  • CMS plans to add 80k CPUs from the HLT. This will likely require us either to whitelist their (additional) gateway nodes (which use GPN to talk to the MGM, but talk directly to the FSTs), or to make the Auth Proxies more robust
    • one issue with auth proxies has been fixed.

Batch on EOS

Preparing a bunch of nodes in QA/EOSPPS to serve as a testbed (managed by us, to see the operational implications)


● CERNBOX and EOSUSER

Running out of memory; expect a crash in the next few days. Trying to get a new box into production (big-memory, but slower CPU, less disk and SSD only; apparently more spinning disks cannot be added), but Luca leaves for Friday.

  • "slower CPU": booting the NS took 2200 s (on an idle machine). The previous NS boot (70 min) suffered from THP (transparent huge pages) and was being hammered by clients.
  • 4x 1TB SSD means RAID1 (no feeling/data for reliability), reduced space for logs
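Since THP slowed down the earlier namespace boot, it may be worth checking the setting on the new box before booting. A minimal sketch, assuming the usual Linux sysfs file /sys/kernel/mm/transparent_hugepage/enabled, whose content marks the active mode with square brackets (for example "always madvise [never]"); the helper name active_thp_mode is hypothetical.

```python
import re
from pathlib import Path

# Standard Linux location of the THP switch (assumed to exist on the box).
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(content):
    """Return the bracketed mode from a THP sysfs string, or None."""
    match = re.search(r"\[(\w+)\]", content)
    return match.group(1) if match else None


if __name__ == "__main__":
    if THP_PATH.exists():
        print("THP mode:", active_thp_mode(THP_PATH.read_text()))
    else:
        print("THP sysfs file not found on this system")
```

A result of "always" would suggest the new box could hit the same slowdown as the previous boot.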

Hugo now has the website in read-only mode (instead of being unavailable); there is a separate alias so that FUSE clients can stay connected.

 


● FUSE and client versions

eos-fuse-4.2.0-3 in "production" since Nov 1st, but still frequent crashes

  • jemalloc: EOS-2054 has unreleased fix - ETA for new release? No, does not fix the crashes.
  • AuthIdManager::CleanupThread : EOS-1920 / EOS-2080
  • EOS-2081 "somewhere in libc" - waiting for backtraces
  • (tried to create JIRA tickets for most other crashes observed, based on library+base address - these will be filled out once we have actual backtraces.)

Small eos-cleanup.sh change in qa. Prevents an lsof deadlock. CRM-2470


WIP puppet-eosclient support for eosxd can be seen here: https://gitlab.cern.ch/ai/it-puppet-module-eosclient/merge_requests/53


Deadlock on autofs triggered eosxd mount: EOS-2095


● Citrine rollout

EOSATLAS

Citrine migration planned for 9th January, will send announcement shortly. Foreseen downtime < ~4h

EOSCMS: no date yet.


● nextgen FUSE

  • Dan starts working on puppet module.
  • Suggest "test plan" ("who tests which area") and guidelines for getting as many automatic tests as possible from occasional testers.
    • Massimo would like "a couple of machines" to play around with
    • Suggest to have Rainer look at the client cache.
       
  • Dev Status

    Fixed code in Aquamarine, started merge now into CITRINE, need to run certification script, then green light ...

- new certification script:

 

====================================================================
--- ... working-dir = /eos/dev/fuse/certify/certify.10474
====================================================================
001 ... fusex-benchmark

real    0m20.818s
user    0m0.117s
sys    0m1.113s
====================================================================
002 ... rename-test
====================================================================
003 ... git-clone-test

real    0m37.038s
user    0m7.819s
sys    0m2.359s
====================================================================
004 ... xrootd-compilation

real    0m44.122s
user    1m59.250s
sys    0m16.458s

real    1m0.853s
user    2m0.487s
sys    0m17.337s
====================================================================
005 ... client-tests
005a... micro-tests
eos.clients.fuse.dev.microtests.touch_ms 7.185 1510060804
eos.clients.fuse.dev.microtests.rm_ms 5.350 1510060804
eos.clients.fuse.dev.microtests.sqlite_100_inserts_ms 856.418 1510060804
eos.clients.fuse.dev.microtests.touch100files_parallel_ms 5.619 1510060805
eos.clients.fuse.dev.microtests.rm_100_files_ms 5.339 1510060805
eos.clients.fuse.dev.microtests.untar_ms 799.089 1510060805
eos.clients.fuse.dev.microtests.dd_4m_ms 32.118 1510060806
eos.clients.fuse.dev.microtests.dd_4m_dsync_ms 875.729 1510060806
eos.clients.fuse.dev.microtests.dd_4m_read_ms 8.863 1510060807
eos.clients.fuse.dev.microtests.dd_4m_read_direct_ms 612.384 1510060807
eos.clients.fuse.dev.microtests.dd_4k_ms 8.887 1510060807
eos.clients.fuse.dev.microtests.dd_4k_dsync_ms 9.875 1510060807
eos.clients.fuse.dev.microtests.dd_4k_read_ms 5.852 1510060807
eos.clients.fuse.dev.microtests.dd_4k_read_direct_ms 9.086 1510060807
eos.clients.fuse.dev.microtests.rndmseekwrite_ms 138.519 1510060807
eos.clients.fuse.dev.microtests.fwseekwrite_ms 545.669 1510060807
eos.clients.fuse.dev.microtests.untar_940_files_ms 2388.796 1510060808
eos.clients.fuse.dev.microtests.f77uf_ms 3.585 1510060810
eos.clients.fuse.dev.microtests.multiopen_fortran_gf_ms 2588.663 1510060810
eos.clients.fuse.dev.microtests.multiopen_fortran_i_ms 151120.621 1510060813
eos.clients.fuse.dev.microtests.git_clone_ms 2353.397 1510060964
005b... zlib-compile
005c... git-clone
005d... rsync
005d... sqlite

Note: (Some of the tests run FSYNC but should not.)
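The micro-test lines in the output above have the shape "<metric> <value> <timestamp>". A minimal sketch for turning them into a list sorted by duration, assuming that three-field, whitespace-separated layout and that the value is in milliseconds (as the _ms suffix suggests); the function name parse_microtests is made up for illustration.

```python
def parse_microtests(text, prefix="eos.clients.fuse.dev.microtests."):
    """Parse '<metric> <value> <timestamp>' lines into (name, ms, ts) tuples."""
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].startswith(prefix):
            continue  # skip section headers such as '005a... micro-tests'
        name = fields[0][len(prefix):]
        results.append((name, float(fields[1]), int(fields[2])))
    return results


if __name__ == "__main__":
    sample = """\
005a... micro-tests
eos.clients.fuse.dev.microtests.touch_ms 7.185 1510060804
eos.clients.fuse.dev.microtests.dd_4m_dsync_ms 875.729 1510060806
eos.clients.fuse.dev.microtests.multiopen_fortran_i_ms 151120.621 1510060813
"""
    # Slowest tests first.
    for name, ms, _ts in sorted(parse_microtests(sample), key=lambda r: -r[1]):
        print(f"{name:30s} {ms:12.3f} ms")
```

Sorted this way, multiopen_fortran_i_ms (about 151 s in the run above) stands out as by far the slowest entry.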

  • Will merge to Citrine, re-run these tests, then release (that release is then OK to run other tests on). ETA tomorrow.

EOSUAT runs a version with broken quota support; it needs to be updated to the latest Aquamarine build.

 

 

 

 

 

 


● new Namespace

Meeting last week to decide the best strategy for the MGM rollout (AP, CC, HR, ML, LM).

Working decision: 1 (unsplit, non-HA) MGM for EOSUSER with QDB backend (3 or 5 nodes?)

Task-force style effort between ops (LM, HR, CC, ...) and dev (ES, GB, ...) to coordinate rollout.

 

Elvin is looking at the EOSBACKUP conversion tool (1h30): it runs out of some resource while waiting for an ack (tuneable). The conversion will be done for all production namespaces, as they are all different.


● BATCH integration

Usual test...

Task 26135 starts at Tue Nov  7 11:27:13 2017 and ends at Tue Nov  7 12:19:21 2017 (52.1 minutes)
Analysed jobs: 100
Correct jobs: 100
Maximum concurrency: 1
Execution hosts (top 5):  b6c0fb38a7 [#28]  b69586e854 [#23]  b60c691f69 [#22]  b678940021 [#17]  b626536183 [#10] 
Execution environments (top 5): eos-client-4.2.0-3.el6.x86_64, eos-fuse-core-4.2.0-3.el6.x86_64, 
xrootd-client-libs-4.7.0-1.el6.i686, xrootd-client-libs-4.7.0-1.el6.x86_64 [#100] 

● AOB

Small investigation of EOSUSER namespace

Out of curiosity (using a dump from Yolanda) I checked the effect of deduplication (file level). Please note deduplication is *not* a priority IMO.

Input: 396 M files (Early October)

Consider only files >10MB

Use AD32 (as recorded in the catalogue). With "large files", there are not too many AD32 collisions.

Dedup saving ~16% (188 TB out of 1184 TB). I am shocked, but I cannot find any loophole.

Anyway: top files being "repeated":

- cernbox/smashbox testing (e.g. file "c1857a3c" has 20k replicas for a total of 1.1 TB). They are characterised by a flat time distribution (test executed every x hours).

- File 05f0347e is an output of a job (2.9 GB x 374 copies). Sub-outputs of a single job (all equal...). Time distribution concentrated well within an hour.

- Similar cases exist where 1 file is in the user dir and all the others are in the trash (again, a rather peaked time distribution).
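The saving estimate can be reproduced along the following lines. This is a minimal sketch, not the script actually used: the record layout (path, size in bytes, checksum string) and the function name dedup_savings are assumptions, 10 MB is taken as decimal megabytes, and files are treated as duplicates when both size and AD32 checksum match.

```python
from collections import defaultdict

MIN_SIZE = 10 * 1000 * 1000  # "files >10MB"; decimal megabytes assumed


def dedup_savings(records, min_size=MIN_SIZE):
    """Return (total_bytes, saved_bytes) for file-level deduplication.

    records: iterable of (path, size_bytes, checksum) tuples. This layout
    is hypothetical, not the real namespace dump format.
    """
    groups = defaultdict(int)  # (checksum, size) -> number of copies
    total = 0
    for _path, size, checksum in records:
        if size <= min_size:
            continue  # small files are ignored, as in the study
        groups[(checksum, size)] += 1
        total += size
    # Keeping one copy per group frees (copies - 1) * size bytes.
    saved = sum((copies - 1) * size for (_cs, size), copies in groups.items())
    return total, saved


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        ("/a/job/out.1", 20_000_000, "aaaa0001"),
        ("/a/job/out.2", 20_000_000, "aaaa0001"),
        ("/a/job/out.3", 20_000_000, "aaaa0001"),
        ("/b/data.bin", 50_000_000, "bbbb0002"),
        ("/b/small.txt", 1_000, "cccc0003"),
    ]
    total, saved = dedup_savings(sample)
    print(f"{saved / total:.1%} of {total} bytes could be saved")
```

With the figures quoted above, saved / total corresponds to 188 / 1184, i.e. about 15.9%.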

 

Q: De-duplication effect on the number of files? Not looked at.


Q: Is SWAN now using the new unified principals (since will play with instances)? Will check (Enrico: looks OK).

    • 16:00 16:05
      overall 2017 planning 5m
      Speaker: Jan Iven (CERN)
    • 16:05 16:30
      operations: production
      • 16:05
        production instances 5m
        Speaker: Herve Rousseau (CERN)

      • 16:10
        CERNBOX and EOSUSER 5m
        Speaker: Luca Mascetti (CERN)

      • 16:15
        FUSE and client versions 5m
        Speaker: Dan van der Ster (CERN)

      • 16:20
        Citrine rollout 5m
        Speaker: Herve Rousseau (CERN)

      • 16:25
        SWAN 5m
        Speaker: Jakub Moscicki (CERN)
    • 16:30 16:50
      development: near-term
      • 16:30
        nextgen FUSE 5m
        Speaker: Andreas Joachim Peters (CERN)
      • 16:35
        new Namespace 5m
        Speaker: Elvin Alin Sindrilaru (CERN)

    • 16:50 17:45
      other: pilot services, long-term dev, external
      • 16:50
        Webservice 5m
        Speaker: Luca Mascetti (CERN)
      • 16:55
        Backup 5m
        Speaker: Luca Mascetti (CERN)
      • 17:00
        Samba 5m
        Speaker: Luca Mascetti (CERN)
      • 17:05
        $HOME structure 5m
        Speaker: Luca Mascetti (CERN)
      • 17:10
        BATCH integration 5m
        Speaker: Massimo Lamanna (CERN)
      • 17:15
        Xrootd 5m
        Speaker: Michal Kamil Simon (CERN)
      • 17:20
        AOB 5m
