EOS DevOps Meeting

Name: EOS DevOps Meeting
Start: 2017-12-12T16:00:00+01:00
End: 2017-12-12T17:45:00+01:00
Location: CERN

Tuesday 12 Dec 2017, 16:00 → 17:45 Europe/Zurich

513/R-068 (CERN)

513/R-068

CERN

Show room on map

Jan Iven (CERN)

Description

Weekly meeting to discuss progress on EOS rollout

Hide

● production instances

EOSATLAS

Crash end of last week, headnode running out of memory + a user doing nasty things.

now back to "reasonable" mem usage.

EOSCMS

+2PB usable to be added:

by harvesting some nodes from EOSALICE (to be checked with Roberto)
EOSLHCB and EOSATLAS also have some surplus capacity

Timeframe: urgent (confirmed by Bernd)

● CERNBOX and EOSUSER

Doing MD5 scan (for sec team).

Today backup caused trouble on EOSUSER (NO_CONTACT, call from operator)

● FUSE and client versions

puppet-eosclient support for eosxd going into production tomorrow.

(puppet stdlib has been rolled back - except for the things this module needs).

● Citrine rollout

EOSALICE+EOSPUBLIC

Very high activity on both instances, leading to crash because of known bug and FD exhaustion.

Required to use XRootD 4.8.0-rc1 (now on -rc2) that increases the 32k limit to 64.
This helped quite a bit.

In the meantime the origin of the sudden load increase on EOSPUBLIC has been identified and mitigated (thanks to ALICE computing team, was a fallback location).

Unfortunately, there's a regression in XRootD 4.8 that prevents headnodes from talking to each other (but failover still works just fine). Confirmed auth issue, under investigation.

Q: Can we somehow use the EOS test infrastructure to provide better testing to XRootd?

Q: what is strategy to go >> 64k filedescriptors? Being looked up by Xrootd team/Andy, these may internal to Xrootd things (with a fixed size memory structure - was signed short, should be easy to change).

● SWAN

User reported an issue (cannot save notebooks) due to FST being full.

Will update to 4.2.4 on SWAN (client side change to deal with this, launch converter), and in parallel Luca cleaned up the affected disk.

A: EOSFUSEX should also get the same behaviour.

● nextgen FUSE

CLIENT

tagged 4.2.5 today
- not useful to tag a version before visible issues are sorted out
- took me 3 days to figure out the Rainer problem (building EOS rpm with eosxd)
  - three main issues:
    - removing MD records to early before they could be executed upstream (e.g. deletions) resulting in ENOTEMPTY messages
    - different behaviour of the kernel cache on EL7 resulting in 'no such file or directory' during compilation
    - recreation of identical name after deletion has to wait for asynchronous deletion
  - added the Rainer test to eos-fusex-certify script
    - tested on El7 and SLC6
  - inode 1 bug
    - when you mount 'your' home directory path, the top level directory of the local mount has inode 1
      - when the cache capability expires for the first time all directories in the top level mount became invisible for ever. Same happened when the first call on the top level directory was not 'ls'

eosxd bugs fixed 4.2.5
* [EOS-2146] - symlinks have to show the size of the target string
* [EOS-2147] - listxattr creates SEGV on OSX
* [EOS-2148] - eosxd on OSX creates empty file when copying with 'cp'
* [EOS-2159] - An owner of a directory has to get always chmod permissions
* [EOS-2161] - rm -rf on fusex mount fails to remove all files/subdirectories
* [EOS-2174] - Running out of FDs when using a user mount

SERVER

tagged 0.3.270 today (problem with the build system to be sorted by Joszef)
- made server open atomic

AOB

- implementation of listing of large directories has to be changed on server side to keep NS locks only for 10k entries and then re-lock (avoid write starvation when Massimo lists 10M dirs inside a dir)

- when MGM is down when eosxd mounts, the XrdCl object never tries to re-establish the conneciton, although eosxd replays commands according to the local timeout configuration

- UAT and PPS should be updated to the tagged versions (-> Luca)

- Need to check the YUM repo for aquamarine releases (point to storage-ci, not dss-ci).

- can now limit access to domains ("cern.ch") à la AFS (BE software distribution).

● new Namespace

New-catalogue tests (Massimo)

EOSPPS, xroot:4.8.0-0.rc1, fusex: 4.8.0-0.rc1

Client machines in Wigner (eospluswig701.cern.ch) to minimise the MGM-client latency

Xmas tree (ladder of directories 1000 level, 5000 directories):

This test uses FUSEX as well
Factor of 3 faster than old FUSE (both creation and rm -fr) with client and server in Meyrin
Comparable to AFS (client and server in Meyrin)):

#-bash-4.2$ ./ladder.py /eos/pps/users/laman #ladder is going to create 5050 directories #Dir creation: 5.528661 s (5050 dirs) AFS erratic between 10 s and 23 s #Dir removal: 15.508470 s (5050 dirs) AFS erratic between 8 s and 26 s

Large directory

This test is based on eos commands (command line and via python binding)
- One dir is created containing a lot of subdir on the same level (d/1, d/2, ... ,d/1000000)
mkdir runs at ~13kHz
In the past, bad behavior of eos fileinfo d (when d contains >>1M files)
- 10^5 files
  - eos mkdir for the last dir takes 22 ms
  - eos ls takes 1.8 s
  - eos rm -r takes 3.5 s
  - eos fileinfo d directory: 17 ms
- 1M files
  - eos mkdir for the last dir takes 30 ms
  - eos ls takes 2.4 s (1M directories; output truncated at 50000), See AOB
  - eos rm -r takes 12.7 s (second test 10.9)
  - eos fileinfo d directory: 17 ms
- 10M files
  - eos mkdir for the last dir takes 36 ms
  - eos ls takes 4.8 s (1M directories; output truncated at 50000), See AOB
  - eos rm -r aborts (the dir is a unclear status: visible from fusex, not visible from els ls: (errc=2) (No such file or directory)
  - eos fileinfo d directory: 20 ms

Rate tests

Parallel mkdir from 4 (5) nodes. Single stream is ~1300 Hz, 4 streams is ~2700 Hz, 5 streams is ~ 2000 Hz.

AOB

Since (I understand) eos find runs server side, why do I get this?

-bash-4.2$ eos find --count /eos/pps/users/laman/largedir nfiles=0 ndirectories=50001 warning: find results are limited for you to ndirs=50000 - result is truncated! (errc=7) (Argument list too long)

(not an admin? need special powers for this)

● AOB

Kuba has created testcases, passed on Jozsef.
- should test more the multi-client
mount /scratch on PLUS/AIADM "qa" machines (Dan to do merge request).

There are minutes attached to this event. Show them.

- 1
  
  overall planning
  
  Speaker: Jan Iven (CERN)
- operations: production
  - 2
    production instances
    
    Minutes
    
    Speaker: Herve Rousseau (CERN)
    
    EOSATLAS
    
    Crash end of last week, headnode running out of memory + a user doing nasty things.
    
    now back to "reasonable" mem usage.
    
    EOSCMS
    
    +2PB usable to be added:
    
    by harvesting some nodes from EOSALICE (to be checked with Roberto)
    
    EOSLHCB and EOSATLAS also have some surplus capacity
    
    Timeframe: urgent (confirmed by Bernd)
  - 3
    
    CERNBOX and EOSUSER
    
    Minutes
    
    Speaker: Luca Mascetti (CERN)
    
    Doing MD5 scan (for sec team).
    
    Today backup caused trouble on EOSUSER (NO_CONTACT, call from operator)
  - 4
    
    FUSE and client versions
    
    Minutes
    
    Speaker: Dan van der Ster (CERN)
    
    puppet-eosclient support for eosxd going into production tomorrow.
    
    (puppet stdlib has been rolled back - except for the things this module needs).
  - 5
    
    Citrine rollout
    
    Minutes
    
    Speaker: Herve Rousseau (CERN)
    
    EOSALICE+EOSPUBLIC
    
    Very high activity on both instances, leading to crash because of known bug and FD exhaustion.
    
    Required to use XRootD 4.8.0-rc1 (now on -rc2) that increases the 32k limit to 64.
    This helped quite a bit.
    
    In the meantime the origin of the sudden load increase on EOSPUBLIC has been identified and mitigated (thanks to ALICE computing team, was a fallback location).
    
    Unfortunately, there's a regression in XRootD 4.8 that prevents headnodes from talking to each other (but failover still works just fine). Confirmed auth issue, under investigation.
    
    Q: Can we somehow use the EOS test infrastructure to provide better testing to XRootd?
    
    Q: what is strategy to go >> 64k filedescriptors? Being looked up by Xrootd team/Andy, these may internal to Xrootd things (with a fixed size memory structure - was signed short, should be easy to change).
  - 6
    
    SWAN
    
    Minutes
    
    Speaker: Jakub Moscicki (CERN)
    
    User reported an issue (cannot save notebooks) due to FST being full.
    
    Will update to 4.2.4 on SWAN (client side change to deal with this, launch converter), and in parallel Luca cleaned up the affected disk.
    
    A: EOSFUSEX should also get the same behaviour.
- development: near-term
  - 7
    nextgen FUSE
    
    Minutes
    
    Speaker: Andreas Joachim Peters (CERN)
    
    CLIENT
    
    tagged 4.2.5 today
    
    not useful to tag a version before visible issues are sorted out
    
    took me 3 days to figure out the Rainer problem (building EOS rpm with eosxd)
    
    three main issues:
    
    removing MD records to early before they could be executed upstream (e.g. deletions) resulting in ENOTEMPTY messages
    
    different behaviour of the kernel cache on EL7 resulting in 'no such file or directory' during compilation
    
    recreation of identical name after deletion has to wait for asynchronous deletion
    
    added the Rainer test to eos-fusex-certify script
    
    tested on El7 and SLC6
    
    inode 1 bug
    
    when you mount 'your' home directory path, the top level directory of the local mount has inode 1
    
    when the cache capability expires for the first time all directories in the top level mount became invisible for ever. Same happened when the first call on the top level directory was not 'ls'
    
    eosxd bugs fixed 4.2.5
    * [EOS-2146] - symlinks have to show the size of the target string
    * [EOS-2147] - listxattr creates SEGV on OSX
    * [EOS-2148] - eosxd on OSX creates empty file when copying with 'cp'
    * [EOS-2159] - An owner of a directory has to get always chmod permissions
    * [EOS-2161] - rm -rf on fusex mount fails to remove all files/subdirectories
    * [EOS-2174] - Running out of FDs when using a user mount
    
    SERVER
    
    tagged 0.3.270 today (problem with the build system to be sorted by Joszef)
    
    made server open atomic
    
    AOB
    
    - implementation of listing of large directories has to be changed on server side to keep NS locks only for 10k entries and then re-lock (avoid write starvation when Massimo lists 10M dirs inside a dir)
    
    - when MGM is down when eosxd mounts, the XrdCl object never tries to re-establish the conneciton, although eosxd replays commands according to the local timeout configuration
    
    - UAT and PPS should be updated to the tagged versions (-> Luca)
    
    - Need to check the YUM repo for aquamarine releases (point to storage-ci, not dss-ci).
    
    - can now limit access to domains ("cern.ch") à la AFS (BE software distribution).
  - 8
    new Namespace
    
    Minutes
    
    Speaker: Elvin Alin Sindrilaru (CERN)
    
    New-catalogue tests (Massimo)
    
    EOSPPS, xroot:4.8.0-0.rc1, fusex: 4.8.0-0.rc1
    
    Client machines in Wigner (eospluswig701.cern.ch) to minimise the MGM-client latency
    
    Xmas tree (ladder of directories 1000 level, 5000 directories):
    
    This test uses FUSEX as well
    
    Factor of 3 faster than old FUSE (both creation and rm -fr) with client and server in Meyrin
    
    Comparable to AFS (client and server in Meyrin)):
    
    #-bash-4.2$ ./ladder.py /eos/pps/users/laman #ladder is going to create 5050 directories #Dir creation: 5.528661 s (5050 dirs) AFS erratic between 10 s and 23 s #Dir removal: 15.508470 s (5050 dirs) AFS erratic between 8 s and 26 s
    
    Large directory
    
    This test is based on eos commands (command line and via python binding)
    
    One dir is created containing a lot of subdir on the same level (d/1, d/2, ... ,d/1000000)
    
    mkdir runs at ~13kHz
    
    In the past, bad behavior of eos fileinfo d (when d contains >>1M files)
    
    10^5 files
    
    eos mkdir for the last dir takes 22 ms
    
    eos ls takes 1.8 s
    
    eos rm -r takes 3.5 s
    
    eos fileinfo d directory: 17 ms
    
    1M files
    
    eos mkdir for the last dir takes 30 ms
    
    eos ls takes 2.4 s (1M directories; output truncated at 50000), See AOB
    
    eos rm -r takes 12.7 s (second test 10.9)
    
    eos fileinfo d directory: 17 ms
    
    10M files
    
    eos mkdir for the last dir takes 36 ms
    
    eos ls takes 4.8 s (1M directories; output truncated at 50000), See AOB
    
    eos rm -r aborts (the dir is a unclear status: visible from fusex, not visible from els ls: (errc=2) (No such file or directory)
    
    eos fileinfo d directory: 20 ms
    
    Rate tests
    
    Parallel mkdir from 4 (5) nodes. Single stream is ~1300 Hz, 4 streams is ~2700 Hz, 5 streams is ~ 2000 Hz.
    
    AOB
    
    Since (I understand) eos find runs server side, why do I get this?
    
    -bash-4.2$ eos find --count /eos/pps/users/laman/largedir nfiles=0 ndirectories=50001 warning: find results are limited for you to ndirs=50000 - result is truncated! (errc=7) (Argument list too long)
    
    (not an admin? need special powers for this)
- other: pilot services, long-term dev, external
  - 9
    
    Webservice
    
    Speaker: Luca Mascetti (CERN)
  - 10
    
    Backup
    
    Speaker: Luca Mascetti (CERN)
  - 11
    
    Samba
    
    Speaker: Luca Mascetti (CERN)
  - 12
    
    $HOME structure
    
    Speaker: Luca Mascetti (CERN)
  - 13
    
    BATCH integration
    
    Speaker: Massimo Lamanna (CERN)
  - 14
    
    Xrootd
    
    Speaker: Michal Kamil Simon (CERN)
  - 15
    AOB
    
    Minutes
    
    Kuba has created testcases, passed on Jozsef.
    
    should test more the multi-client
    
    mount /scratch on PLUS/AIADM "qa" machines (Dan to do merge request).

Choose timezone

EOS DevOps Meeting

513/R-068

CERN

● production instances

EOSATLAS

EOSCMS

● CERNBOX and EOSUSER

● FUSE and client versions

● Citrine rollout

EOSALICE+EOSPUBLIC

● SWAN

● nextgen FUSE

● new Namespace

● AOB

EOSATLAS

EOSCMS

EOSALICE+EOSPUBLIC