AFS phaseout coordinator meeting

Name: AFS phaseout coordinator meeting
Start: 2019-01-25T10:00:00+01:00
End: 2019-01-25T12:00:00+01:00
Location: CERN

Friday 25 Jan 2019, 10:00 → 12:00 Europe/Zurich

513/1-024 (CERN)

Jan Iven (CERN)

● Intro and AFS phaseout status

(no questions)

● EOS: FUSE to FUSEX

Q: since "autofs" is not working properly on SLC6 - how stable is 'eosxd' on SLC6 compared to CC7?
A (APeters): same stability; the issue on SLC6 is basically autofs, which cannot unmount the unused FS... still under investigation
A (BJones): some problems might not be spotted on CC7, as there is not enough batch capacity on CC7

C (EObreshkov): ATLAS still uses SLC6 a lot, and will use containers with SLC6... struggling to find out why EOS-FUSE is not stable in the containers

Q (LDeniau): the "bad example" is used by BE-ABP: ROOT "hadd()" and CONDOR batch jobs, writing many files in a single dir - what to expect?
A (APeters): Try to split over several directories to reduce crosstalk. If possible, try not to use SQLite, or use it only for reading, as writing generates a lot of flock()s; to boost performance we can try to relax the messaging for each file written
A (Ben): In batch, the default use case is in sandboxes (the input data is loaded, processed and then copied into the shared FS)
A (Jan): see already-existing (AFS-area) best practices for using the shared FS
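
The two suggestions above (splitting output over several directories, and working in a sandbox before copying results to the shared FS) could be combined roughly as follows. This is only a minimal sketch, not the recommended CERN procedure: the function names, the "shard_NN" directory layout and the default of 16 shards are assumptions made for illustration, and the "processing" step is a stand-in that just writes bytes.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def shard_dir(base: Path, filename: str, n_shards: int = 16) -> Path:
    """Pick a stable subdirectory so that files spread over n_shards directories."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % n_shards
    return base / f"shard_{shard:02d}"  # hypothetical naming scheme


def run_job_in_sandbox(shared_base: Path, filename: str, payload: bytes) -> Path:
    """Produce the output in local scratch, then copy it once to the shared area."""
    with tempfile.TemporaryDirectory() as sandbox:
        local_file = Path(sandbox) / filename
        local_file.write_bytes(payload)  # stand-in for the real processing step
        target_dir = shard_dir(shared_base, filename)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / filename
        shutil.copy2(local_file, target)  # single write to the shared FS
    return target
```

Hashing the file name keeps the placement deterministic, so a later reader can find a file again without a directory listing; any other stable scheme (job number modulo N, date-based subdirectories) would serve the same purpose.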

Q (APfeiffer): Can you do a "make -j 20" from a single machine?
A (APeters): In principle yes, at ~the same speed as AFS; it has only been tried with -j 4, just because the machine didn't have enough cores. Counter-question: why would you run the compilation on the shared FS?
A (APfeiffer): output is needed on a shared FS
A (Jan): temporary object files could go outside the shared FS (as best practice - EOS should still work OK, just more slowly)

Q: Why do you not yet recommend using EOS for the home directories?
A (APeters): still some known issues to be fixed before this is made more widely available
A (Jan): nothing major, just that we would not like to waste users' time on things we already know can be improved before making it widely available
Comment: testing the new EOS FUSEX is important for many users, and could easily become a show-stopper if not done in time

Q: What would be the difference between AFS and EOS, provided all things are fixed?
A (APeters): feature-wise, almost on par; high throughput would be more easily achievable; some things would be slower because eosxd runs in user space (workloads with many I/O operations would be slower than on AFS, due to the architecture: FUSE vs in-kernel)
A (Jan): specific AFS workloads (if using AFS commands, not just filesystem) would need to be translated

Q (APfeiffer): Do you plan long term to have a kernel interface for the EOS client?
A (APeters): due to the conservative approach to kernel releases in RHEL/CentOS, we might not want to take this approach

Q (APfeiffer): Would it be possible to push it as a kernel module in order to eliminate part of the performance issues?
A (Jan): Strict kernel policies may prevent the module from being accepted in mainline. Also, the kernel interface is fragile, which contributes to the current AFS trouble.

Q: Are 24h backups available?
A: Yes, in a slightly different way; we have the recycle bin, in which we keep removed data for 6 months
A: No tape backup at the moment, and none envisaged; R&D on having snapshots running every day

Q: Can a directory be recovered completely?
A (APeters): Yes, if removed in one go via "rm -r"; we keep versions for files, as well
A [clarification after meeting]: note: FUSE/FUSEX do not create versions for every open/update/close - see EOS-3194

● EOS operational status

Q (Joel): Management of the ACLs: the interface is not very friendly - i.e. when using CERNBox to set ACLs, there seem to be discrepancies between CERNBox and EOS
A (Luca): we plan a CLI to support ACLs
A (APeters): ACLs are to be kept in sync between Windows and POSIX

Q: Will the policy of not having publicly available directories in HOME be reviewed?
A (Luca): We could relax this particular behaviour from inside the CERN network (for unauthenticated access). We are not planning to give truly anonymous access from outside CERN.

● AFS Phaseout: next steps & planning

During presentation:
Q (APfeiffer): One more thing needed: ACLs (slide 1?)
A: we'll discuss this in the presentation

Q (JClosier): For the default action, why not just delete? (slide 2)
A: if there is no reaction/decision from the owner, we'll try to identify the best solution for each use case

Q: JIRA or SNOW tickets in case of encountering issues? (slide 4)
A: Any kind, please make sure it's linked to the JIRA tracker for the affected AFS area

Q: Why not move scratch to EOS directly, instead of AFS? (slide 5)
A: They might behave differently, so AFS requires less interaction with the user (short notice - first part of 2019). Expect most content to be deleted beforehand.

Q (Joel): "work" is /afs/cern.ch/work, or the initial work directories under the home directories ("w0" under home...)?
A: "work" is /afs/cern.ch/work. The old "w0" areas will be handled as part of the "project scratch space" phaseout (please send an example)

Q: Are there issues with case (in)sensitivity between Windows and Linux?
A: not obvious (yet); things are looking better in more modern OSes. But EOS is case-aware, so at worst we would end up with duplicate, differently-capitalized EOS entries.

● Discussion

After presentation:

Q (Baosong Shan): For the "project" area will there be one instance or more?
A (Jan): in general, number of instances to be kept low.
A (Jan,Luca): negotiable, AMS might get its own instance since already "big".

Q (Baosong): When will EOSPROJECT be ready for testing?
A (Luca): Probably from March

Q (Baosong): Can we test FUSEX+QuarkDB namespace before?
A: Yes, on user/service account home directories - already running the new combination.

Q: Could some users start earlier with testing?
A: Sure. For different communities, arrangements can be made for testing

Q: What if there are plenty of references to different locations? Moving might mean everything or nothing.
A (Massimo): Symlinks should still work as replacement for volume mounts inside AFS, etc.
A (Jan): Only a few of the tools are aware of volume mounts. To prevent unwanted volume traversal, we could replace them with symlinks; at filesystem level this should not break anything (unless you are an AFS admin using "vos release" etc.)
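
As an illustration of the symlink idea only: the sketch below shows, on ordinary local directories, that a path replaced by a symlink keeps resolving for tools that work at filesystem level. The function name and paths are hypothetical, and it does not touch AFS volume mounts themselves (removing a real mount point needs the AFS tools, which are not shown here).

```python
import os
import tempfile
from pathlib import Path


def link_to_new_location(old_path: Path, new_location: Path) -> None:
    """Create a symlink at old_path that points at new_location.

    Assumes old_path no longer exists (e.g. the former mount point
    has already been removed by other means).
    """
    os.symlink(new_location, old_path, target_is_directory=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root_name:
        root = Path(root_name)
        new_area = root / "new_area"  # stand-in for the migrated data
        new_area.mkdir()
        (new_area / "data.txt").write_text("hello")
        old_ref = root / "old_ref"  # stand-in for the old reference path
        link_to_new_location(old_ref, new_area)
        # Existing references through the old path still resolve.
        print((old_ref / "data.txt").read_text())
```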

Q (Frank Locci): BE-CO client machines would need to be up to date with the latest client, but are snapshotted. How can this be done on prod machines which are staged/snapshotted?
A: Suggest freezing the version that seems to be bug-free at that point in time; update when an important bug in the client is fixed for you. Similar model to LXPLUS=LXBATCH (frequent updates) and Linux desktops (which see fewer versions)

Q (Elena Gianolio): What about the AFS web sites?
A: Agreement with the Web team: migration mostly driven by the Web Team => pointer (i.e. URL redirector) could be changed almost at the same time as the website data; would still need validation from the user => test link to be tested => OK from user, before moving to prod
Also: the Web Team is investigating containers, which might provide more security to the websites (they would no longer use the same shared identity to allow the web server access)


- 10:00 → 10:15
  
  Intro and AFS phaseout status 15m
  
  Speaker: Jan Iven (CERN)
  
  NOAFS-20190125-intro.pdf
  
- 10:15 → 10:25
  
  EOS: FUSE to FUSEX 10m
  
  Speaker: Andreas Joachim Peters (CERN)
  
  EOS AFS Phaseout.pdf
  
- 10:25 → 10:35
  
  EOS operational status 10m
  
  Speaker: Luca Mascetti (CERN)
  
  2019-01-25 eos rollout.pdf
  
- 10:35 → 10:55
  
  AFS Phaseout: next steps & planning 20m
  
  Speaker: Jan Iven (CERN)
  
  NOAFS-20190125-planning.pdf
  
- 10:55 → 11:25
  
  Discussion 30m
  