A test version of a Kibana dashboard displaying /var/eos/report/* for the last 30 days will be available soon.
(Access is restricted (ACL on the public cluster); it currently does not have EOSUSER data. The infrastructure is OK for sensitive data.)
(from Andreas):
EOSUSER updated to 0.3.268 (done Wed night), saw an issue with directory/file visibility. Under discussion.
From Georgios:
From Dan:
From Jan:
Q (Joel): can we block external FUSE access to CERN? A: not for the "old" EOSFUSE (same ports/protocols as other clients). The new EOSFUSEX needs new ports, which could be left closed on the external firewall. Use case?
Current version of EOS FUSE deployed on SWAN: eosd 4.1.30-1 xroot 4.6.1-1
https://twiki.cern.ch/twiki/bin/view/DSSGroup/SwanMachineList
EOS Console [root://localhost] |/eos/dev/fuse/slc7/> fusex ls -l
client : eosxd eos-certify-sl6.cern.ch 4.2.1 online Tue, 14 Nov 2017 09:51:02 GMT 0.62 497.86 5338425e-c921-11e7-ab63-02163e007a0d caps=0
...... ino : 134
...... ino-to-del : 0
...... ino-backlog : 0
...... ino-ever : 136
...... ino-ever-del : 2
...... threads : 40
...... vsize : 0.598 GB
...... rsize : 0.071 GB
eos fusex hb {1..15}
#15 | Source "/root/eos/fusex/kv/RocksKV.cc", line 56, in ~unique_ptr
| 55: /* -------------------------------------------------------------------------- */
| > 56: RocksKV::~RocksKV()
| 57: /* -------------------------------------------------------------------------- */
| 58: {
| Source "/usr/include/c++/4.8.2/bits/unique_ptr.h", line 184, in operator()
| 182: auto& __ptr = std::get<0>(_M_t);
| 183: if (__ptr != nullptr)
| > 184: get_deleter()(__ptr);
| 185: __ptr = pointer();
| 186: }
Source "/usr/include/c++/4.8.2/bits/unique_ptr.h", line 67, in ~RocksKV [0x60521e]
64: {
65: static_assert(sizeof(_Tp)>0,
66: "can't delete pointer to incomplete type");
> 67: delete __ptr;
68: }
69: };
#14 Object "/root/build-master/fusex/eosxd, at 0x696b50, in rocksdb::TransactionDBImpl::~TransactionDBImpl()
Automatically renice the 'eosxd' process to the highest priority when running as root on Linux.
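The renice behaviour can be illustrated with a minimal Python sketch. The real client is C++ and its code is not shown in these notes, so the function names here are hypothetical, and the nice value of -20 is only the usual Linux meaning of "highest priority", not a value confirmed by the source.

```python
import os

# Assumption: "highest priority" means the lowest nice value on Linux.
HIGHEST_PRIORITY_NICE = -20


def target_niceness(euid: int, current: int) -> int:
    """Return the niceness the process should have: -20 for root, unchanged otherwise."""
    return HIGHEST_PRIORITY_NICE if euid == 0 else current


def renice_self_if_root() -> int:
    """Renice the calling process when it runs as root; return the resulting niceness."""
    current = os.getpriority(os.PRIO_PROCESS, 0)  # 0 means "this process"
    wanted = target_niceness(os.geteuid(), current)
    if wanted != current:
        try:
            os.setpriority(os.PRIO_PROCESS, 0, wanted)
        except PermissionError:
            # Root inside a restricted container may still lack CAP_SYS_NICE.
            pass
    return os.getpriority(os.PRIO_PROCESS, 0)
```

Keeping the decision (`target_niceness`) separate from the system call makes the rule easy to check without root privileges.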
# shared mount
mkdir -p /eos/uat/
mount -t fuse eosxd -ofsname=eosuat.cern.ch:/eos/scratch /eos/uat/
# gateway mount
mkdir -p /eos/uat/
mount -t fuse eosxd -ofsname=gw@eosuat.cern.ch:/eos/scratch /eos/uat/
Discussion: the new FUSE client will not put anything into the recycle bin. This is OK for some use cases (and prevents accumulation of tons of junk), but will lead to genuine data loss (and surprise users). Needs to be decided -> JIRA.
metric | EOSLHCB | EOSCMS | EOSBACKUP |
---|---|---|---|
real | 9m3.031s | 23m43.948s | 91m37.767s |
user | 24m57.936s | 69m58.470s | 319m50.037s |
sys | 16m5.058s | 34m33.558s | 155m25.625s |
no. files | ~26M | ~82M | ~600M |
no. dirs | ~10M | ~21M | ~9M |
(Actual numbers would be about 25% higher, because one RocksDB operation is missing.) Size on disk is 130 GB for the biggest instance (EOSBACKUP), similar to the existing size.
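To get a feeling for what the +25% correction means, the "real" timings from the table can be converted and scaled as below. This is only a sketch: it assumes the correction is a flat factor of 1.25 on the wall-clock time, and the helper names are made up for illustration.

```python
import re

# Matches the shell `time` format used in the table, e.g. "9m3.031s".
DURATION = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")

# "real" row of the table above.
REAL_TIMES = {
    "EOSLHCB": "9m3.031s",
    "EOSCMS": "23m43.948s",
    "EOSBACKUP": "91m37.767s",
}


def to_seconds(text: str) -> float:
    """Convert a duration such as '23m43.948s' to seconds."""
    match = DURATION.match(text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def corrected_minutes(text: str, overhead: float = 0.25) -> float:
    """Scale a measured duration by the assumed overhead and return minutes."""
    return to_seconds(text) * (1 + overhead) / 60


if __name__ == "__main__":
    for instance, measured in REAL_TIMES.items():
        print(f"{instance}: {corrected_minutes(measured):.1f} min (with +25%)")
```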
Proposal: move EOSPPS to new namespace (went to 4.2.1 today). Want existing data, want tests running against this.
Might affect parallel BEER tests (just gets us better testing). Decision: do it (Elvin+Luca). Will use the new Puppet profile (for MGM this is just 1 line in config).
(note: can also use new EOS FUSE client against this, but unrelated)
Only 99 jobs could run.
Task 26436 starts at Tue Nov 14 11:54:42 2017 and ends at Tue Nov 14 12:29:10 2017 (34.5 minutes)
Analysed jobs: 99
Correct jobs: 99
Maximum concurrency: 3
Execution hosts (top 5): b64972dff9 [#79] b6b6d2ee00 [#13] b6a9b16987 [#7]
Execution environments (top 5): eos-client-4.2.0-3.el6.x86_64, eos-fuse-core-4.2.0-3.el6.x86_64, xrootd-client-libs-4.7.0-1.el6.i686, xrootd-client-libs-4.7.0-1.el6.x86_64 [#99]
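The figures in this summary can be cross-checked with a few lines of Python. The sketch below only re-derives the task duration from the two timestamps and sums the per-host job counts; the names are illustrative and not part of any test tooling.

```python
from datetime import datetime

# Timestamp layout used in the task summary, e.g. "Tue Nov 14 11:54:42 2017".
FORMAT = "%a %b %d %H:%M:%S %Y"

# Jobs per execution host, copied from the summary above.
HOST_JOBS = {"b64972dff9": 79, "b6b6d2ee00": 13, "b6a9b16987": 7}


def elapsed_minutes(start: str, end: str) -> float:
    """Return the time between two summary timestamps in minutes."""
    delta = datetime.strptime(end, FORMAT) - datetime.strptime(start, FORMAT)
    return delta.total_seconds() / 60


if __name__ == "__main__":
    minutes = elapsed_minutes("Tue Nov 14 11:54:42 2017", "Tue Nov 14 12:29:10 2017")
    print(f"duration: {minutes:.1f} minutes")
    print(f"jobs across hosts: {sum(HOST_JOBS.values())}")
```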
The one job that had a problem was hit by the following error (on bigbird08, i.e. the sched, not the execution node):
171114 12:14:56 time=1510658096.525786 func=open level=INFO logid=0bb5aad2-c92d-11e7-8b04-a4c64f4165ae unit=mgm@eosuser-srv-1tb1.cern.ch:1094 tid=00007f54a25fd700 source=XrdMgmOfsFile:745 tident=AAAAAAAE.124132:736@bigbird08 sec=krb5 uid=15197 gid=1665 name=laman geo="" msg="client acting as directory owner" path="/eos/user/l/laman/condorTest/outputs/condor.26436.53.log"uid="15197=>1665" gid="15197=>1665"
171114 12:14:56 time=1510658096.525922 func=Emsg level=ERROR logid=0bb5aad2-c92d-11e7-8b04-a4c64f4165ae unit=mgm@eosuser-srv-1tb1.cern.ch:1094 tid=00007f54a25fd700 source=XrdMgmOfs:610 tident=AAAAAAAE.124132:736@bigbird08 sec=krb5 uid=15197 gid=1665 name=laman geo="" Unable to access quota space /eos/user/l/laman/condorTest/outputs/condor.26436.53.log; Network is unreachable
171114 12:14:56 time=1510658096.539127 func=FSctl level=INFO logid=0afacd8e-c92d-11e7-8b04-a4c64f4165ae unit=mgm@eosuser-srv-1tb1.cern.ch:1094 tid=00007f4adb1fd700 source=Commit:163 tident=daemon.79611:141@p06253947j30909 sec=sss uid=2 gid=2 name=daemon geo="" subcmd=commit path=/eos/user/l/laman/condorTest/outputs/condor.26436.48.log size=1268 fid=2b04d7f4 fsid=1212 dropfsid=0 checksum=5f6a3c9f mtime=0 mtime.nsec=0 oc-chunk=0 oc-n=0 oc-max=0 oc-uuid=
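MGM log lines like the ones above are mostly `key=value` pairs, so they are easy to filter by script. The sketch below assumes that layout holds in general (inferred only from these three lines) and uses a hypothetical helper name. It extracts the pairs from the ERROR line quoted above and ignores the free-text message at the end.

```python
import re

# key=value, where the value is either double-quoted or a run of non-blanks.
PAIR = re.compile(r'(\w[\w.\-]*)=("[^"]*"|\S*)')

# The ERROR line from the log excerpt above, split only for readability.
SAMPLE = (
    '171114 12:14:56 time=1510658096.525922 func=Emsg level=ERROR '
    'logid=0bb5aad2-c92d-11e7-8b04-a4c64f4165ae '
    'unit=mgm@eosuser-srv-1tb1.cern.ch:1094 tid=00007f54a25fd700 '
    'source=XrdMgmOfs:610 tident=AAAAAAAE.124132:736@bigbird08 sec=krb5 '
    'uid=15197 gid=1665 name=laman geo="" Unable to access quota space '
    '/eos/user/l/laman/condorTest/outputs/condor.26436.53.log; '
    'Network is unreachable'
)


def parse_fields(line: str) -> dict:
    """Return the key=value pairs of one log line (first occurrence of a key wins)."""
    fields = {}
    for key, value in PAIR.findall(line):
        fields.setdefault(key, value.strip('"'))
    return fields


if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    if fields.get("level") == "ERROR":
        print(fields["func"], fields["name"], fields["tident"])
```

Keeping the first occurrence of a key matters for lines such as the `func=open` one, where `uid` and `gid` appear a second time in the message part.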
(NB: if the inline graphs do not show, try "edit" mode.)
Namespace analysis.
File age (creation time) for the last 3 years ("regular files")
File age (creation time) for the last year ("trashed files")
File age (creation time) for the last year ("versions files")
Size distribution. The shape is essentially the same for regular, trash and version files. I do not understand the jagged appearance of the "versions" distribution.