Databases
---------
* most critical service
* RAC + DataGuard, downtime << 1%
* old hardware kept on standby during the transition period
* Streams replication: online --> offline, T0 --> T1, OpenLab collaboration
* CMS: Frontier, Squid
* 3D project for sharing policies and procedures
* 24x7, but still best effort
* with more memory, fewer physical reads
* DB usage increase should be at T1, not T0
* DB dashboard for easy monitoring, technology also available to T1
* most applications guided to 1 preferred node each: better cache utilization,
  less intra-cluster traffic (see sketch at end)
* locked owner accounts to avoid accidents (e.g. drop table) (see sketch at end)
* SAM/GridView are the biggest consumers
* T1: no problems, hardware upgrades foreseen
* power cut:
  - 1 Ethernet switch was not on critical power
  - faulty OEM agent scripts prevented automatic startup
* completing migration to 64-bit and 10.2.0.4, first T0, then T1 (3D)
* Streams setup improvements
* reliable, manageable service
* close collaboration between application developers and DBAs

ATLAS DB
--------
* reprocessing launched at the end of May
* needs ~1k concurrent Oracle sessions (see session-pool sketch at end)
* some sites not yet OK/tested
* 3D streaming to T1 OK
* DCS (slow control) has the largest volumes
* replication to calibration sites OK
* reprocessing: average DB load OK, bursts limited by capacity
* more tests foreseen
* T1 firewall issues --> use a proxy

CASTOR DB
---------
* background bulk query for synchronization between stagers, disk servers and
  name space:
  - slowed down the name server during backup
  - synchronization now suspended during backup, DB disks defragmented
* stager_rm slow in certain cases --> fixed by forcing index use via a hint
  (see sketch at end)
* deadlocks between concurrent requests --> fix coming
* too many concurrent connections (see session-pool sketch at end):
  - lowered the number of connections
  - lowered the number of SRM threads
  - split the DB into several RACs
* increase during CCRC'08-2 not large compared to continuous activity
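
Code sketches
-------------
Preferred-node services (Databases section): guiding an application to one
preferred RAC node is typically done by mapping a per-application database
service to a preferred instance on the server side; the client then connects
through its own service name rather than a generic cluster alias, which keeps
its blocks in one node's cache. A minimal client-side sketch using cx_Oracle;
host, service and account names are hypothetical, not the actual configuration:

    # Connect through a per-application RAC service so sessions land on that
    # application's preferred node (better cache reuse, less intra-cluster
    # traffic). All names and credentials below are hypothetical examples.
    import cx_Oracle

    dsn = cx_Oracle.makedsn("db-cluster.example.org", 1521,
                            service_name="lcg_sam_svc")       # hypothetical service
    conn = cx_Oracle.connect("sam_reader", "secret", dsn)      # hypothetical account
    cur = conn.cursor()
    cur.execute("SELECT sysdate FROM dual")
    print(cur.fetchone())
    conn.close()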
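
Locked owner accounts (Databases section): the idea is that the account owning
the schema stays locked while applications connect through less privileged
reader/writer accounts, so a stray DROP TABLE from an application session
cannot succeed. A minimal sketch of that pattern, with hypothetical account,
service and table names:

    # Lock the schema-owner account; applications use separate accounts that
    # hold only the grants they need. All names below are hypothetical.
    import cx_Oracle

    dsn = cx_Oracle.makedsn("db-cluster.example.org", 1521,
                            service_name="lcg_admin_svc")      # hypothetical
    admin = cx_Oracle.connect("db_admin", "secret", dsn)        # hypothetical DBA account
    cur = admin.cursor()

    # The owner account stays locked except during scheduled schema changes.
    cur.execute("ALTER USER app_owner ACCOUNT LOCK")

    # Day-to-day access goes through accounts with minimal privileges.
    cur.execute("GRANT SELECT ON app_owner.requests TO app_reader")
    cur.execute("GRANT SELECT, INSERT, UPDATE ON app_owner.requests TO app_writer")

    admin.close()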
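
Forcing index use (CASTOR DB section, stager_rm): the slow cases were fixed by
adding an Oracle optimizer hint that names the intended index. The real query
and index are not given in these notes; the sketch below only shows the
/*+ INDEX(...) */ pattern on hypothetical stager-like tables:

    # Force the optimizer to use a specific index for a query that otherwise
    # picks a full scan for some bind values. Schema objects are hypothetical
    # stand-ins, not the real CASTOR stager schema.
    import cx_Oracle

    dsn = cx_Oracle.makedsn("db-cluster.example.org", 1521,
                            service_name="castor_stager_svc")  # hypothetical
    conn = cx_Oracle.connect("stager_reader", "secret", dsn)    # hypothetical
    cur = conn.cursor()

    # /*+ INDEX(alias index_name) */ pins the access path to the named index.
    cur.execute("""
        SELECT /*+ INDEX(r i_subrequest_castorfile) */ r.id, r.status
          FROM subrequest r
         WHERE r.castorfile = :cfid
           AND r.status IN (0, 1, 2)
    """, {"cfid": 42})
    print(cur.fetchall())
    conn.close()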
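
Bounding concurrent sessions (ATLAS ~1k sessions, CASTOR "too many concurrent
connections"): one common client-side way to keep the Oracle session count
under control is a session pool with a hard maximum, so lowering the limit (as
was done for the CASTOR connections and SRM threads) becomes a single
configuration change. A minimal sketch with hypothetical names and sizing:

    # Cap the number of Oracle sessions this client contributes by borrowing
    # connections from a fixed-size pool instead of opening one per thread.
    # Names, credentials and pool sizes are hypothetical.
    import cx_Oracle

    dsn = cx_Oracle.makedsn("db-cluster.example.org", 1521,
                            service_name="castor_srm_svc")      # hypothetical
    pool = cx_Oracle.SessionPool(user="srm_app", password="secret", dsn=dsn,
                                 min=2, max=20, increment=2)     # hypothetical sizing

    def run_query(sql, params=None):
        """Borrow a pooled session, run one query, return the session."""
        conn = pool.acquire()
        try:
            cur = conn.cursor()
            cur.execute(sql, params or {})
            return cur.fetchall()
        finally:
            pool.release(conn)

    print(run_query("SELECT count(*) FROM user_tables"))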