Integration of the new benchmark in the accounting workflow
Attended:
Carlos, Alessandro, Josep, Cecilia, Alessandro, Tom, Greg, Matt, Julia
Excused:
Adrian, Maria, Renato
Discussion of the problem of the EGI portal
Apparently the data duplication in the portal is caused by ServiceLevelType having different values in the EGI repository DB and in the data set received from APEL ('NONE' vs. not defined). Since ServiceLevelType is part of the primary key of the accounting records in the EGI DB, an incoming record whose ServiceLevelType differs from that of the matching record already in the DB is inserted as a new record instead of replacing the existing one. This doubles the accounting metrics when data are aggregated per month.
There are two problems there:
- Why does APEL data arrive with ServiceLevelType not properly defined (Greg's example for RAL)? This is not yet clear and needs more debugging.
- How to avoid duplication with data which is already in the EGI DB?
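The duplication mechanism described above can be reproduced with a minimal sketch. Everything in it is an assumption made for illustration: the table and column names, the month and CPU-hour values, the use of SQLite, and the representation of the "not defined" ServiceLevelType as an empty string. The real EGI DB schema and the actual APEL payload may differ.

```python
import sqlite3


def load(records):
    """Upsert records into an in-memory table and return per-month totals.

    The table layout is hypothetical; it only mirrors the fact that
    ServiceLevelType is part of the primary key.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        """CREATE TABLE accounting (
               site TEXT NOT NULL,
               month TEXT NOT NULL,
               service_level_type TEXT NOT NULL,
               cpu_hours REAL NOT NULL,
               PRIMARY KEY (site, month, service_level_type)
           )"""
    )
    # A record with an identical primary key replaces the existing row;
    # a record with a different key is inserted next to it.
    con.executemany(
        "INSERT OR REPLACE INTO accounting VALUES (?, ?, ?, ?)", records
    )
    rows = con.execute(
        "SELECT month, SUM(cpu_hours) FROM accounting "
        "GROUP BY month ORDER BY month"
    ).fetchall()
    con.close()
    return dict(rows)


# Invented sample values, not real accounting data.
existing = ("RAL", "2025-01", "NONE", 1000.0)
resent_same_key = ("RAL", "2025-01", "NONE", 1000.0)
resent_undefined = ("RAL", "2025-01", "", 1000.0)  # '' stands for "not defined"

consistent = load([existing, resent_same_key])
duplicated = load([existing, resent_undefined])

print("same key:     ", consistent)
print("different key:", duplicated)
```

With matching keys the resent record overwrites the stored one and the monthly total is unchanged. With a differing ServiceLevelType both rows are kept and the monthly sum doubles.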
Julia suggested cleaning the target tables in the EGI DB, restarting the APEL collector and seeing what happens. The assumption is that after data is re-fed from APEL into empty tables, the accounting metrics should be consistent with those in the current backup (before data duplication). If that is the case, the only remaining task is to fix the ServiceLevelType for data arriving from APEL: it should be defined as HEPSCORE or HEPSPEC. The EGI repository tables should then be cleaned again and the APEL collector restarted.
Cecilia said that the QUANTA team will not do it on Friday before the weekend, but rather on Monday. Adrian should be back on Monday, which will certainly help with debugging the problem.
Julia asked about the topology issue. She explained why WLCG topology should take CRIC as its primary information source. Since the QUANTA team is overwhelmed with current tasks, Julia suggested that the WLCG Ops team provide a script which would look for inconsistencies in the topology between CRIC, GocDB and the EGI portal and send an alarm when such inconsistencies are detected, so that they can be fixed manually.
Julia also thinks that there is currently no topology synchronisation between GocDB and the EGI portal either, since two new T1 sites are not in the T1 list in the EGI portal. Josep and Cecilia will check by the next meeting.
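The topology check proposed above could look roughly like the sketch below. It is not the WLCG Ops script; it only illustrates the comparison step, with CRIC as the reference. The function names, the site names and the alarm text are hypothetical. Fetching the site lists from CRIC, GocDB and the EGI portal is left out on purpose, since the interfaces to use were not discussed.

```python
def find_topology_inconsistencies(sources, primary="CRIC"):
    """Compare each source's site list against the primary source.

    `sources` maps a source name to an iterable of site names.
    Returns {source: {"missing": [...], "unexpected": [...]}} for every
    source that disagrees with the primary one.
    """
    reference = set(sources[primary])
    report = {}
    for name, sites in sources.items():
        if name == primary:
            continue
        current = set(sites)
        missing = sorted(reference - current)      # in CRIC, absent here
        unexpected = sorted(current - reference)   # here, absent in CRIC
        if missing or unexpected:
            report[name] = {"missing": missing, "unexpected": unexpected}
    return report


def format_alarm(report):
    """Render the report as plain-text alarm lines (empty list if consistent)."""
    lines = []
    for name in sorted(report):
        diff = report[name]
        if diff["missing"]:
            lines.append(f"{name}: missing sites {', '.join(diff['missing'])}")
        if diff["unexpected"]:
            lines.append(f"{name}: unexpected sites {', '.join(diff['unexpected'])}")
    return lines


# Hypothetical T1 lists; the site names are placeholders.
t1_lists = {
    "CRIC": ["T1-A", "T1-B", "T1-NEW-1", "T1-NEW-2"],
    "GocDB": ["T1-A", "T1-B", "T1-NEW-1", "T1-NEW-2"],
    "EGI portal": ["T1-A", "T1-B"],
}

report = find_topology_inconsistencies(t1_lists)
for line in format_alarm(report):
    print(line)
```

In this made-up input only the EGI portal list is flagged, which matches the situation described for the two new T1 sites. How the alarm is delivered (mail, ticket, monitoring) is left open.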
Matt said that once the problem with the portal is clarified and resolved, EGI will lead a full postmortem of the incident.
Other discussion
Josep asked where the information for proper A&A for the portal should come from.
Julia said that as far as WLCG is concerned, user-level accounting is not required. She asked whether this feature is required for EGI. If we drop user-level accounting, we can get rid of per-job accounting altogether and the accounting workflow can become much simpler. This requires more investigation on the EGI side to understand whether sites and non-LHC user communities really need per-user and per-job accounting info in the portal.
Next meeting is next Friday, 14th of March 11:30.