Attended:

David, Alessandra, Julia, Pablo, Hector, Andrea, Nicolo, Maarten, Marian, Salvatore, Pepe

Apologies: Lionel

 

Open tickets

_____________________________________________________

Out of 65 tickets, 12 are open.

Walked through the open tickets

Tickets relating to the SAM/Nagios modification/simplification should move to a different tracker. Marian will take care of this.

The test scheduling timeout issue will be discussed at the December GDB. Marian will present statistics on probe timeouts for condorG and cream-CE submissions. Compared to February this year, the situation has improved substantially. However, there is no ideal solution for it.

 

SAM3 in production

_____________________________________________________

Round table

CMS (Andrea)

Andrea is working on a script to extract site availability from SAM3 and import it into SSB. The comparison of availability results between the two systems is not yet automated, though a manual comparison looks good. Andrea intends to spend several days on the comparison, after which SAM3 would become part of the production monitoring flow (integrated with site readiness, SSB, etc.).

 

ALICE (Maarten)

No complaints. ALICE does not use SAM as much as ATLAS and CMS. Maarten does not foresee any problems.

 

 

ATLAS (Alessandra and Salvatore)

Salvatore does not see any issues apart from the fact that some of the tests are not visible in the UI, but this is because they were not added to the profile. A comparison of the availability metrics was performed and looks fine; where there were differences, they were for good reason.

 

Nobody from LHCb attended; Pablo will ask Stefano after the meeting.

 

No show stoppers so far.

We will decide by the end of the month whether to keep running SAM2 for one more month. The current plan is to block web access to SAM2 by the middle of December and, if nobody complains after that, to stop all other components.

People should now investigate any dependencies on SAM2 in the experiment systems and upgrade if needed.

 

ATLAS asked for the programmatic interface that SAM2 provided. Pablo said that he would like to know which APIs are needed and then enable those, rather than making a full copy of all the APIs that existed in SAM2.

 

Access to the SAM3 UI: the UI can contain information that is sensitive for the sites, so access will be protected with a certificate. This is also relevant for the APIs. We should check that the experiments are ready to use the APIs with secure access.

Pablo pointed out to Pepe that the site Nagios plugin has to be modified as well to be able to use secure access.

 

SAM3 recomputations

____________________________________________________

Pablo is happy that the recomputations will no longer be done centrally. The instructions for doing recomputations were validated by Andrea, who went through the exercise. According to Andrea, it is pretty straightforward, and he has already passed the instructions to the site support team.

 

How recomputations are handled and requested inside the experiments is up to the experiments. However, since sites do request recomputations, they should know how to request one and what the acceptance procedure is. The suggestion is to continue handling requests through GGUS tickets, but the tickets should be assigned to the experiment teams rather than to the WLCG monitoring team. It was decided to keep a common tracker, at least in the beginning. 10 days for recomputations; reports have to be ready before the MB meeting. The entry point should stay the same for the sites: the person on monitoring shift re-assigns the ticket to the corresponding team in the experiment.

Andrea mentioned that in CMS the request may be submitted directly in the CMS internal tracker, in which case it will not be visible outside CMS, but this is not a problem.

 

AOB

__________________________

What to do if someone asks for the history of the last n hours (days) and the range includes an incomplete hour (day)?

Possible alternatives:

a) Exclude the current bin

b) Include the current bin, so that you get a bit bigger time interval than you requested

c) Count the latest incomplete bin as a complete one (slightly shorter time range than requested)

 

SAM2 currently returns only complete bins, ignoring the latest incomplete time bin. This is not optimal, since it does not allow the history to be seen together with the latest evaluation.

Option (c), the last one, was chosen.
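For illustration only, the three alternatives can be sketched in Python as below. This is not the SAM3 implementation: the function name `history_bins`, its parameters, and the choice to align bins to a fixed epoch are all assumptions made for the example. The function returns the start times of the bins that would be reported for a request of the last n bins.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumption: bins are aligned to this origin


def history_bins(now, n, bin_size, policy):
    """Return the start times of the bins reported for 'the last n bins'.

    policy "a": n complete bins, current (incomplete) bin excluded
    policy "b": n complete bins plus the current one (n + 1 bins)
    policy "c": n bins in total, the last one being the current bin
    """
    # Start of the bin that contains `now`, i.e. the incomplete bin.
    current = now - (now - EPOCH) % bin_size

    if policy == "a":
        offsets = range(n, 0, -1)       # n, n-1, ..., 1 bins back
    elif policy == "b":
        offsets = range(n, -1, -1)      # n, n-1, ..., 0 bins back
    elif policy == "c":
        offsets = range(n - 1, -1, -1)  # n-1, ..., 0 bins back
    else:
        raise ValueError("policy must be 'a', 'b' or 'c'")

    return [current - k * bin_size for k in offsets]


if __name__ == "__main__":
    # Arbitrary example: three hourly bins requested at 10:25.
    now = datetime(2000, 1, 1, 10, 25)
    for policy in "abc":
        hours = [b.hour for b in history_bins(now, 3, timedelta(hours=1), policy)]
        print(policy, hours)
```

With hourly bins and a request made at 10:25 for the last 3 hours, option (a) reports the bins starting at 07, 08 and 09, option (b) adds the 10:00 bin, and option (c) reports 08, 09 and 10. Only (b) and (c) include the latest evaluation.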

 

Nicolo

The way CMS SEs are discovered, currently taken from SiteDB, needs to change. Taken offline.

Pablo suggested meeting on the 5th of December.