IT-ASDF: Evolution of MySQL High Availability in the DBOD, Phishing protection changes in MS Defender

Europe/Zurich
513/1-024 (CERN)

Florentia Protopsalti (CERN), Roberto Valverde Cameselle (CERN)
Zoom Meeting ID
63445832154
Description
IT Activities and Services Discussion Forum (ASDF)
Host
Jorge Garcia Cuervo
Alternative hosts
Charles Delort, Karolina Przerwa, Stefan Nicolae Stancu, Enrico Bocchi, Nikos Papakyprianou, Pablo Martin Zamora, Ismael Posada Trobo

Evolution of MySQL High Availability in the DBOD

Q: There will be no online replica between two of the steps. If we lose the (only remaining) primary database at this stage, recovery from backup would be needed. How long will this “at risk” situation last? Should we set up some protective measures for this period (like setting our application read-only)?
A: It depends on how you weigh data loss against the unavailability of the SSO. If the primary goes down we can bring it back, but in case of a complete failure we have to restore the full backup. There is no automatic failover, so it would take 35 minutes for Keycloak.


Q: Is it necessary to go back to the backup, given that you have an up-to-date database on the other two machines?
A: There are many ways to do the recovery. Depending on the point at which it fails, there may be a more up-to-date replica.


Q: The size of the SSO DB is 25 GB. What is in it?
A: It contains a very large table with audit logs for authorisation that we have not been able to clean after the incident on the Keycloak database last year. The oldest data is only from December 2022, but this table is indeed the largest one, representing almost 60% of the whole database, and half could be purged.


Q: When will it evolve to Postgres HA?
A: We had requests for it but there are no resources.


Q: There was a PG HA cluster available once until it got killed.
A: There was a pilot in the past that was successfully tested but we had to close it as there were no resources to move it into production.


Q: Does anyone know what makes the Drupal DB so huge? Does it store binary file data in the DB?
A: Drupal keeps all the web pages in the database; only the attachments are stored in CephFS.


 

Phishing protection changes in MS Defender

Q: In the presentation, the title talks about the IT department, one slide talks about a limit of 350 and another about the whole domains. What exactly is enabled, and for whom?
A: The 350 addresses (currently 326) are monitored for incoming emails that try to impersonate them, and the policy is applied to a group of users (the whole IT department). For example, suppose user X is not in the list of 350 but user Y is. If X receives an email that tries to impersonate the display name of Y, the protection is triggered, because the policy applies to user X and the account of Y is protected. In the opposite case the policy is not applied.
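The rule described in this answer can be sketched as a two-part check. This is only one reading of the minutes, not the actual MS Defender logic; the function name, the addresses and the display names are hypothetical.

```python
def impersonation_policy_applies(recipient, impersonated_name,
                                 policy_scope, protected_names):
    """Sketch: the policy acts only if both conditions hold."""
    # Condition 1: the recipient is in the group the policy is applied to
    # (per the minutes, the whole IT department).
    recipient_in_scope = recipient in policy_scope
    # Condition 2: the display name being imitated is on the protected list
    # (per the minutes, limited to 350 entries).
    name_is_protected = impersonated_name in protected_names
    return recipient_in_scope and name_is_protected


# Hypothetical data: X and Y are both covered by the policy,
# but only Y is on the protected list.
policy_scope = {"x@example.org", "y@example.org"}
protected_names = {"User Y"}

# X receives a message imitating "User Y": the policy applies.
x_receives_fake_y = impersonation_policy_applies(
    "x@example.org", "User Y", policy_scope, protected_names)

# Y receives a message imitating "User X": X is not protected, so it does not.
y_receives_fake_x = impersonation_policy_applies(
    "y@example.org", "User X", policy_scope, protected_names)
```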


Q: Who is in the list of 327?
A: Currently it is the IT department (it-dept-dynamic?). IT wants to get the list of the DG, the admin sector accounts, etc.


Q: So we are protected only if someone tries to impersonate us, and not against emails sent by, e.g., Joe B?
A: Yes.


Q: The safety tips were enabled, so will we receive the tips when they are triggered?
A: It is uncertain whether you will receive the tips only for low-confidence impersonations or in every case.


Q: Do we also have to check the Trash folder, apart from Junk?
A: No, only the Junk folder.


Q: As it is difficult to find out from MS what they changed, do we have a way of replaying a test dataset of emails in order to measure what is happening?
A: Only if MS added its assignments to the headers could we use that information to replicate and analyse.


Q: If you cannot do it this way, is there a way to measure and analyse the impact, so that you can qualify a change with a test set of spam emails?
A: It could be possible, but we would need resources. The phishing logic from MS changes all the time, so the emails we test today may not be useful for understanding a future change. This change most probably won’t have any impact. If the change were more drastic, it would make more sense to do it.
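As a rough illustration of what qualifying a change with a test set could look like, the sketch below compares the verdicts recorded for the same messages before and after a change. The message IDs, verdict labels and function name are hypothetical, and how the verdicts would be obtained is left open.

```python
def verdict_changes(before, after):
    """Return {message_id: (old, new)} for messages whose verdict changed."""
    # Only compare messages that were recorded in both runs.
    common_ids = before.keys() & after.keys()
    return {
        message_id: (before[message_id], after[message_id])
        for message_id in common_ids
        if before[message_id] != after[message_id]
    }


# Hypothetical verdicts for the same three test messages.
before = {"m1": "junk", "m2": "inbox", "m3": "inbox"}
after = {"m1": "junk", "m2": "junk", "m3": "inbox"}

changes = verdict_changes(before, after)
changed_fraction = len(changes) / len(before)
```

As the answer points out, such a test set ages quickly, so the comparison is only meaningful for runs made close together in time.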


Q: We go through the MS filter twice. Is it possible to have the first pass run at level 1 and the second pass, after XORLAB, run at level 2?
A: I am not sure how the technical implementation works.


Q: You run round one, then you pass to round two and see whether there is something more. There will not be any change because the filtering will not impact level 2. We can control the numbers so there will not be any noise for the users. Is it possible?
A: It is possible, as there is a special connector between MS and XORLAB, and once the emails come back we whitelist them based on a set of headers set by XORLAB. But there is no way of setting a special type of threshold.
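The header-based whitelisting mentioned in this answer can be sketched as follows. The header name and value are made up for illustration, since the minutes do not list the headers XORLAB sets, and the real rule lives in the mail flow configuration rather than in Python.

```python
from email import message_from_string

# Hypothetical header and value; the actual set used in production is not given.
TRUSTED_HEADERS = {"X-Example-Scanned-By": "xorlab"}


def is_returning_from_scanner(raw_message):
    """Return True if every expected header is present with the expected value."""
    msg = message_from_string(raw_message)
    return all(
        # Header lookup is case-insensitive; a missing header yields None.
        (msg.get(name) or "").strip().lower() == expected
        for name, expected in TRUSTED_HEADERS.items()
    )


scanned = "X-Example-Scanned-By: xorlab\nSubject: hello\n\nbody\n"
fresh = "Subject: hello\n\nbody\n"

scanned_ok = is_returning_from_scanner(scanned)
fresh_ok = is_returning_from_scanner(fresh)
```

A check like this is binary: a message either carries the expected headers or it does not, which matches the remark that no special threshold can be set at this stage.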


Q: Is the non-understandable “AI” part here making a significant difference to our previous approach (which was based on other external datasources or whatever)? We are looking at this aspect of AI as part of other privacy risk assessments.
A: We should have a look and add it to the list of risks.


Q: Does this mean in the past we were able to get explanations about why things were quarantined, and hence had a better understanding?
A: Yes.
