Executive summary

Done:
- Kibana [1] and Grafana [2] visualizations of the FTS information exist, as well as code to perform queries.
In progress:
- SNOW request for InfluxDB at monitoring instance to store combined FTS & IP prefix info
- presentation slides for LHCONE/LHCOPN meeting
- writing IP prefixes into AGIS
Next:
- more plots (pre-staging stats, ongoing transfers)
- talk with Joaquin about estimated time-to-complete for file transfers
- collect ideas for authentication to API
- finish presentation
- use information about empty space in data centers to refine IP prefix estimate
Next meetings:
- ATLAS meeting: 17 June 4 PM (@Mario: correct?)
- next NOTED meeting 1 July 3:30 PM

What was done

There now exist Kibana [1] and Grafana [2] visualizations of the FTS information, as well as Python code to automatically query the FTS database via the Grafana proxy.
This makes it possible to add the IP prefix information and upload the enriched data to another database.
This database could be an InfluxDB instance at the monitoring service (SNOW request already open), accessed via the Grafana proxy; there is also the option to create a KPI [3].
Alternatively, a monitoring-independent ElasticSearch database could be used.
AGIS has confirmed that they can integrate the IP prefix information; they have been given the parsed twiki table and are currently putting it into AGIS.
The presentation slides are in progress.
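
The enrichment step described above (adding IP prefix information to FTS records) could look roughly like the sketch below. It is only an illustration under assumptions: the prefix table, the site names, and the record fields `src_ip` / `dst_ip` are hypothetical placeholders (documentation address ranges, not real site data), and if the FTS records only carry hostnames these would have to be resolved to addresses first. Only the Python standard library is used.

```python
import ipaddress

# Hypothetical prefix table: site name -> announced prefixes.
# These are documentation-only ranges, not real site data.
SITE_PREFIXES = {
    "SITE_A": ["192.0.2.0/24", "2001:db8:a::/48"],
    "SITE_B": ["198.51.100.0/24"],
}

# Parse the prefixes once so each lookup only does containment checks.
PARSED = [
    (ipaddress.ip_network(prefix), site)
    for site, prefixes in SITE_PREFIXES.items()
    for prefix in prefixes
]


def lookup_prefix(address):
    """Return (prefix, site) of the longest matching prefix, or (None, None)."""
    ip = ipaddress.ip_address(address)
    matches = [
        (net, site)
        for net, site in PARSED
        if net.version == ip.version and ip in net
    ]
    if not matches:
        return None, None
    net, site = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), site


def enrich(record):
    """Return a copy of a transfer record with prefix info for both endpoints.

    The field names src_ip / dst_ip are assumptions for this sketch.
    """
    enriched = dict(record)
    for side in ("src", "dst"):
        prefix, site = lookup_prefix(record[f"{side}_ip"])
        enriched[f"{side}_prefix"] = prefix
        enriched[f"{side}_site"] = site
    return enriched
```

The enriched dictionaries could then be written to whichever database is chosen (InfluxDB or ElasticSearch); that upload step is left out here because it depends on the final setup.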

Key Insights

- file transfers are more likely to go to data centers with a lot of empty space
    -> use information on which data centers were recently "cleaned up"
        to refine the estimate of IP prefixes

- files that first need to be fetched from tape have a special status:
    in an initial pre-staging phase they are written to disk, and only then transferred.
    This process sometimes takes a day, sometimes only a few hours (difficult to predict).

- Joaquin has been working on estimating the transfer time (for queuing and transmitting)
    -> can give good input and feedback to the project

- providing an entirely public REST API is dangerous,
    since the queries could overload the database
    -> need some form of authentication
    -> ask at LHCONE/LHCOPN meeting

- since the IP prefixes are now integrated in AGIS, they will (with very high probability)
    also be integrated in CRIC

- there is a public monitoring instance in Chicago
    -> look at what sort of authentication they use, and whether query interface is public
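
As one idea for the authentication point above (a public REST API could overload the database), the sketch below combines a simple API-key check with a per-client token-bucket rate limit. This is only a possible starting point, not a decided design: the key store, the client name, and the limits (1 request per second, bursts of 5) are made-up values for illustration.

```python
import hmac
import time

# Hypothetical key store; a real service would load keys from a secret store.
API_KEYS = {"example-key-123": "noted-client"}


class TokenBucket:
    """Allow `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Refill according to elapsed time, then try to spend one token."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


BUCKETS = {}  # client name -> TokenBucket


def authorize(api_key, now=None):
    """Return 'ok', 'unauthorized', or 'rate_limited' for one request."""
    client = None
    for known, name in API_KEYS.items():
        # Constant-time comparison to avoid leaking key contents via timing.
        if hmac.compare_digest(known.encode(), api_key.encode()):
            client = name
    if client is None:
        return "unauthorized"
    if client not in BUCKETS:
        BUCKETS[client] = TokenBucket(rate=1.0, capacity=5, now=now)
    return "ok" if BUCKETS[client].allow(now) else "rate_limited"
```

The `now` parameter exists only so the behaviour can be checked with fixed timestamps; in normal use it is omitted and the monotonic clock is used.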

Decisions

- since CRIC/AGIS has (or will have) the IP prefixes integrated,
    we don't need to store the information in RIPE (for our purposes)

Next steps

Mario:
- put into contact with Joaquin (already done, thanks a lot!)
- send time/place of ATLAS meeting,
    and how many minutes the presentation should approximately take
    (time would be Monday, 17 June, 4 pm, right?)

Andrea:
- explain information on pre-staging stats

Coralie:
- write code to combine the IP prefix and FTS information
    and then push it to the new InfluxDB
- finish presentation
    - integrate: what kind of information is available in the databases
    - ask what sort of authentication is preferred
- add plots
    - ongoing transfers: what percentage of the data is left to transmit
    - pre-staging stats
- talk with Joaquin about estimated time-to-complete for file transfers
- collect ideas for authentication to API
- use information about empty space in data centers to refine IP prefix estimate

Next meeting

1 July, 3:30 pm, 31/S-010

[1] https://monit-kibana.cern.ch/kibana/app/kibana::/dashboard/AWq_fXKqNqzpRjqCuVZ_?_g=()

[2] https://monit-grafana.cern.ch/d/G5oAZRZZk/fts-noted?refresh=5s&orgId=25&from=1558509377126&to=1558509977127&var-group_by=vo&var-vo=All

[3] http://monit-docs.web.cern.ch/monit-docs/ingestion/service_metrics.html#sending-kpis