Data Knowledge Catalog Meeting (TPU/NRC KI)

Europe/Moscow
Other Institutes

Vidyo room
Marina Golosova (National Research Centre Kurchatov Institute (RU))
Description
Vidyo room 'DKB'

DKB@ATLAS instance

Current status: OK.

Issues:

  • Stage ???: invalid message generated (Stage 040 (progress) reports a missing field task_timestamp);
  • Stage 009 (oracleConnector): offset is committed before the whole process has finished;
  • Stage 069 (upload2ES): input data are valid JSONs, but ES bulk API reports:
    Malformed action/metadata line [13], expected START_OBJECT or END_OBJECT but found [VALUE_STRING].
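
One possible cause of the bulk API error above (an assumption, not a confirmed diagnosis) is that action/metadata lines and source lines get out of step, so a source document lands where an action line is expected. A minimal sketch of a pre-upload check follows; the function name and the sample field taskid are hypothetical and not taken from the Stage 069 code.

```python
import json

# Action names accepted by the ES bulk API; "delete" is not followed by a source line.
BULK_ACTIONS = {"index", "create", "update", "delete"}


def find_malformed_action_lines(payload):
    """Return 1-based numbers of lines that should be action/metadata lines but are not."""
    bad = []
    expect_source = False
    for number, line in enumerate(payload.split("\n"), start=1):
        if not line.strip():
            continue  # ignore empty lines (e.g. after the trailing newline)
        if expect_source:
            expect_source = False  # this line is the source document, skip it
            continue
        try:
            obj = json.loads(line)
        except ValueError:
            bad.append(number)
            continue
        keys = list(obj) if isinstance(obj, dict) else []
        # An action line is an object with exactly one key (the action name)
        # whose value is itself an object holding the metadata.
        if len(keys) != 1 or keys[0] not in BULK_ACTIONS or not isinstance(obj[keys[0]], dict):
            bad.append(number)
            continue
        expect_source = keys[0] != "delete"
    return bad


# Hypothetical payload: the third line is a source document with no action line before it.
sample = "\n".join([
    '{"index": {"_id": "1"}}',
    '{"taskid": 1}',
    '{"taskid": 2}',
    '{"index": {"_id": "3"}}',
    '{"taskid": 3}',
]) + "\n"
print(find_malformed_action_lines(sample))
```

Every line in the sample is valid JSON on its own, yet the check still flags the third line, which matches the symptom reported for Stage 069.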

Git repository

  1. GC:
    • unused files removed (not as many as one might expect);
    • Virtuoso-related things moved to /Utils/Virtuoso.
  2. Docs:
    • HTML version removed;
    • PDF version auto-generated and updated via (auto-created and auto-updated) PR to master.

Batch processing

Worker-driven version is under construction.

Implementation details to be discussed via e-mail.

ES schema update (parent/child → nested field)

Plan:

  • new mapping (#391);
  • reindex all data (do we have enough disk space?);
  • new integration process (#392);
  • run the new integration process in parallel with the current one (on a daily basis?);
  • new API endpoint that works with the new schema;
  • performance tests;
  • drop the old version.
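
The core of the parent/child → nested change can be sketched as follows: child documents are embedded into their parent under a single field that the new mapping declares as nested. This is only an illustration under stated assumptions. The field names taskid, output_dataset, taskname and datasetname are hypothetical, and the actual mapping and integration process are the ones defined in #391 and #392.

```python
from collections import defaultdict

# Hypothetical mapping fragment: the field holding embedded children is of type "nested".
NESTED_MAPPING = {"properties": {"output_dataset": {"type": "nested"}}}


def to_nested(parents, children, parent_key="taskid", nested_field="output_dataset"):
    """Embed each child document into its parent under `nested_field`."""
    by_parent = defaultdict(list)
    for child in children:
        child = dict(child)  # copy, so the caller's documents stay untouched
        by_parent[child.pop(parent_key)].append(child)
    # Parents without children get an empty list.
    return [dict(p, **{nested_field: by_parent.get(p[parent_key], [])}) for p in parents]


# Hypothetical sample data: two parent (task) documents and two child (dataset) documents.
tasks = [{"taskid": 1, "taskname": "a"}, {"taskid": 2, "taskname": "b"}]
datasets = [{"taskid": 1, "datasetname": "x"}, {"taskid": 1, "datasetname": "y"}]
for doc in to_nested(tasks, datasets):
    print(doc)
```

With nested fields, any change to a child means reindexing the whole parent document, which is one reason to compare both versions in the performance tests before dropping the old one.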

ToDo:

  • inform M.Borodin that direct queries to ES will soon stop working (mgolosova).

Ivannikov WS (paper)

Deadline: 2020/08/25 (next Tuesday!)

Requirement: 3-7 pages (eng), 8-20 pages (rus)

ToDo:

  • make a decision (come up with a topic or withdraw) before the end of the week (tomorrow!) (mgolosova).