Talk
Title Things you can do dumping your Invenio database into a flat file
Author(s) Jorba, Ferran (speaker) (Universitat Autònoma de Barcelona)
Corporate author(s) CERN. Geneva
Imprint 2017-03-22. - Streaming video.
Series (Invenio User Group Workshops)
(Invenio User Group Workshop 2017)
Lecture note on 2017-03-22T09:00:00
Subject category Invenio User Group Workshops
Abstract Invenio's database design and interfaces are optimized for fast end-user search and retrieval. As administrators, we can add indexes at will and use them via the web or the API. However, many maintenance tasks are not well covered by those indexes. For most of those cases, reading the records sequentially is the optimal solution. However, if the database is large enough, reading them via the Invenio API may take hours, during which the system slows down and may become unresponsive. In this presentation I'll show a small Python tool that uses the Invenio API and a SQLite database as a cache to keep an up-to-date flat file of your bibliographic records. We'll see how this flat file makes it much faster and easier to do tasks such as generating specialised statistics, quality control, automatic record enrichment or cleaning, or even creating exotic indexes or counters.
Copyright/License © 2017-2024 CERN
Submitted by jean-yves.le.meur@cern.ch

 Record created 2017-04-13, last modified 2022-11-02
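
The caching idea described in the abstract can be sketched as follows. This is a minimal illustration, not the speaker's actual tool: the names open_cache, sync, dump_flat_file and the callback fetch_modified_since are hypothetical, and the callback stands in for whatever Invenio API call returns the records changed since a given timestamp. Only Python's standard sqlite3 module is used. The sketch assumes each record arrives as a (record id, last-modified stamp, serialized body) tuple and that the stamps sort chronologically as strings (for example ISO 8601).

```python
import sqlite3
from typing import Callable, Iterable, Tuple

# Assumed record shape: (record id, last-modified stamp, serialized record).
Record = Tuple[int, str, str]


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the SQLite cache holding one row per record."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "recid INTEGER PRIMARY KEY, modified TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def sync(conn: sqlite3.Connection,
         fetch_modified_since: Callable[[str], Iterable[Record]]) -> int:
    """Pull only the records changed since the newest stamp already cached.

    fetch_modified_since is a hypothetical callback wrapping the real API.
    Returns the number of rows inserted or replaced.
    """
    row = conn.execute("SELECT MAX(modified) FROM records").fetchone()
    since = row[0] or ""  # empty string on first run: fetch everything
    changed = 0
    with conn:  # one transaction for the whole batch
        for recid, modified, body in fetch_modified_since(since):
            conn.execute(
                "INSERT OR REPLACE INTO records (recid, modified, body) "
                "VALUES (?, ?, ?)",
                (recid, modified, body),
            )
            changed += 1
    return changed


def dump_flat_file(conn: sqlite3.Connection, path: str) -> int:
    """Write one tab-separated line per record, ordered by record id."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        rows = conn.execute("SELECT recid, body FROM records ORDER BY recid")
        for recid, body in rows:
            # Collapse internal whitespace so each record stays on one line.
            out.write(f"{recid}\t{' '.join(body.split())}\n")
            count += 1
    return count
```

With a cache like this, the slow API is only asked for what changed since the previous run, and sequential tasks such as statistics or quality checks read the flat file instead of the live system. Deleted records are not handled in this sketch; a real tool would need some way to learn about them.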


External links:
Talk details
Event details