INVITATION

Dear Invenio developer or user,

We would like to announce the fourth Invenio User Group Workshop, to be held by the Heinz Maier-Leibnitz Zentrum (MLZ) on the research campus of Garching from Tuesday, 21 March to Friday, 24 March 2017. This workshop is jointly organized by CERN and MLZ for JOIN2. It is intended for Invenio administrators and will consist of a series of lectures, practical exercises, and discussions with Invenio developers. The goal is to enable a better understanding of Invenio features and capabilities, and to discuss specific needs, forthcoming features and developments, etc.

The Invenio User Group Workshop 2017 will address a wide range of topics related to the practical aspects of running digital repository services. We welcome proposals for presentations, especially on the following themes:

1. Invenio for libraries
2. Invenio in the Open Access world
3. Invenio for service managers
4. Invenio for multimedia
5. Invenio for research data

Proposals for tutorials are also welcome, with practical hands-on sessions aimed at developers or system managers. Any other topics of interest, as well as reports on your own experience with Invenio, are most welcome.

Abstract submission and registration are open now! Please also circulate this workshop link among colleagues who might be interested.

Looking forward to seeing you.
Twitter hashtag: #IUGW2017
JDS (JINR Document Server), the JINR open access repository built on the Invenio platform, has been in operation since 2009. It started with Invenio v0.99 and has since been updated to v1.2.2.
JDS collections include published articles, books, theses, conference proceedings, audio and video materials, etc. Various methods are used to ingest documents into JDS and update its content: submission by authors, harvesting, and (automatic) uploading. Further development of JDS is connected with the project "JINR corporate information system", aimed at providing information support for the scientific research performed at JINR. Within this project we are creating an "Authority" collection, which is intended to be the core of the system.
Invenio's database design and interfaces are optimized for fast end-user search and retrieval. As administrators, we can add indexes at will and use them via the web interface or the API. However, many maintenance tasks are not well covered by those indexes.
For most of those cases, reading the records sequentially is the optimal solution. However, if the database is large enough, reading them via the Invenio API may take hours, while the system slows down and may become unresponsive.
In this presentation I'll show a small Python tool that uses the Invenio API and an SQLite database as a cache to keep an up-to-date flat file of your bibliographic records.
We'll see how, with this flat file, it is much faster and easier to do tasks like generating specialised statistics, quality control, automatic record enrichment or cleaning, or even creating exotic indexes or counters.
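A minimal sketch of the caching idea, not the actual tool: fetch_modified_records stands in for whatever Invenio API call retrieves records changed since a given date, and the field layout is an illustrative assumption.

```python
import sqlite3

def update_cache(db_path, since, fetch_modified_records):
    # fetch_modified_records is a hypothetical callable yielding
    # (recid, modified, xml) tuples for records changed since `since`.
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS records
                   (recid INTEGER PRIMARY KEY,
                    modified TEXT,
                    xml TEXT)""")
    # Only changed records go through the API; everything else is
    # already in the local cache, so full rescans are cheap.
    for recid, modified, xml in fetch_modified_records(since):
        con.execute("INSERT OR REPLACE INTO records VALUES (?, ?, ?)",
                    (recid, modified, xml))
    con.commit()
    return con

def dump_flat_file(con, path):
    # A sequential dump of the cache is a purely local operation, so
    # statistics or cleaning scripts never touch the live instance.
    with open(path, "w") as out:
        for (xml,) in con.execute("SELECT xml FROM records ORDER BY recid"):
            out.write(xml + "\n")
```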
We present an extension to the Invenio 1.1 software for semi-automatically harvesting the ORCID IDs of users and allowing them to upload publications to their respective ORCID profiles. This extension was created in the context of the JOIN2 initiative; however, it can easily be adapted to other Invenio instances because it is only loosely coupled with Invenio itself. It opens its own local webserver to handle the additional endpoints, and calls Invenio API functions and command line programs to interact with the database. We also present a recommended workflow for successfully harvesting ORCID IDs in an institution. The implementation is realised in well-documented Python 2.6 and Go and will be published as Free Software.
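A minimal sketch of the standalone-webserver pattern described above, shown in Python 3 for brevity (the actual extension targets Python 2.6 and Go); the endpoint path, port, and handling logic are illustrative assumptions.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class OrcidHandler(BaseHTTPRequestHandler):
    # A separate process serves the extra endpoints on its own port,
    # so Invenio itself stays untouched.
    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/orcid/callback":
            # Hypothetical OAuth callback: the authorization code sent
            # back by ORCID would be exchanged for a token and stored
            # next to the user record via the Invenio API.
            code = parse_qs(url.query).get("code", [""])[0]
            print("received authorization code:", code)
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ORCID linked, you may close this window.")
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("localhost", 8912), OrcidHandler).serve_forever()
```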
As a JOIN2 partner, the DESY library already uses Invenio for its publication database and institutional repository. The next logical step is to also migrate the library catalogue from the currently used Aleph system to Invenio. The talk starts with a short introduction of how to migrate from Aleph; this includes the migration of bibliographic data as well as holdings, but also movement data, current loans, etc.
The talk also outlines some of the new additions required to run Invenio as an ILS at DESY on top of the existing infrastructure. For example, DESY needs to interact with RFID-based self-service terminals, barcode-based library cards, and external patrons who have no DESY account.
An important basis of the common JOIN2 repository infrastructure of DESY, DKFZ, FZJ, GSI, MLZ, and RWTH Aachen are about 134,000 authority records for grants, projects, large-scale infrastructures, cooperations, journals, and various kinds of keys. All instances use these authorities jointly.
We will present how these authority data are used for different purposes, e.g. the recent and upcoming reporting obligations with regard to our funding and the data export to OpenAIRE. Furthermore, we discuss this in relation to the German "Kerndatensatz Forschung", which will be the standard for future reporting.
When harvesting information from different sources it is necessary to identify identical objects. If both records carry the same unique identifier, such as a DOI or a report number, this is trivial, but unfortunately that is a rare case.
Most of the time, matching is based mainly on author and title information. However, titles may change significantly from preprint to publication, and depending on the type of the publication (journal paper, conference contribution, thesis) even identical basic metadata would still correspond to separate records.
In general a two-step process is needed:
a) Search for potential candidates. Here it is necessary to define a search query with a high efficiency, i.e. one that rarely misses the true match. However, if the search is too fuzzy, the number of records returned becomes too large and matching is no longer feasible. Restricting the search to a limited scope of records helps.
b) Confirmation of the match. Depending on the strategy, clear results can be treated automatically, whereas doubtful cases might be presented to a human for the final decision. In both cases it is essential to have enough information; a sketch of this step follows below.
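As an illustration of step b), a minimal confirmation sketch based on normalized title similarity; the thresholds and field names are illustrative assumptions, and a production matcher would combine several such signals.

```python
from difflib import SequenceMatcher

def normalize(title):
    # Lowercase and drop punctuation so casing and formatting
    # differences between preprint and publication do not hurt.
    return "".join(c for c in title.lower() if c.isalnum() or c.isspace())

def title_similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def classify_match(candidate, record, auto=0.95, review=0.80):
    # Clear results are decided automatically, doubtful cases are
    # queued for a human decision. Thresholds are assumptions.
    score = title_similarity(candidate["title"], record["title"])
    if score >= auto and candidate.get("authors") == record.get("authors"):
        return "match"
    if score >= review:
        return "human-review"
    return "no-match"
```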
For a reliable match, good-quality, uniform metadata is essential, and in many cases processing of content information such as the abstract, references, or full text is needed.
Once two records have been identified as equal, or existing information receives an update, the information needs to be merged. There are obvious cases where one source always supersedes another, and some information may come from only one source. But adding, e.g., an ORCID from one source to the author and affiliation from another requires identifying the corresponding information.
Experience from INSPIRE shows what is currently done (fields with controlled vocabulary), what is doable (fields where the content can be identified), and where merging is not feasible, so that one version simply overwrites the other.
What can be done automatically, which tools are needed, and when is human intervention necessary? When is it worthwhile to overwrite (i.e. delete) manually curated, high-quality information?
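A minimal sketch of a field-wise merge policy of the kind described; the per-field rules below are illustrative assumptions, not INSPIRE's actual policy.

```python
# Each field gets its own rule for which source wins.
PREFER_UPDATE = {"title", "journal"}      # publisher data supersedes
PREFER_EXISTING = {"abstract"}            # keep the curated version
UNION = {"report_numbers", "orcids"}      # combine both sources

def merge(existing, update):
    merged = dict(existing)
    for field, value in update.items():
        if field in UNION:
            merged[field] = sorted(set(existing.get(field, [])) | set(value))
        elif field in PREFER_EXISTING and field in existing:
            continue  # doubtful overwrite: leave curated data alone
        elif field in PREFER_UPDATE or field not in existing:
            merged[field] = value
        else:
            # No rule: flag for human review instead of silently
            # overwriting manually curated information.
            merged.setdefault("_conflicts", []).append(field)
    return merged
```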
Institutions and funders are pushing open access forward with ever new guidelines and policies. Since institutional repositories are important pillars of green open access, they should support easy and fast workflows for researchers and libraries to release publications. Based on the requirements of researchers, libraries, and publishers, possible supporting software extensions are discussed. What does a typical workflow look like? What has to be considered by the researchers and by the editors in the library before releasing a green open access publication? Where and how can software support and improve existing workflows?
The publication landscape is about to change. While largely operated through subscription-based journals in the past, recent political decisions are pushing the publishing industry towards Open Access. In particular, the publication of the Finch report in 2012 put APC-based Gold Open Access models on the agenda almost everywhere. These models require considerable adaptations to library workflows to handle payments, bills, and centralized funds for publication fees. While sometimes handled in specialized systems (e.g. the first setups in Jülich), discussions started early on about handling APCs in the local repository, which would also hold the Open Access content resulting from these fees; e.g. the University of Regensburg uses ePrints for this purpose.
Backed by the Open Data movement, libraries also saw an opportunity to exchange data about the fees paid. Thus, OpenAPC.de was born on GitHub in 2014 to facilitate this exchange and aggregate large amounts of data for evaluation and comparison. With the repository already holding the payment data, delivery via OAI-PMH is an obvious choice. Thus, JOIN2 and the University of Regensburg developed an interchange format for APC data that allows easy and automatic delivery to OpenAPC.
This talk outlines a working solution for APC management and the hook-up with OpenAPC, based on Invenio as implemented in JOIN2.
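A minimal sketch of the delivery path from the harvester's side, using the Sickle OAI-PMH client; the endpoint URL, set name, and metadata prefix are illustrative assumptions, not the registered names of the actual interchange format.

```python
from sickle import Sickle

# Harvest APC data exposed by a repository over OAI-PMH, the way an
# OpenAPC-style aggregator could consume it.
harvester = Sickle("https://repository.example.org/oai2d")
records = harvester.ListRecords(metadataPrefix="oai_dc", set="apc")
for record in records:
    # Each record would carry the APC fields of the interchange
    # format (e.g. amount, currency, DOI of the paid article).
    print(record.header.identifier, record.metadata.get("identifier"))
```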
Research data management is a duty a university or research institute can no longer ignore. But setting up a suitable infrastructure is cumbersome and as yet ill-supported by national or international infrastructures, in particular in Germany [1]. At the same time, monolithic IT solutions encompassing the whole data lifecycle as well as the entire university or research institute are not an option, since development is too rapid and far too many changes and disciplines are involved, in particular when looking into solutions that really support individual research units.
There are some prominent EU-funded projects, mainly ZENODO (http://zenodo.org) and EUDAT (http://eudat.eu), that make use of the Invenio framework, mainly for publishing research data.
Yet publishing is only one component of research data management. What about keeping data that is not to be published, long-term preservation, or linking publications to their underlying data? Various approaches and tools support different aspects of research data management and need to be combined into a holistic and adaptable service suite.
This presentation shows how RWTH Aachen University uses Invenio, and in particular the JOIN2 infrastructure, as a module within this service suite. DOI minting, links between data records and to authority records for people, institutes, and projects, and alternative storage facilities are some of the topics that will be addressed. Overall, we point out current achievements as well as open challenges.
[1] Leistung aus Vielfalt: Empfehlungen zu Strukturen, Prozessen und Finanzierung des Forschungsdatenmanagements in Deutschland. Göttingen: Rat für Informationsinfrastrukturen, 2016. URN: urn:nbn:de:101:1-201606229098.
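As a hedged sketch of the DOI-minting step mentioned above, assuming DataCite as the registration agency (the abstract does not name one); the prefix, credentials, and metadata values are placeholders, and the payload follows DataCite's public REST API.

```python
import json
import urllib.request

def mint_doi(prefix, suffix, title, creator, target_url, auth_header):
    # Register and publish a DOI for a data record via DataCite.
    payload = {"data": {"type": "dois", "attributes": {
        "doi": f"{prefix}/{suffix}",
        "titles": [{"title": title}],
        "creators": [{"name": creator}],
        "publisher": "RWTH Aachen University",   # placeholder value
        "publicationYear": 2017,
        "types": {"resourceTypeGeneral": "Dataset"},
        "url": target_url,                       # landing page
        "event": "publish",
    }}}
    req = urllib.request.Request(
        "https://api.datacite.org/dois",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/vnd.api+json",
                 "Authorization": auth_header},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```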
Invenio 3 validates metadata using JSON Schema. This presentation will show how B2Share enables its users to create their own custom schemas and share them with other communities.
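A minimal sketch of the mechanism, using the Python jsonschema library; the example schema and record fields are illustrative assumptions, not B2Share's actual community schemas.

```python
import jsonschema

# Illustrative community-defined schema; field names are assumptions.
community_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "open_access": {"type": "boolean"},
        "keywords": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title"],
}

record = {"title": "Neutron scattering dataset", "open_access": True}

# Invenio 3 runs a check like this against the record on deposit;
# here it is invoked directly for demonstration. A non-conforming
# record raises jsonschema.ValidationError.
jsonschema.validate(instance=record, schema=community_schema)
```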
In this presentation, a new record editor will be presented. The current version under development can be found at https://github.com/inveniosoftware-contrib/ng2-json-editor. This editor uses JSON as its native data format, provides many configuration options, and can handle very large JSON documents. An update on the development status and pointers on how to use it in your own installation will be provided.
This talk will present the different machine learning tools that INSPIRE is developing and integrating in order to automate content selection and curation in a subject-based repository as much as possible.
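A minimal sketch of the kind of content-selection step such tools automate, written as a generic scikit-learn text classifier; the model choice, training fields, and thresholds are illustrative assumptions, not INSPIRE's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy relevance classifier over title/abstract text. TF-IDF plus
# logistic regression is an illustrative assumption.
texts = ["Measurement of the Higgs boson mass ...",
         "Annual report of the cafeteria committee ..."]
labels = [1, 0]  # 1 = in scope for the repository, 0 = rejected

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(texts, labels)

# Newly harvested records get a relevance score; borderline cases
# can be routed to a human curator instead of being auto-decided.
print(model.predict_proba(["Search for dark matter at the LHC"])[0])
```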