22-24 June 2011
University of Geneva
Europe/Zurich timezone

Tutorial 1. MarcXimiL: near-duplicate detection (and similarity analysis)

22 Jun 2011, 09:00
2h 30m
5183 (University of Geneva)

Speakers

Dr Alain Borel (Ecole Polytechnique Fédérale de Lausanne (EPFL))
Mr Jan Krause (University of Geneva)

Description

MarcXimiL is an open-source tool that works on MARCXML records and calculates similarity indices between them. After a short theoretical introduction, the tutorial will focus on how to install, parametrize and use the tool. The tool can be used to:
* prevent the creation of duplicates (similar records are shown during the validation process)
* identify duplicates in batch files before ingest
* find duplicates inside a collection
* suggest records similar to the one a user found through a search
* match related documents, e.g. preprints and articles
* and so on.
http://marcximil.sourceforge.net/
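To make the idea of a similarity index between MARCXML records concrete, here is a minimal sketch in Python. It is not MarcXimiL's own code, API or algorithm: it only uses the standard library, compares the title field (245 $a) alone, and the function names (`titles`, `similarity`, `similar_pairs`), the 0.8 threshold and the sample records are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Namespace of the MARC21 "slim" XML schema used by MARCXML.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample collection (invented records, for illustration only).
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">1</controlfield>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">Near duplicates detection in bibliographic records</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">2</controlfield>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">Near duplicate detection in bibliographic records.</subfield>
    </datafield>
  </record>
  <record>
    <controlfield tag="001">3</controlfield>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">Introduction to quantum chromodynamics</subfield>
    </datafield>
  </record>
</collection>"""


def titles(xml_text):
    """Map each record identifier (field 001) to its title (field 245 $a)."""
    root = ET.fromstring(xml_text)
    result = {}
    for record in root.findall("marc:record", MARC_NS):
        rec_id = record.findtext(
            "marc:controlfield[@tag='001']", namespaces=MARC_NS
        )
        title = record.findtext(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']",
            default="",
            namespaces=MARC_NS,
        )
        result[rec_id] = title
    return result


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] based on difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_pairs(xml_text, threshold=0.8):
    """Return (id_a, id_b, score) for every record pair at or above threshold."""
    records = sorted(titles(xml_text).items())
    pairs = []
    for i, (id_a, title_a) in enumerate(records):
        for id_b, title_b in records[i + 1:]:
            score = similarity(title_a, title_b)
            if score >= threshold:
                pairs.append((id_a, id_b, round(score, 2)))
    return pairs


if __name__ == "__main__":
    for id_a, id_b, score in similar_pairs(SAMPLE):
        print(f"records {id_a} and {id_b} look similar (score {score})")
```

Comparing every pair of records scales quadratically with collection size, so this sketch only suits small batches. A real tool would typically compare several fields and limit the number of comparisons.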
Your name Jan Krause & Alain Borel
Your affiliation/institution University of Geneva & Ecole Polytechnique Fédérale de Lausanne
Your email jan.krause@unige.ch
