Description
The document converter service converts files from most office and some engineering applications to PDF, PDF/A or PostScript. The service has been completely rewritten as open-source software (OSS) [1] and builds on modern technologies fostered by the CERN IT department. It is implemented as a RESTful API and deployed in containers on OpenShift. It uses EOS storage for documents and jobs, a PostgreSQL database, Python 3 and Flask, with a Kibana dashboard based on the Elastic stack and documentation published with GitBook.
The project was designed around multiprocessing, which allows several jobs to be handled simultaneously and cuts the time to process documents by more than half compared to the previous incarnation of the service. Different converter software can be added; Neevia [2] is used at present. The design scales up easily thanks to the technology used: HAProxy and OpenShift for the web interface, and OpenStack VMs running Windows Server 2012 R2 as worker nodes in the backend.
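As a rough illustration of the multiprocessing idea described above, the following Python sketch hands a batch of jobs to a pool of worker processes. It is not taken from the doconverter code base: the names `Job`, `convert` and `convert_all` are hypothetical, and `convert` only computes an output path where a real worker would call the converter software.

```python
# ThreadPoolExecutor is imported only as a single-process alternative for debugging.
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Sequence, Type


@dataclass(frozen=True)
class Job:
    """Hypothetical description of one conversion request."""
    job_id: int
    source: str          # path of the uploaded document
    target_format: str   # "pdf", "pdfa" or "ps"


# Assumed mapping from target format to output file extension.
EXTENSIONS = {"pdf": ".pdf", "pdfa": ".pdf", "ps": ".ps"}


def convert(job: Job) -> str:
    """Placeholder for the real conversion step.

    A real worker would invoke the converter software here; this sketch
    only derives the path the converted file would have.
    """
    extension = EXTENSIONS[job.target_format]
    return str(PurePosixPath(job.source).with_suffix(extension))


def convert_all(
    jobs: Sequence[Job],
    max_workers: int = 4,
    executor_cls: Type[Executor] = ProcessPoolExecutor,
) -> List[str]:
    """Run all jobs concurrently and return output paths in input order."""
    with executor_cls(max_workers=max_workers) as pool:
        # map() distributes jobs over the workers and preserves ordering.
        return list(pool.map(convert, jobs))
```

Calling `convert_all([Job(1, "talks/slides.pptx", "pdf")])` from a script should be done under an `if __name__ == "__main__":` guard, since process pools re-import the main module on some platforms. Passing `executor_cls=ThreadPoolExecutor` runs the same code in a single process, which can be convenient when debugging.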
Currently, the document converter service is mainly used by services such as Indico [3] and EDMS [4] to automate the conversion of thousands of documents.
[1] https://github.com/CERNCDAIC/doconverter
[2] https://neevia.com/
[3] https://indico.cern.ch/
[4] https://edms.cern.ch/ui/