Second International Symposium on Open Search Technology
Web Meeting
12-14 October 2020
Web Meeting hosted by CERN, Geneva, Switzerland
Dear Experts, Supporters and Friends of Open Internet Search,
The 2nd international Open Search Symposium (OSSYM 2020) was held as a fully web-based, remote event, hosted by CERN. This event allowed us to dive even deeper into the societal, scientific and technical aspects of open and distributed internet search.
As you know, the pandemic has tremendously boosted the digital activities of individuals, society and the economy as a whole. This underlines how important it is to ensure that orientation in the digital sphere, as well as access to digital knowledge, information and services via the internet, remains neutral, open, democratic and privacy-respecting.
The development of a European open internet search community and infrastructure involves expertise from many different scientific and technical fields. It requires a profound understanding of internet search technologies, as well as new thinking about the services and innovative applications to be built on an open and distributed internet search infrastructure in and for Europe.
The Third International Symposium on Open Search Technology will take place at CERN from 11-13 October 2021. More information will be published in due time.
The OSSYM conference contributions are published in the "Open Search Foundation Zenodo Community" at https://zenodo.org/communities/opensearch
On behalf of the symposium/organizing committee,
Dr. Andreas Wagner, CERN
Prof. Dr. Christian Guetl, TU Graz
Prof. Dr. Michael Granitzer, Univ. Passau
Dr. Stefan Voigt, Open Search Foundation
-
-
Symposium Opening & Keynotes: Welcome & Keynotes
Convener: Session Chair: Stefan Voigt (Open Search Foundation)
-
1
Symposium Opening & Keynotes: Welcome & Keynotes
Speaker: Chair: Stefan Voigt (Open Search Foundation)
-
2
Welcome and Introduction
Speaker: Symposium Organizing Committee
-
3
Policy Keynote
Speaker: Pearse O'Donohue (Director, Future Networks, DG CONNECT, European Commission)
-
4
Keynote: "Towards a Free and Open Internet Search Infrastructure: How Law and the legal experts can help?"
Speaker: Olivia Tambou (Paris Dauphine University - PSL, France)
-
12:05
Lunch Break
-
5
Keynote: "Searching, fast and slow - a tech perspective"
Speaker: Arjen de Vries (Radboud University, The Netherlands)
-
6
Keynote: "Legal Open Standard Design for Legal Search Features"
Speaker: Monica Palmirani (University of Bologna, Italy)
-
14:20
Break & Individual Networking
-
Plenary Session: “Open Search Ecosystems”
Convener: (tba)
-
7
Plenary Session - “Open Search Ecosystems”
Speaker: Chair: Michael Granitzer (University of Passau, Germany)
-
8
Keynote: "Beyond Tech: Raising Awareness For The Open Search Foundation Through A Tailor-Made Communication Approach"
Speakers: Alexander Decker (Open Search Foundation; Technische Hochschule Ingolstadt, Germany), Ms Cornelia Hiemer
-
9
"The CERN-Solid collaboration project"
Author: Maria Dimou, CERN-Solid collaboration manager, with current data thanks to the CERN-Solid development experts.
The Web was invented at CERN by Sir Tim Berners-Lee, who defined it as a free, open and networked medium. Since Sir Tim went to MIT to create the World Wide Web Consortium (W3C), CERN has developed many important web-based applications. Still, as an organisation, CERN stayed away from the evolution of the Web in terms of its standards and philosophy.
Management's view was that the laboratory had to operate the LEP accelerator and prepare and build its successor, the LHC; in short, it had to do physics. Sir Tim was told the same when he was programming the DELPHI experiment RPCs, while the design of, and the need for, the web were already clear in his mind.
In the first years of the web, when Mosaic and Netscape were still at the bleeding edge of web technology, in-house web development was still taking place in what was then called the CERN Web Office.
Student Heidi Schuster developed pinaweb (Personal Intelligent Newspaper Agent), a programme written in Java that inferred each user's tastes from the web pages visited, created a profile per user and proposed the most recent web content on the topics of interest to each subscriber of the pinaweb service. At that time manipulation via the web didn't exist, so we found this a very clever and convenient application. Surveillance and intrusion were not yet terms we were conscious of.
Student Darius Kogut wrote Torch, a search engine that understood natural English rather than keywords linked only with AND and OR operators. Developing this application gave us intellectual satisfaction, as we felt we were getting to grips with another discipline: the understanding of rich human language by a search engine.
In the end, we did purchase the search engine Infoseek, later called Inktomi. They made us a good price offer, which we refused with the argument: "Your business would not exist, had the web not been invented at CERN". It worked. The price was symbolic.
The above projects were approved because our evaluation of Lycos, AltaVista and the like left a lot to be desired, and because companies had not yet started making money from offering, withholding or manipulating information on the web. Google didn't exist yet. The search results one got may well have been irrelevant or incomplete; still, they reflected what existed, not what the engine wanted to show users according to its own estimate of what is appropriate, relevant or desirable for them.
By 2001 all these creative activities ended. Commercial solutions were adopted for the web matters of the lab.
It is true, the CERN IT Department has to support computing applications and ensure smooth operation, and for this it is appreciated by the lab.
The years leading up to the LHC were critical because the network and storage needs were unprecedented, and we were not sure the technology would make the quantum leap before we actually needed it. Luckily it did. Today's CERN experimental data are massive; still, Google and Facebook have taken first place and store even more.
Innovative and pioneering work came to be expected mostly from the physics arena. Technology transfer, especially where usable in medical applications, had first priority and received the most attention.
Still, there were several possibilities to link our computing developments with W3C standardisation work.
One example is the Worldwide LHC Computing Grid (WLCG) project and operations, where the use of the HTTPS protocol for data transfer and remote access to storage relates naturally to the work done by the W3C Working Groups.
There are also design concepts in CERN applications that can contribute good ideas in areas like the Data Catalogue Vocabulary, cross-service interoperability and Authentication/Authorisation rules and restrictions.
Those CERN-W3C proposals remained unanswered.
Then, in 2019, Sir Tim Berners-Lee announced the Solid project. In his view, private political or financial interests had driven the web away from the principles he invented it for, namely universal, educational and free access to information. In Solid, standards will follow the principle of loose coupling between Identity, Data and Applications. This will give users control of their data.
At this point the time was ripe for the CERN-Solid collaboration, which was born this year.
The CERN development areas where interesting exchanges can happen, possibly leading to the selective adoption of ideas fed into or taken from the Solid specification effort, are:
• The CERN push notifications project, aiming at a set of official channels that users can subscribe to, in order to receive news. The notifications are unilateral and will be archived.
• Indico, an open source event management platform that has been in operation for 20 years.
• CS3MESH, a pan-European cross-institution mesh that will offer data sharing/co-editing facilities, relying on the federation of different sites by using well-known APIs.
• InvenioRDM, an open source Research Data Management platform for the persistent registration of research papers and data.
What these applications have in common is the user authentication that their workflows require for access to restricted data.
Ideas are discussed in a dedicated CERN-Solid gitter channel. For now the exchanges focus on Web Access Control (WAC), as implemented in CERN applications. The design of Solid Access Control Lists is attractive because it refers to users by URIs that can live anywhere on the web.
At the time of this abstract's submission (June 2020) nothing is fixed in terms of co-development because, in the CERN case, the data and the applications belong to CERN and not to the users. The ACLs do exist, but they give users no freedom to own their data or to decide where and how it will be stored, indexed and accessed.
Still, nothing is more useful in web development than collaboration, awareness of the possibly malicious incentives of service providers, and the technical, ethical and ideological preservation of the web's founding principles. The web's birthplace has a duty to contribute to this effort.
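To illustrate the idea of access control lists that refer to users by URIs, here is a minimal sketch in Python. It is not the Solid or CERN implementation: the class `Authorization`, the function `is_allowed` and all URIs are hypothetical names chosen for illustration, and the check is deliberately simplified (the real WAC specification also covers groups and inherited rules, which are omitted here).

```python
from dataclasses import dataclass

# Access mode names as used in the Web Access Control vocabulary.
READ, WRITE, APPEND, CONTROL = "Read", "Write", "Append", "Control"


@dataclass(frozen=True)
class Authorization:
    """One ACL rule (hypothetical structure, for illustration only)."""
    agents: frozenset   # user URIs (WebIDs); they may be hosted anywhere on the web
    access_to: str      # URI of the protected resource
    modes: frozenset    # subset of {READ, WRITE, APPEND, CONTROL}


def is_allowed(acl, agent, resource, mode):
    """Return True if some rule grants `mode` on `resource` to `agent`.

    Simplified: exact matching only, no groups and no inherited rules.
    """
    return any(
        rule.access_to == resource
        and agent in rule.agents
        and mode in rule.modes
        for rule in acl
    )


# Hypothetical user and resource URIs on three unrelated servers.
ALICE = "https://alice.example.org/profile/card#me"
BOB = "https://bob.example.net/profile/card#me"
NOTES = "https://pod.example.com/notes/meeting.ttl"

acl = [
    Authorization(frozenset({ALICE}), NOTES, frozenset({READ, WRITE, CONTROL})),
    Authorization(frozenset({BOB}), NOTES, frozenset({READ})),
]
```

With these rules, `is_allowed(acl, BOB, NOTES, READ)` is true while `is_allowed(acl, BOB, NOTES, WRITE)` is false, even though Bob's identity is hosted on a different server from the resource.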
References:
[0] The web original proposal by Sir Tim Berners-Lee https://www.w3.org/History/1989/proposal.html
[1] Solid announcement in the press https://www.nytimes.com/2019/11/24/opinion/world-wide-web.html
[2] The CERN-Solid Indico category https://indico.cern.ch/category/11962/
[3] The Solid project web site https://solidproject.org
[4] The CERN Web Office (most data missing today) https://weboffice.web.cern.ch/WebOffice/
[5] The CERN Torch search engine http://cern.ch/dimou/SApaper.html#torch
[6] CERN-W3C 2014 proposal https://cern.ch/dimou/personal/CERN-W3C_Collaboration.pdf
[7] CERN-W3C 2017 proposal https://cern.ch/dimou/personal/CERN-W3C_Collaboration_2017_proposal.pdf
[8] Push notifications proposal in 2003 https://cern.ch/dimou/it-us/zephyr.shtml
[9] Push notification proposal in 2020 https://codimd.web.cern.ch/p/ry5_j4r2U#/
[10] Linked Data Notifications https://www.w3.org/TR/ldn/
[11] The WebSocket Protocol https://tools.ietf.org/html/rfc6455
[12] Indico https://getindico.io/
[13] The Road to the new CERN Identification https://auth.docs.cern.ch/whitepapers/the-road-to-new-auth/
[14] CS3 MESH https://silo2.sciencedata.dk/sites/cs3mesh4eosc/
[15] InvenioRDM https://inveniosoftware.org/
Speaker: Maria Dimou (CERN)
-
10
"Ecosia - Fair competition in the market of web search engines"
Speaker: Wolfgang Oels (Ecosia)
- 11
-
12
Summary and Wrap-Up Day 1
Speaker: Stefan Voigt (Open Search Foundation)
-
-
-
Plenary Session: "The Open Web Index"
Convener: (tba)
-
13
Opening Day 2 and Results of Day 1
Speaker: Andreas Wagner (CERN)
-
14
Plenary Session: "The Open Web Index"
Speaker: Chair: Wolf-Tilo Balke (University of Braunschweig, Germany)
-
15
Keynote: "Towards an Open Web Index: Lessons From the Past"
Speaker: Michael Völske (Bauhaus-Universität Weimar, Germany)
-
16
"Geolocated Learning Environments and Capacity Building for tailored support in the context of an Open Web Index"
Speaker: Melanie Platz (Pedagogical University of Tyrol, Austria)
-
17
"Experiments using a Distributed Web Crawler to Process and Index Web Archives"
Speaker: Sebastian Nagel (Common Crawl)
-
18
"Open Search Use Cases for Improving Information Discovery and Information Retrieval In Large and Highly Connected Organizations"
Speaker: Igor Jakovljevic (CERN & Graz University of Technology, Austria)
-
19
"Discovery of Software Innovations using Repository Mining"
Speaker: Tobias Hecking (German Aerospace Center)
-
12:35
Lunch Break
-
Working Groups (parallel sessions)
Convener: Chair: Stefan Voigt (Open Search Foundation)
-
20
Introduction Working Groups
Speaker: Chair: Kai Erenli (University of Applied Sciences BFI Vienna, Austria)
-
21
Applications Working Group
Speaker: Christian Guetl (Graz University of Technology, Austria)
-
22
Awareness Working Group
Speaker: Alexander Decker (Open Search Foundation)
-
23
Economy Working Group
Speaker: Olivier Blanchard (Open Search Foundation)
-
24
Ethics Working Group
Speakers: Christine Plote (Open Search Foundation e.V.), Anton Frank (Leibniz Supercomputing Centre (LRZ), Germany)
-
25
Legal Working Group
Speaker: Christian Geminn
-
26
Tech Working Group: Tech/Tech Experiments
Speakers: Michael Granitzer (University of Passau, Germany), Dr Stefan Voigt (Open Search Foundation)
-
15:00
Coffee break
-
Summary Working Groups & Wrap-up of Day 2
Convener: Chair: Kai Erenli (University of Applied Sciences BFI Vienna, Austria)
-
Virtual "Cocktails" and Networking
-
-
-
Plenary Session - “Infrastructure, architecture and beyond”
Convener: Chair: Maria Dimou (CERN)
-
27
Opening Day 3 and Results of Day 2
Speaker: Christian Guetl (Graz University of Technology, Austria)
-
28
Plenary Session - “Infrastructure, architecture and beyond”
Speaker: Chair: Maria Dimou (CERN)
-
29
Keynote: "First thoughts on a data lake architecture for an open search infrastructure"
Speaker: Leon Martin (University of Bamberg, Germany) et al
-
30
Crypto-securities and the lex rei sitae rule
Speaker: Sara Sánchez Fernández (IE Law School, IE University, Spain)
-
31
Reducing Misinformation in Query Autocompletions
Speaker: Djoerd Hiemstra (Radboud University, The Netherlands)
-
32
Improving Open Web Index’s Transparency by Leveraging bloxberg’s Blockchain Technology
Speaker: Iosif Peterfi (Max Planck Digital Library, Germany)
-
33
Poster Pitches
Speaker: Chair: Maria Dimou (CERN)
-
Poster Session
- 34
-
35
Legal and ethical considerations on opacity of algorithms in Europe, focused on Boosting an open Europe search engine
Speaker: José Antonio Parrilla (University of the Basque Country, Spain)
-
36
MFAing the Quantum Crypto Proof Mobile Apps
Speaker: Miika Tuisku (CSC IT Center for Science, Finland)
-
37
Towards Graph Assisted Query Expansion and Graph Navigation Based Relevance Feedback in Open Search Settings
Speaker: Aleksandar Bobić (CERN & Graz University of Technology, Austria) et al
-
38
Web Archive Analytics: Infrastructure & Applications @ Webis
Speaker: Michael Völske (Bauhaus-Universität Weimar, Germany) et al
-
39
WEBVR - Making Educational Content a Public Experience
Speaker: Marc Bastian Rieger (University of Koblenz and Landau, Landau, Germany) et al
-
13:00
Lunch Break
-
Plenary Session - "Towards an European Open Search Infrastructure"
Convener: Chair: Stefan Voigt (Open Search Foundation)
-
40
Plenary Session - "Towards an European Open Search Infrastructure"
Speaker: Chair: Stefan Voigt (Open Search Foundation)
-
41
Interdisciplinary, interactive Session and next steps
-
42
Open Search Experiments
-
43
Open Search Working Groups (Tech, Legal, Ethics, Economy, Awareness, etc)
-
44
Any other Business
-
Symposium Wrap-up, way ahead and closing remarks, Outlook OSSYM 2021
Conveners: Andreas Wagner (CERN), Stefan Voigt (Open Search Foundation)
-