Speakers: Mitzi László, Sarven Capadisli, Maria Dimou, Pedro Ferreira, Lars Nielsen, Jakub (Kuba) Moscicki, Hugo González Labrador
Group management members: José Benito Gonzales, Thomas Baron, Eduardo Alvarez Fernandez (for Andreas Wagner), Tim Smith
Maria's post to the group about Solid of 2019/01/20 in mattermost here:
Sir Tim Berners Lee's most recent project Solid, in an extract from a newspaper article: It's effectively a new web; The key change is to do with data. On Sir Tim's original web, users' data was - and is - stored by the owner of the website or the app. On Solid, the choice of where you put your data is separate from your choice of service. Your data - from your selfies to the money you send - is hived off into a separate area, called a pod, which can be linked to, just like the pages on a website. That gives people genuine control over where and how their data is deployed. If it comes off, it would be a seismic change in the digital landscape. "Some people are calling it Web 3.0," Sir Tim says. And whereas previous attempts at what's known as re-decentralisation have foundered on public disinterest, this time Sir Tim feels the time is ripe.
Present: Mitzi, Maria, Sarven, Thomas, José, Pedro, Adrian, Eduardo, Lars, Hugo, Kuba and Tim (partially).
Apologies: Andreas, Michal.
Notes by Maria with comments by Sarven.
a. The Solid project core idea of loose-coupling of identity, identification, authentication, authorization, data and user interfaces (applications) is attractive for all Web applications.
b. The CERN use cases are currently not in the focus of Solid, but they are within its scope. The dokieli example shows similarity, even complementarity, of requirements. CERN developers felt that some time is needed before the Solid specifications are complete and before it is clear how developers should interact to hand over specifications and receive contributions. Sarven suggests taking individual notions under the Solid umbrella that CERN projects can adopt, and evolving the specifications based on the problems encountered and documented. The area of notifications is worth exploring further.
c. Maria, as CERN-Solid collaboration manager, will maintain the interface between Solid and CERN developers, as point b above becomes clear. This exchange/collaboration exercise will naturally need a period of discovery and adaptation. Once the content of another f2f meeting is defined, it will be scheduled, probably in Q3 2020.
Details on the presentations (slides linked from the agenda) and the discussions that followed. All the links to the applications explained here are on the agenda and listed on the CERN-Solid category index page.
Problems of the web today (Mitzi)
Feedback from the participants on problems of the web today:
Solutions proposed and their challenges
Mitzi asks whether Indico complies with the Solid specs. The answer is 'no'. Sarven says the Indico implementation is very close anyway.
Research Data Management platform (Lars)
Archive, disseminate, re-use of Research Data. A guarantee of data preservation is hard to achieve. Trying to facilitate uploads. CERN's Open Data is not 'open' from the beginning of data production for several reasons: the risk of misinterpretation, the need to publish first, and the high price of the experiments, hence the need to be the first to discover and write about the results. Now there is competition on all fronts, e.g. big publishers such as Elsevier are entering the field with cloud storage solutions, usage statistics and data analytics. Google Dataset Search compiles data and makes them available for citation counts, university ratings etc.
The CERN team is good at running the infrastructure. The actual data, e.g. index of all species, is the responsibility of the data owners. Data sharing culture is a challenge. Data replication is needed because governments and institutes want to have what they own at their premises.
Researchers find uploading to Zenodo much more functional than what publishers offer. Publishers withhold information until you pay. This is totally against the principle that scientific information should flow freely and be put to the test. Science can be preserved when it is made available for trial and verification.
Mitzi points out that European health-related data are not actually hosted in Europe! When data in which an enormous amount of public health money is invested are not available to their own countries, those countries are exposed to danger.
The issue of trust comes up in every technical and policy aspect of every project.
CS3 Mesh project (Kuba)
Metadata-awareness is very present in European policy-making bodies. EU funds were received for this project. Mitzi says that WeTransfer (a Dutch company) is now expanding in functionality beyond file transfer and does comply with standards. Maybe the project could invite them to the next event to find out what they are doing. They can't be part of the project because the partners are set and because WeTransfer is a company, not an Open Source solution. Still, as a proof of concept they can be considered. Similarly, Dropbox is used by many users because their organisation decided to adopt this solution. The CS3 Mesh project won't leave these users out. Future maintenance is also a concern.
Kuba asks what Solid specs and standards can do for the upcoming CS3Mesh protocols and APIs.
dokieli as a Solid application (Sarven)
A client-side editor for decentralised article publishing, annotations and social interactions.
Solid is domain-agnostic. Users are allowed to choose their own identifier. It supports different data types and content models, e.g. RDF (Resource Description Framework).
WebID is a dereferenceable HTTP URI denoting an agent (person, group, software).
OIDC (OpenID Connect) is an authentication mechanism. WebID and OIDC can be loosely coupled: WebID+OIDC. Ditto WebID+TLS. An ORCID iD can act as a WebID.
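As a rough illustration of what "dereferenceable" means here, the sketch below splits a WebID into the profile document an HTTP client would request and the fragment that names the agent inside that document. The function name split_webid is an assumption made for this illustration and is not part of any Solid library; only the Python standard library is used and no network request is made.

```python
from urllib.parse import urldefrag


def split_webid(webid: str) -> tuple[str, str]:
    """Return (profile document URL, fragment naming the agent)."""
    document_url, fragment = urldefrag(webid)
    return document_url, fragment


# Example WebID taken from these notes.
doc, agent = split_webid("https://csarven.ca/#i")
print(doc)    # the document an HTTP client would request
print(agent)  # the identifier of the agent within that document
```

The same split applies whichever authentication mechanism (OIDC, TLS) is paired with the WebID, which is one way to read the loose coupling described above.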
Discussion followed around the incentive to join the Solid community. Developers adhere because it is an open solution. All contribute towards the "for everyone" mission of Solid, regardless of their background. The community creates specifications and other material, based on consensus, through open discussion and participation.
Decoupling identification from storage and from apps can be attractive. We have to understand which parts of the approach are available to use.
Solid has a standard notification system that can be used by existing applications, e.g. Zenodo or Indico. For example, an RSVP could be sent to both applications, which are different, using the same standard. This would be a good way to demonstrate communication across systems.
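A minimal sketch of the RSVP idea, assuming the notification mechanism works along the lines of W3C Linked Data Notifications, where a sender POSTs a JSON-LD document to a receiver's inbox. The inbox URLs, the event URL and the function names are hypothetical, the ActivityStreams "Accept" type is only one plausible way to express an RSVP, and the actual HTTP call is injected so that the sketch runs without a network.

```python
import json
from typing import Callable

# Hypothetical inbox URLs; real ones would be advertised by each service.
INBOXES = [
    "https://indico.example/inbox/",
    "https://zenodo.example/inbox/",
]


def build_rsvp(actor: str, event: str) -> dict:
    """Build one notification body, here an ActivityStreams 2.0 'Accept'."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Accept",
        "actor": actor,
        "object": event,
    }


def send_to_all(body: dict, inboxes: list[str],
                post: Callable[[str, str, dict], None]) -> int:
    """Send the same serialized body to every inbox via the injected `post`."""
    payload = json.dumps(body)
    headers = {"Content-Type": "application/ld+json"}
    for inbox in inboxes:
        post(inbox, payload, headers)
    return len(inboxes)


if __name__ == "__main__":
    sent = []  # stands in for real HTTP requests
    rsvp = build_rsvp("https://csarven.ca/#i", "https://indico.example/event/1")
    send_to_all(rsvp, INBOXES, lambda url, data, headers: sent.append(url))
    print(sent)
```

The point of the sketch is that the body is built once and delivered unchanged to two different applications; only the inbox address differs.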
The user's WebID, e.g. https://csarven.ca/#i, refers to different preferred storage locations. When saving, the user is then prompted to choose among them if there is more than one.
If one annotates a part of a research article, the content of the annotation is stored in one's preferred storage and a notification is sent (optionally) to the article's site.
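The storage choice and the annotation flow described above can be sketched as follows. This is an illustration under stated assumptions, not dokieli's actual code: the function names, the storage and inbox URLs and the resource name are hypothetical, the annotation uses the W3C Web Annotation JSON-LD shape, and the requests are only listed rather than sent.

```python
from typing import Callable, Optional


def choose_storage(storages: list[str],
                   ask: Callable[[list[str]], str]) -> str:
    """Use the only storage directly; ask the user when there are several."""
    if not storages:
        raise ValueError("profile lists no storage location")
    return storages[0] if len(storages) == 1 else ask(storages)


def build_annotation(creator: str, target: str, text: str) -> dict:
    """Minimal W3C Web Annotation describing a comment on `target`."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "creator": creator,
        "body": {"type": "TextualBody", "value": text},
        "target": target,
    }


def plan_save(storage: str, name: str,
              article_inbox: Optional[str] = None) -> list[tuple[str, str]]:
    """List the (method, URL) requests a client would make, without sending them."""
    requests = [("PUT", storage + name)]
    if article_inbox is not None:  # the notification is optional
        requests.append(("POST", article_inbox))
    return requests


# Hypothetical values, for illustration only.
storages = ["https://pod-a.example/annotations/", "https://pod-b.example/notes/"]
chosen = choose_storage(storages, ask=lambda options: options[1])
note = build_annotation("https://csarven.ca/#i",
                        "https://article.example/paper#results",
                        "Is this result reproducible?")
print(note["type"], plan_save(chosen, "note-1", "https://article.example/inbox/"))
```

The annotation lives in the annotator's own storage; the article's site only receives a pointer to it, and only if the annotator opts in.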
If an annotation is removed then this can be a problem for others who made a reference to it. However, this can be mitigated by linking to their (archived) persistent resources.
The Solid process on GitHub, with its panels and contributors, advances via chats between the panelists. The editor of each panel is appointed by Sir Tim. There is a possibility to vote. This wasn't necessary so far because the panelists are few. For the moment the 6 editors have 4 different affiliations. Three of them come from inrupt.com so there is no interoperability problem. This US-based company is for-profit. Lars asks why, and who will justify the motives and the ethics. Sir Tim Berners-Lee co-founded inrupt and acts as its benevolent Director, to show that a company can be ethical. Still there are decisions which are opaque to its members.
There are different companies participating in the Solid project, e.g. Openlink, offering technical solutions similar to Solid, e.g. RDF. All these participants work together without competing over people's data or privacy.
José says that having pods in Zenodo that move around gigabytes of data is hard to do. Also, for WebIDs, the owners we have are institutions rather than individual users. On this point Sarven commented that this is perfectly normal. We can have multiple WebIDs and we can choose to link any of them together (whether any are pseudo-anonymous or not) or keep them off the "graph", i.e. be unlinkable. Quick summary here.
Lars talks about the Next Generation Repositories (NGR) framework. Statistics on views and downloads are a must for Zenodo because the users require them.
Tim S. says that we would like to find out more about how we could collaborate.
Sarven says that one of the Solid objectives is for everyone to have a personal online profile while still being able to share events with others.
Lars says that moving data around between entities (institutions) can be interesting but it is not clear that Solid makes it easier and faster. NGR takes care of interoperability OIA/OIE standards ... José says that Solid could join the Confederation of Open Access Repositories (COAR). Sarven comments that the Solid approach is complementary.
Hugo says the use cases we have are not covered by Solid. The examples shown by Sarven are good for end users but not service providers. The end users of our applications are not going to type URLs and run their own servers. Sarven comments that other Solid authentication/authorisation workflows exist, that require just a click.
What makes one Solid-compliant? Is the use of WebID enough? Sarven says that the possibility to read/write anonymously means the WebID should not be a requirement. He recalls that since, in Solid, components like identification, authentication, authorization, and storage are loosely coupled, services and applications can be configured or arranged in different ways.
Mitzi shows datatransferproject.dev (repo https://github.com/google/data-transfer-project) where the giant companies participate. Maria says the question is what they do with the data once they agree to mutually transfer it, and why an owner of an application like Zenodo would care about this.
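To make the point about anonymous access concrete, here is a toy model of an authorization check in which a WebID is optional. The AccessRule structure and is_allowed function are hypothetical simplifications made for this illustration; they are not Solid's access control vocabulary or any library's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AccessRule:
    """Hypothetical, simplified rule: who may use which modes on a resource."""
    modes: set[str]
    agents: set[str] = field(default_factory=set)  # WebIDs granted access
    public: bool = False                           # True: anyone, even anonymous


def is_allowed(rule: AccessRule, mode: str, webid: Optional[str] = None) -> bool:
    """Check a request; `webid` is None for an unauthenticated visitor."""
    if mode not in rule.modes:
        return False
    return rule.public or (webid is not None and webid in rule.agents)


# A publicly writable resource needs no WebID; a private one does.
open_rule = AccessRule(modes={"read", "write"}, public=True)
private_rule = AccessRule(modes={"read"}, agents={"https://csarven.ca/#i"})
print(is_allowed(open_rule, "write"))
print(is_allowed(private_rule, "read"))
print(is_allowed(private_rule, "read", "https://csarven.ca/#i"))
```

Because the check takes identity as an optional input, the same storage can serve anonymous and identified users, which is one way the loosely coupled components can be arranged.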
Lars says that he sees as today's outcome to 'get into the practice to be looking at the Solid specs to see what can be relevant to our use cases.'
Thomas: Nice concepts, e.g. decoupling AuthN/AuthZ from the app and from the data. We already have this concept, also for notifications. More Solid details will be very useful and necessary for feeding ideas to these on-going developments at CERN.
Mitzi asks for links to these existing on-going CERN initiatives. Maria will get relevant links from Thomas and forward. Here they are:
Pedro says there is a lot of potential in the idea of linking data with the semantic web. There is quite some way to go until the solutions are made available, performant and easy to use for the average end user. The meetulator app is very interesting.
Eduardo asks for a clear set of the Solid components.
Maria thanks everyone for the day they devoted to this brainstorming. Understanding how busy CERN developers are, she, as CERN-Solid collaboration manager, will make available the Solid process tools, development repositories and chat fora and propose a functional way to channel Solid-CERN information. CERN IT developers will pick up, adapt, adopt, complete and comment, when appropriate. This exchange/collaboration exercise will naturally need a period of discovery and adaptation.
Solid developer Michiel de Jong mentioned to Sarven, after the meeting, that the new CERN SSO becoming a WebID-OIDC provider, for use by Indico would be very interesting. The CS3MESH project would also profit from WebID-OIDC.
The email@example.com e-group was created for future exchanges, with today's CERN participants as members. Mitzi and Sarven are eligible to post to this e-group.