Yes, so, thank you very much. Before we start, I would like to introduce, to the people who couldn't join when we did our first and previous update in January, Jan Schill, who did a phenomenal amount of work in developing these proofs of concept for joining Solid concepts with the Indico application. I would like to thank his professor, Philippe Bonnet, who joins us today despite his busy schedule. I would like to thank our department head, Frederic Hemmer, who joins as well, thank you very much, and my supervisors Thomas and Tim, the web experts Eduardo and Andreas, the CERNBox experts, and the Solid expert Michiel. So I think this will be a very nice opportunity to get insights and advice from all of you. Thank you. I shall share my screen now and start with the presentation. I will not be able to see the chat, but Jan will shout if something is wrong with the slides or the sound. I hope you can see the slides, the blue slides. Thank you.
As I just said, Jan and I work together. This particular development that we explain today started after Christmas, but references to his previous work, between mid-September and December, will appear with links in one of our slides. So what we will speak about today is a very short reminder of the Solid philosophy and popular terms, for those of you who couldn't join the previous event, and the specifics of what this proof of concept is about, namely enhancing Indico events with comments, with authentication done via Indico and with the comments physically living in the Solid universe, and taking registration data for people joining Indico conferences from Solid pod data. I am mentioning pod data there; the definitions are coming up.
So, one slide about what Solid is. It stands for SOcial LInked Data. It was born in 2016; Tim Berners-Lee initiated the project. The idea was that people's data live where the owner of the data wants them to be stored, and the owner of the data sets access rights to chunks of those data for different viewers and exploiters.
It combines existing standards from the web consortium and it is really built on the existing web; you will see that every subsequent term corresponds to web pages. The Solid pod: 'pod' is an English word I didn't know in 2019, sorry, meaning a protective container, and in fact the Solid pod is exactly that. It's a place on the web where one can store one's own data. This data can be whatever: personal information, pictures, movies, documents, and they are stored as Linked Data; Jan will explain the syntax of Linked Data. You can see here a few examples of how a person who has a pod appears on the web with a WebID, namely the way to uniquely identify one's pod. So you have here some examples, for Jan, for me, for Tim Berners-Lee, but now Pedro, Thomas and Tim also have pods, and you are all welcome; we have the recommendations on how to do that in the links later on.
A Solid server is a web server where users' pods are stored and where there is an interface with the appropriate logic for you to decide: for this particular information, my pictures, I allow access to Jan Schill only; for my documents concerning my internal group, IT-CDA, I allow access to Thomas Baron and Tim Smith only; etcetera. We have a list, and we went a long way into the existing Solid server implementations on the 25th of January. We have left them in the appendix, just to demonstrate that, despite the fact that the standard is young, there are four, and increasingly five-ish, implementations already up and running, with their weaknesses but still very active.
So, to summarise the previous story about the ownership of data and deciding on access to it, this nice picture shows exactly what Solid would like to stop, to avoid, to rescue us from: that we have user logins and passwords for all imaginable applications, that they may share our data without asking us, and things like this.
We are not there yet, because none of these popular applications says 'please give me your Solid pod and I will not ask you any further', but it shall happen if we believe in it. So our project started in September, and it wouldn't have been completed in this happy way had Jan not done this phenomenal amount of work. At the beginning we studied and tried to understand everything about the specifications. Michiel de Jong, connected here today, helped us a lot in interpreting the documents and the chats, which are thriving on Gitter, but we still needed some help to understand them. Then we had to understand the implementations, namely what I just said about pod providers, Solid servers, etcetera, and then these two modules, which Jan will explain in detail, for enriching Indico based on the Solid principles. In his thesis, which is being completed tonight, he also makes recommendations, which we have discussed, about how to go from here, and we shall touch upon them later in this presentation. Everything about Solid specifications and implementations was done in the autumn and presented in January, and for those who couldn't be here with us, you have the slides and a detailed report linked from here that is really worth reading. For the actual proof of concept I stop now; I continue going on with the slides, but Jan is going to continue. Thank you.
Thank you. Yes, next slide please. So this is the first module that I have developed. This is a screenshot of the user interface of an Indico event, and this is the comments section. Before, Indico did not have any functionality to allow comments, and now, only in the proof of concept, there is this functionality. You can see in the top right an input field where you can put your WebID in and then press the log-in button. It will connect you to your external identity provider, and then you can authenticate with it and give this module access to your data pod.
This will then allow you to post comments, and all the comments that you can see here are not stored in Indico; they are all stored in their authors' data pods. Indico only holds a reference to these comments, because otherwise Indico wouldn't know what to load, but the actual comment is stored decentralised, away from Indico. If you go to the next slide, you can actually see the user interface of the data pod. You can see there's a folder, or container, and it contains another container, and then there are the actual files in ttl; that's the Linked Data format that is being used in the Solid ecosystem to store and describe data. Next slide.
Some details on the implementation of this particular module. It is developed completely client side, meaning that it runs in the browser: I browse to the Indico event page, then the module is loaded and initialised, and everything happens in the browser. So apart from the actual storage, and the extra endpoint that is needed to store the reference in Indico, everything is on the client side. It is a self-contained application, so in theory it could be re-used. I tried to build it as self-contained as possible, so that it doesn't have any Indico-specific design; it could maybe even be used in applications that run on the Solid pod directly, in a chat or something like this. As shown before, it stores one comment in one file on the data pod. This is not particularly efficient, because every file needs to be fetched with its own request. There are some performance improvements described in the thesis, and the comments could also all be stored in one file, but for the proof of concept we really wanted to make it as simple as possible; that's why it's designed in this way. The module also communicates directly with the data pods. That means all the clients that browse to the Indico event page will need to connect to the data pods directly, meaning that one client will probably make ten requests for ten comments and connect to ten different Solid servers. It also needs an authenticated Indico session. We decided to do this just to mitigate spam, so that people cannot just come along and post comments without really being associated with Indico or with CERN. And then, as I said, Indico holds the reference to the comment, so it's just the URL that is publicly available in Indico; the module then browses to it and loads it. Next slide please.
The next module is a little bit smaller. This is the conference registration user interface. For those that don't know, Indico can also be used to facilitate conferences, and if I want to participate in a conference I need to register. The module here is just the upper part, where there is another input field and a button. If I want to reuse some data that I have stored in my data pod, and I don't want to provide my name again and type it all in, I can just provide the URL pointing to the WebID profile document. The WebID is first of all my globally unique identifier on the web, but there is also a document that is fetched when I dereference the URL, and it holds a lot of information that I can provide, for example my name, my email address, my affiliation. So by providing the URL at the top and pressing auto-complete, the module will do a fetch, get the document, which is in Linked Data format, map it to these input fields, and populate them. If there are two possible values, for example I already provided my name but in my WebID profile there is a different name, then it will ask which one I want to keep. Yes, next slide.
Here is the management overview of an Indico conference. It just shows the users that have submitted a conference registration, that is, those who tested the conference registration module and actually also pressed register. Yes, and this is a very brief overview of how the mapping looks. In Solid everything is in Linked Data, which means it uses subject-predicate-object triples to describe the data. This allows interoperability: I have one server implementation and another one, and I can in theory migrate my data between those two, and applications can also cleverly make use of this data, kind of like Indico does now. The subject is me. The predicate uses a vocabulary that is public, for example schema.org, which some of you may know; in this case it's the vCard vocabulary that is being used, for the full name, not the first name, here in the predicate. The object is my name. Then, in the Indico form, because it doesn't use any indicators for Linked Data, I cannot do a direct mapping. I had to look at which fields or attributes are in the HTML code. One popular one is the name attribute: it is always provided on an input field, and often it also carries some kind of indication of what data is expected. So in this case I could just use 'first name' and map it to the vocabulary that I have; I built something of a dictionary and then did the mapping. This is not always possible, because Indico also allows dynamic fields. As a conference administrator I can create a registration form that is completely dynamic, completely made up. Gender, for example: this is not a field that Indico, I think, has predefined, so I can add a text field and say in the label that this is the gender, because for some reason I want to know it. My module then looks at the label in the HTML and kind of makes a guess at what kind of data it is. This is not ideal, because if the language of the event is not English anymore and switches to a different language, this mapping won't work. So I have different levels of mapping, and the last one is the label, but ideally we would want to have Linked Data straight in the HTML so that we can make better guesses. Yes, next: some details on the implementation.
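The mapping levels Jan describes (a dictionary keyed on the HTML name attribute first, a guess from the label last, and a prompt when the form and the pod disagree) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the module's actual code, which runs in the browser. The vCard predicate IRIs are real terms of the vCard vocabulary; the form field names, the label hints, the WebID and the flat triple list are assumptions made up for the illustration, and real profiles may nest values such as the email address.

```python
from dataclasses import dataclass

VCARD = "http://www.w3.org/2006/vcard/ns#"

# Hypothetical dictionary: known form-field names -> vCard predicates.
FIELD_PREDICATES = {
    "first_name": VCARD + "given-name",
    "last_name": VCARD + "family-name",
    "email": VCARD + "hasEmail",
    "affiliation": VCARD + "organization-name",
}

# Hypothetical last-resort hints keyed on the visible (English) label.
LABEL_HINTS = {
    "first name": "first_name",
    "family name": "last_name",
    "email address": "email",
    "affiliation": "affiliation",
}


@dataclass
class FormField:
    name: str        # value of the HTML name attribute
    label: str       # visible label text
    value: str = ""  # what the user has already typed, if anything


def resolve_key(field):
    """Level 1: the name attribute. Level 2 (fragile): the label text."""
    if field.name in FIELD_PREDICATES:
        return field.name
    return LABEL_HINTS.get(field.label.strip().lower())


def autofill(fields, triples, web_id):
    """Return (values to fill in, conflicts the user has to resolve)."""
    # Keep only triples whose subject is the user's WebID.
    profile = {p: o for s, p, o in triples if s == web_id}
    filled, conflicts = {}, {}
    for field in fields:
        key = resolve_key(field)
        if key is None:
            continue  # unknown field: leave it alone
        pod_value = profile.get(FIELD_PREDICATES[key])
        if pod_value is None:
            continue  # the profile has nothing for this field
        if field.value and field.value != pod_value:
            conflicts[field.name] = (field.value, pod_value)
        else:
            filled[field.name] = pod_value
    return filled, conflicts


web_id = "https://jan.example/profile/card#me"  # made-up WebID
triples = [
    (web_id, VCARD + "given-name", "Jan"),
    (web_id, VCARD + "organization-name", "CERN"),
]
fields = [
    FormField(name="first_name", label="First name", value="Johnny"),
    FormField(name="custom_17", label="Affiliation"),
    FormField(name="custom_18", label="Gender"),
]
filled, conflicts = autofill(fields, triples, web_id)
```

In this run the dynamic field `custom_17` is filled through its label, the already typed first name is reported as a conflict rather than overwritten, and the 'Gender' field is left untouched because nothing maps to it. A non-English label would fall through in the same way, which is the weakness of the label level that Jan points out.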
The design of the module is to retrieve, from the data pod, personal information, or any information that could be used for the registration for an Indico conference. The original idea, because it is all about storing data decentralised, was to take the data that the user provides in the registration form and put it on the data pod. This is much more interesting, I would say, but it was abandoned for this proof of concept for several reasons. One is payment details, which have not been introduced yet but are definitely possible in conference registrations: they are so sensitive that they really need reliable data retrieval, and this might not always be possible, because the users are in control of the data and can change it whenever they want. They could sign up with one set of information and then change it to another set, and this makes it much more complicated, but also very interesting. Indico also allows the archival of events, so for archival reasons at some point it needs to have all the data and save it. This might not be possible if a data pod is all of a sudden not available anymore, or if the person changed or removed some part of the data. There are also ideas on this that are all presented in the thesis, but for these reasons we abandoned the original idea and went to the other one. Another interesting part, which I only briefly mentioned for the comments, is that the decentrally stored data need to be fetched. If all these users use different data pods in different locations, all this information needs to be fetched, and that means that with a conference of two hundred people I would, in the worst case, and the worst case is always the one we need to think about, make two hundred requests if I want to manage the conference and just look at who signed up. So because of these reasons we abandoned the first idea but then came up with the other one. Thank you.
Thank you, thank you very much Jan. In fact we did discuss a lot and got advice from Adrian on Indico, from Philippe, his professor, on the implementation approach, and from Michiel with the Solid insight. Thanks a lot. The technical work of Jan has been phenomenal. So I would like to mention now what the situation is, a lucid view of Solid today. It is true that there are few applications that use Solid pods so far. The jungle of applications, and those applications that can abuse our data, à la Facebook and the rest, don't use Solid pods; they don't recognise them and don't interface to them. Unfortunately this is the situation today. Also, we had to make a recommendation, you can see it in the policy document, about where to get a pod now for actually doing the experimentation with the proof of concept, and those of us, like Jan and myself, who had to manipulate our pods a lot were very disappointed by the existing antiquated user interface, but we have entered several issues in GitHub for that to be improved. Throughout these months of attending the monthly webinars called Solid World, for which you will see all the links later on, we didn't see enough support for the open-source solutions; like everything else in open source, it's just passionate and enthusiastic developers. Because Solid is a standard, and because various needs come up as it gets more used, there are adaptations of the specifications, there are adaptations of the approach towards access control, and implementations start to deviate. This has an impact on the test suite, which on the 15th of December was in a perfect state, with a success rate of ninety-eight plus percent in the test suite results; then proprietary implementations started to take an approach that matches their customers' needs, and we have discussions on Gitter concerning these issues. So all of these are, truly, honestly and exhaustively, the imperfections of Solid.
Nevertheless, despite these challenges, we see that there are governments that embrace Solid and sign official agreements for hosting in Solid pods all the public data of their administrations, tax offices in Belgium or the National Health Service in the UK, etcetera. This will be an incentive for the implementations to become more efficient and of better quality, and for the interfaces to become more modern. Every month in Solid World there are at least four companies, startups, and we are fighting there to get a slot to speak, which come in to present with slogans like 'Solid is the future, our company leads the way', and it would be a pain and a disappointment if CERN, the birthplace of the web, were not to embark on this adventure, this noble adventure, early. In the Gitter chat, as I said, not everybody is active, but there are thousands of members. We have not attributed huge resources, in fact no resources; Jan has done this free of charge, gratis, for his own thesis. But it is strategically and ideologically important for CERN to be engaged with Solid.
This is our opinion, and therefore, as a conclusion, for all these reasons, we would like to say that for the moment we have our pods on an NSS-flavour server; this is open source, it is the original implementation from MIT. We have a very promising server implementation coming up, the Community Solid Server, embraced and sponsored by Tim Berners-Lee and his company Inrupt. It comes from a university in Flanders and it is open source. We could integrate it with CERN SSO, which would need development. We could use all the fantastic expertise we have at CERN in the Web Frameworks to write an enviable UI, because it comes without storage, without identity provider and without UI; you can have your own, or you can pick one of the two available ones in the open-source domain. Or we could investigate the usage of CERNBox as a Solid server, through work that has been done between Solid and Nextcloud, whose chief developer is Michiel de Jong, a Solid expert who is with us today. To understand the alternatives and continue debating, because this is not conclusive, it is just recommendations on how to go forward, you can look at our policy document, which is linked from here.
After concluding on this, I just wanted to show you everything about the available Solid servers. NSS is what we use. CSS is what we recommend for the future. ESS is probably going to be very professional, but we don't recommend it because it's closed source, because it has US-based storage, etcetera; we don't want this, we think. And there are others that are very promising: PHP Solid Server will be integrated with SolidOS, on which Tim Berners-Lee actively does development, but this is to be followed up; it's not in a shape that we can use as of this summer.
So, having said that, all the names that appear here I have already mentioned. Thanks very much to Jan for this brilliant development, to Adrian for advice, to Pedro for support and original suggestions, to Tim Berners-Lee, who was always available to give insight and advice, to Michiel, who explained the mysteries of Solid to us, to Ruben, who is the chief development project leader for the CSS, the one we want to migrate to, and to my leaders Thomas and Tim for approving this work. You have all the references here; please look them up, they are brief and full of content. Thank you very much. Maybe I stop sharing so that I can see you.
Thank you. There are a few questions in the chat that I would like to address.
Yes, please Jan, go ahead, I haven't...
The first one: how do we identify users, trusting an external identity provider? That is two questions in one. For answering the first one: as I mentioned briefly, Solid decouples authentication, data and application, which means that in theory, and as we also have in our recommendation, CERN could use its own authentication and authorisation service to implement a Solid solution at CERN and use the existing authentication service that it already has, so there would be no need to even trust any external identity providers. But as of now, the Node Solid Server (NSS), for example, implements an identity provider through the usage of Solid OpenID Connect, which is just a flavour of OpenID Connect.
Yes, so in the second case the registration form data is stored in Indico, right, and Indico doesn't keep a reference to the profile card? Sorry, you explained that in the second slide.
Yes, yes, no problem. But this was definitely the original idea, that we wanted to store the registration data in the data pod. Thank you. Then, 'a lot of browser extensions...', and I think the question in the end is how to protect against fake pods tracking user activity.
This is a really good question and it is also addressed in the thesis. One solution could be that a proxy is developed in Indico, so that, instead of all the requests being made in the browser, on the client, the requests would happen in Indico, on the server. A server implementation would be needed, and then the Indico instance makes all the requests to all the different pods. That would also mean a performance improvement, because we could cache all the responses that we get and use cache warming, both to keep the cache up to date and to make these requests before clients even ask for the data; then, with one request to Indico, the data could be provided. In that sense Indico would shield its users and clients from fake pods and thus protect them. But yes, with the current solution it is definitely possible to track the IP addresses of the clients connecting to the data pods.
And then: 'I'm struggling to think of a use case where the cost of fetching multiple users' data is low enough for Solid pods to be worth it. Any examples?' I would say all the examples that do not really require a lot of data from other people, but only my own data. A very trivial example is a to-do application: I could just store my data in my pod, and then there would maybe be only one request to fetch the resource, and with one request I have all my data and can do all my application work, very performant and very fast. And then, as mentioned, the proxy could be a performance improvement to mitigate all the requests that have to be made.
Another question was just put up on similar lines: can you always trust the data you get from the pod? You would probably need sanitisation mechanisms, which most servers don't have for data that is usually stored on the server side.
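Going back to the proxy idea in Jan's answer above: the combination of server-side fetching, caching and cache warming could look roughly like the following sketch. It is a minimal illustration in Python, not code from the thesis or from Indico. The `PodCache` class, the injected `fetch` callable, the time-to-live value and the example URLs are all assumptions, and a stand-in fetcher replaces real network access so that the sketch runs on its own.

```python
import time


class PodCache:
    """Server-side cache in front of many pods (hypothetical helper)."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> document text
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # url -> (expiry time, document text)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]      # fresh: no request leaves the server
        body = self._fetch(url)  # only the server talks to the pod
        self._entries[url] = (now + self._ttl, body)
        return body

    def warm(self, urls):
        """Cache warming: refresh entries before any client asks."""
        now = self._clock()
        for url in urls:
            self._entries[url] = (now + self._ttl, self._fetch(url))

    def get_many(self, urls):
        """What a single client request to the server would return."""
        return {url: self.get(url) for url in urls}


# Stand-in fetcher so the sketch runs without a network.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "comment stored at " + url


urls = ["https://pod%d.example/comments/1.ttl" % i for i in range(10)]
cache = PodCache(fake_fetch, ttl_seconds=60.0)
cache.warm(urls)               # ten pod requests, made by the server
first = cache.get_many(urls)   # served entirely from the cache
second = cache.get_many(urls)
```

With this arrangement the ten pods only ever see the server's address, and a client gets all ten comments with one request. The trade-off is that the server now sees and temporarily holds the comment data, which the purely client-side module avoids.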
Yes, also a very good question, also addressed in the thesis. When, for example, in the comments module an adversary writes a cross-site scripting attack, which is just JavaScript code that would be executed by the browser if rendered, sanitisation is definitely needed, and it is also implemented in the comments module. But this is definitely something that needs to be thought about very carefully when developing new Solid applications: the data cannot always be trusted when it is not at hand anymore. Very good questions.
Thank you very much, Jan.
Of course.
Concerning Hannah's question, if I may add something, with a question mark. When Hannah wrote 'what would be the actual use case that would be worth the effort': how about all of the CERN community, today tens of thousands of users, getting a pod in any of the proposed architectures, to be debated further, and all of those documents which are not to be put, for example, in CERNBox, which is the official dropbox for CERN, being stored in people's pods? That would be something like CERN offering its users a personal website, in Solid terms, with access permissions. For example, today my website, cern.ch/maria, lives in AFS, and I had to register a web server for that. Maybe using pods this whole process would be easier, provided of course that pod management becomes easier, which is not the case today, from the experience we had with the existing UI. But I mean the idea: does it make sense?
Thanks, that's much clearer. I think I was a bit thrown because the entire concept of Solid seems to be about removing the need for things like Facebook to store data, and those kinds of platforms are exactly the platforms where there are thousands of users, which would necessitate thousands of fetch requests to all the different pods. So I don't see how it works in the Facebook kind of example, but I completely get that for CERNBox, and also the NHS for medical records, that kind of thing, it does make a lot of sense.
Yes, thank you for clarifying, because for example in Jan's implementation of the Indico comments, the pod owner had to allow Indico to access the pod, and the advantage is that indeed you can put all kinds of things on your pod and you decide which other application is allowed to use it. I see Bob also joined, that's very nice.
Adrian also had a good point, and we also looked at this for the case where we wanted to make the storage of the conference data possible on the data pod: if we rely on data not changing, we take versions and then sign them in Indico. So we would hash the data with some kind of signature and then compare the data coming in against it. In that sense you could be versioning your data and making sure that the data is not updated once there can't be any updates anymore.
Right. So, concerning the even further future implementation: the policy document that I mentioned earlier talks about how to get a CERN in-house Solid server, with the various options, and we have to take this offline, that's for sure. But just now, all of you experts of Web Frameworks, CERNBox, authentication and authorisation services, would you continue advising on what the optimal solution is, one which is secure, least disruptive for the existing operational services, and functional, to continue from now?
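The versioning idea that Jan attributes to Adrian a little earlier (hash the pod data when the registration is submitted, keep the result in Indico, compare whatever is fetched later) can be sketched as follows. This is a minimal illustration, assuming a keyed hash (HMAC-SHA256 from the Python standard library) as the 'kind of signature'; the transcript does not say which scheme was considered. The key, the helper names and the sample record are made up.

```python
import hashlib
import hmac
import json

SERVER_KEY = b"replace-with-a-real-secret"  # placeholder, not a real key


def canonical(data):
    """Serialise deterministically so equal data gives equal bytes."""
    text = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def sign_snapshot(data):
    """Keyed hash the server stores when the registration is submitted."""
    return hmac.new(SERVER_KEY, canonical(data), hashlib.sha256).hexdigest()


def is_unchanged(data, stored_signature):
    """Compare data fetched later from the pod with the stored signature."""
    return hmac.compare_digest(sign_snapshot(data), stored_signature)


registered = {"name": "Jan Schill", "affiliation": "CERN"}
signature = sign_snapshot(registered)

# Same content in a different key order still matches.
same = is_unchanged({"affiliation": "CERN", "name": "Jan Schill"}, signature)
# Any edit made in the pod after registration is detected.
edited = is_unchanged({"name": "Jan Schill", "affiliation": "Elsewhere"}, signature)
```

A check like this only detects that the pod data changed after the snapshot; it does not recover the old values, so it would not by itself solve the archival problem Jan mentions.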
It would be very nice, now that you have an idea of the technicalities, that we discuss offline, in other brainstormings, how to go from now. If the management would like to make a closing note on whether the proof of concept's successful completion is a wrap-up, or an assurance that it is worth remaining active in the Solid ecosystem, it would also be nice.
Thanks for giving this opportunity, Maria. I would like to thank you very much, as well as Jan, for all the work you did: Maria for the coordination and all the discussions you set up among the Solid stakeholders and internally at CERN, and Jan for the development. So I think we really reached the goal, or definitely one of the goals, that we set ourselves when starting the project, which was to understand better the Solid technology and what it could bring to our services, so thanks a lot for that. As you mentioned in the presentation, it also raised some questions; you mentioned a few of them and there were questions about them. One of the biggest hurdles when I tried the system was definitely managing the pod inside the server, the solidcommunity.net server. So I was wondering what the prospects are for getting a better user experience when managing one's pod with the ongoing developments. Do you have timelines? Do you have more visibility on that?
Who would like to answer? I can say something briefly, if you want, and then Jan will complete or correct me.
For the Solid server we recommended for the PoC duration, the stable one, Node Solid Server, namely solidcommunity.net, we have very limited hope that it will get a radically modern user interface. There are developers who wholeheartedly try to make fixes, but at a slow pace, and the UI is so antiquated that it is not going to get better; it needs a radical redesign. As for the Solid server we recommend for the future, the one we want to encourage CERN to use, the Community Solid Server, we know nothing; it will probably most naturally inherit a UI like the one we have experience with, an experience which is unsatisfactory. This is why the participation of you all here, Eduardo, Andreas, people with experience of the web services and Web Frameworks, is very valuable, because we have done React UIs very fast and very well, and we could make one of our own. So it is to be discussed between us. This is my input. I don't know about Jan, or Michiel, also a Solid expert; maybe you, Michiel, have seen other UIs that we haven't used and that are enviably modern...
Well, yes, there are a few UIs to browse the data on the pod, but the main, important UIs are the apps that you would use. So if you use Media Kraken to keep track of which movies you like, then it will store data on your pod and you never actually see the pod UI.
Right, yes. So Media Kraken is one of the applications that I mentioned, which present themselves in the monthly Solid World. On Gitter there is a lot of activity around it; it's very popular, it has a good interface and it uses data living on a pod. But all of this requires that, before we just jump from one alternative to the other, we do yet another evaluation. We have done a first proof of concept to understand exactly the Solid internals, the Solid status, the Solid technology and ecosystem, and to implement something. We know this is possible; this worked, thanks to Jan and the Indico experts.
Now we have to iterate on something else: evaluating, for example, the solutions for the UIs, evaluating the proposals in the policy document for a CERN-based in-house Solid server, and which flavour. All of this is work to be done. Nevertheless, we have to decide that we agree on the strategy to stay with Solid and do that, and we shall see. I'm very happy to lay down the details, step by step, of what is required to be done, when and how.
Thank you very much Maria, Michiel and Jan for the answers, and again for the work you did. Yes, definitely, I think a listing proposing some use cases which would be very useful at CERN would offer support for the continuation of the project.
Thank you very much. Maria, you asked for some concluding words from me as well. So, I'd like to thank you both for the contributions you've made to this evaluation. I do think it's extremely important that we were involved. As you know, we had very early discussions, as Solid was being formed, about the need for it. We were active in the communities that discuss data sovereignty in general and how to address the need for more control over how data is used. So because we are active in all of these, and because Solid seems to be well supported as a possible solution to many of the problems in one go, it was important for us to be there at the beginning, to actually help steer it and to give feedback early on about how it might integrate with the open-source solutions that we are developing. I think that has already been achieved: we have an open dialogue with Tim and the other developers. So I think what we wanted we have achieved, and we've seen the limitations. We also had the same questions, like the one Hannah put up, about how this could possibly work, and then we sort of worked with it to understand a little bit how it might work, though the implementations aren't necessarily all there yet. I think Jan in his thesis describes many things that could be done a heck of a lot better. So I think we should stay connected with it, and I would like to propose that we start to invest a little bit more, perhaps in some of our applications, to make sure that we stay aligned with it, and possibly put up one of these servers, like I said, on top of our own cloud implementation. But this is a proposal; we have to do that in collaboration with the new management, in the new work plan, so that's what I will be bringing up as soon as I can.
Thank you, thank you very much everybody, that's very valuable input. Other comments?
Maria, if I may, just from the sideline and from Copenhagen: a big thank you to CERN in general and to you in particular. I think it's a unique opportunity for Jan, for a student from my team, to work with CERN and to work with you on this type of project. Jan did great work; he still has a thesis to defend, but we know there is already a lot of good work. We would like to stay in touch. My background is in database systems, and there is a set of issues that came up in Jan's work about schema matching, about the use of, you used the word 'guessing', but some machine learning techniques that could be used to match the features from the different pods. There is a set of issues which are secondary, which are not on the critical path now but which might be in the future, and as a university these are some of the things we can look at.
Fantastic! I look very much forward to further collaboration, Philippe. Thank you, that will be very nice. OK, so with this, I thank everybody; I am really grateful to you all for staying that long. Definitely everything is possible offline. The Gitter channel is in the references; join and comment there. It will be very nice to keep this alive and not have only Jan and me communicating on this. Thank you very much.
Thank you. Thank you.