Q&A
============
BC=B.CHRISTENSEN-DALSGAARD
HS=H.STEWART
JS=J.SANCHEZ
WJ=W.JANSEN
============
DONATELLA: 2nd year overview

-graphs on M20-24 confusing, no "clear message"; also confusing with respect to the integration activity-
-reviewers' docs should have been distributed to all partners so they could prepare for questions-

JS: sl.6, can you clarify the difference between the services here and in the next slides?
r: in this slide I refer to the web services, i.e. the services DILIGENT implements
JS-BC: clarification on what is represented in the charts about the status
r: 48 services in total, 46 of which are currently deployed in our infrastructure; we will see this in detail later on
JS: sl.5, I understand you have done a sort of risk analysis. Is none of those presented in your slide still a risk for the 3rd year of the project?
r: No
HS: can we consider it a statement from you that gLite is stable enough?
r: it is still a changing technology, but we have done a number of experiments and we are currently exploiting a number of gLite services for the implementation of our services. This is currently working, and you will be able to see the details of some examples in the posters
HS: gLite 3 or previous?
r: gLite 3.0.2
HS: I would like to see more detail about the infrastructure (servers, machine locations, etc.)
r: will come
BS: feedback to technology domains: what kind of feedback? These activities are very different in nature.
r: e.g. DELOS, experience on large federated architectures
BS: CASPAR?
r: this is a project that could potentially exploit DIL technology for knowledge preservation
BS: I would have expected to see BRICKS at the top of your list
r: I mentioned some examples, those where there are closer collaborations. We have exchanges with BRICKS, but not as close a collaboration as with those mentioned
JS: I would like to see deviations and underspending in the management presentation later on, as well as the situation with partners closing and leaving the project.
Exploitation: how will you handle the sw that will be produced by the project? I want to see how the academic community will benefit from the results of the project.
JS: why did the DIL-EGEE MoU come so late?
r: the process started one year ago; the real collaboration actually started at the beginning of the project lifetime. The document just reflected the situation and the status of the requirements at the time the discussion took place, and was then submitted to the PEM/PMB for revision and approval. It represents a formalisation of what was already in place.
============
LINO: technical overview 2nd year

-gLite exploitation: misleading as to what is done and what is planned-

BC: sl.5, clarification on the status of build, integration testing, functional testing. It is not clear which components are really working
r: the DILIGENT infrastructure is available, and the services are exploited by the user scenarios and exploit the EGEE PPS. Nevertheless, not all of them have "passed through" our release procedure; currently only 85% of them are officially integrated.
real-time demo on infrastructure
BC: I am surprised you mentioned the number of pages you have produced, but what about the testing activity? What is your bug rate? How many lines of code have been developed?
r: explanation about Savane and ETICS; we are waiting for ETICS to provide us with metrics
demo on ETICS interface
BC: this is not quality assurance. How do you ensure quality and how do you manage progress?
r: ETICS demo on build/test. All these components must build and deploy automatically and remotely.
Therefore we have to test all the dependencies in order to deploy remotely.
will continue over lunch
JS: deadline?
r: November
tool for functional and sys testing will be delivered by end of December the alpha prototype will be officially delivered
100% build should be achieved in one or two more integration builds
JS: the worst period for these projects is the integration period; I'm worried you will accumulate more delay
r: that is why we decided to use Savane and ETICS
JS: we are worried that the integration process will add further delays; this is a difficult phase for this kind of project
r: we are confident from the functional point of view because we are already using the components through the portals
JS: future plans?
r: explanation about beta (June 2007) and final release (October 2007)
JS: you mentioned PPS: how much storage, how many other nodes run gLite (apart from the PPS sites)?
r: details about DILIGENT development and testing gLite infrastructures
JS: how many bugs did you report to EGEE?
r: face-to-face meetings + Savane (around 50)
JS: is it planned to integrate into EGEE production?
r: not yet, we are testing procedures in PPS for the time being, to be "ready" to join production later on
================
ARTE

-good to have users demo before infrastructure presentation-
-again misleading as to what we already implement on grid/gLite, e.g. image similarity-

JS: is the portal demonstrated publicly available?
r: yes, url provided
BC: computational power is a plus, but then we come to txt. How do you do txt search and proper ranking in a distributed environment? How do you see search/discovery in a grid environment?
r: data is distributed
BC: federated search is what Google etc.
has; I am talking about integrated search. To rank your result set properly you have to normalise a number of things, because the information you search/rank is distributed
r: if you collect all this information from your sources off-line, then when the results come from the various sources you should know how to homogenise the results they provide (we are not writing the algorithm ourselves)
BC: I was trying to understand how the grid can help
r: we avoid collecting all the data; we get the indexes and build a collection description, and then we do data fusion with a standard algorithm. We do not necessarily provide better results
FAST: we expose global statistics of the indexes
BC: size of the collection you use?
r: 3000 txt, 100s of videos. The number of recs is not relevant for us; the archives we provided are very relevant
r: we are ready to manage millions of different objects. From FAO we are harvesting 3M different metadata objects; they also have real content, like 20,000 documents, that we would like to exploit as part of the IMPECT scenario
BC: I guess I can ask how the system scales based on the results of your tests, but the scalability I'm talking about is: how do you manage 100s or 1000s of search results for the user?
r: one mechanism we exploit is personalisation; this helps
BC: journals do not allow extraction of either the metadata or the txt; if they have a node in the grid, could you go out and combine their results?
r: we are proving that we can run searches on the grid; the quality of the results we can produce remains to be evaluated
WJ: using the portal I had problems finding proper keywords to get some results back
new demo on the ARTE portal
BC: I consider peer-to-peer good in the location process, not in the discovery process. Do you have an idea where to use peer-to-peer and where grid?
r: we do not have an answer; we adopt peer-to-peer, grid and web services technology.
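
Illustration of the data-fusion step mentioned in the ARTE discussion above (merging ranked results coming from distributed indexes). The notes do not say which "standard algorithm" is used, so this is only a minimal sketch under assumptions: min-max score normalisation per source followed by CombSUM merging is swapped in as one common choice, and it does not model the collection descriptions or global index statistics mentioned in the discussion. All function names, index names and scores are hypothetical.

```python
from collections import defaultdict


def min_max_normalise(results):
    """Rescale one source's scores to [0, 1] so sources become comparable."""
    if not results:
        return {}
    scores = [score for _, score in results]
    low, high = min(scores), max(scores)
    if high == low:
        # All scores equal: treat every hit as equally (fully) relevant.
        return {doc_id: 1.0 for doc_id, _ in results}
    return {doc_id: (score - low) / (high - low) for doc_id, score in results}


def fuse(results_by_source):
    """CombSUM: sum each document's normalised scores across all sources."""
    fused = defaultdict(float)
    for results in results_by_source.values():
        for doc_id, score in min_max_normalise(results).items():
            fused[doc_id] += score
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from two distributed indexes (different score scales).
results_by_source = {
    "index_a": [("doc1", 12.0), ("doc2", 8.0), ("doc3", 4.0)],
    "index_b": [("doc2", 0.9), ("doc4", 0.5), ("doc1", 0.1)],
}
ranking = fuse(results_by_source)
```

The point of the normalisation is the one raised by BC: raw scores from different indexes are not directly comparable, so they have to be brought onto a common scale before a single ranking can be produced.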
demo on search on various collections
==================
IMPECT

JS: ESA Grid on Demand: are you still using the previous version of gLite?
r: Grid on Demand is based on LCG2.2, but we are ready to switch to gLite as soon as the new version is stable
JS: at the end of the project we need to see the same grid infrastructure behind it
r: yes, this is the objective
BC: storage model: are you able to keep track of the updates to be replicated, back to the original document that introduced the information into the sys?
===================
INFRASTRUCTURE

r: initial graphs on lines of code developed at CNR, in reply to a specific question

-missing link between the scenario and the infrastructure management interface-

BC: clarification on the "production" usage of this interface
JS: how do you check the compatibility of the services with the DIL framework?
r: the service profile should declare dependencies on/incompatibilities with other services, and requirements on the node where it will be deployed. The service provider can register the sw in our infrastructure by providing the tarball of the sw plus the service profile; we verify that all dependencies are properly defined. After that the sw is in our package repository and is available in our infrastructure
BC: where do you publish the requirements for a service to be published in your infrastructure?
r: the service provider just registers a web service in our infrastructure, providing the URI and WSDL of the service, and this appears as an external running instance. Then this new service can be used together with the others to define new services
BC: so it is up to the administrator to decide which service to use, not to the user?
r: we have to distinguish between system processes and application processes: sys processes are predefined, new application processes are made available at the process interface. If you want to provide an external web service, the DL admin decides to include it in its own DL through the WSDL. If you want it to become a real part of DIL, then you have to go through the process for our services: define a profile and provide a portlet if appropriate
WJ: how do you implement more powerful users and normal users?
r: we have roles: DL admin can define new DLs, DL manager for one DL, users
GENERAL: what does the paper distributed represent?
r: it reflects the latest status of the services developed
===========
EGEE COLLABORATION
===========
EXPLOITATION

raise interest in DILIGENT from other sw companies such as FAST
BC: what is necessary in order to achieve some of the objectives you mentioned in your presentation?
r: exploitation of EGEE communities. Up to now we did not have a technology to show to possible industrial partners; at the end of the project we plan to have proof of this interest
BC: concrete plans?
r: FAO has many distributed archives and will participate in the IST event
Vincent Breton at CNR to build a use case on the study of malaria (heterogeneous info needed in this case: territory etc.)
Avian flu: need for computing power to extract biomed info from biomed raw data
ArcheoGrid: already an EGEE community, but with a concrete need for more complex functionality that DILIGENT could provide
need for an appropriate level of maturity to get to the market
BC: let's say you have a success, how do you go to the market?
r: DILIGENT cannot be a product; it can be sold as a service; it can be exploited as part of the European grid infrastructure
=============
PROJECT MANAGEMENT

WJ: reason for closing?
r: the institute manager dismissed his contract
proposal: 3-month extension + transfer from FhG + 3 new activities
BC: about the new activities: have you considered limiting the number of activities?
r: ARTE is motivated by the will to promote the usage of the system in the humanities, not only in the scientific field
===============