RGMA in LCG :-D

Would like to deploy it as part of the LCG service... network monitoring,
publication and aggregation of accounting info, and job monitoring.
- approach the deployment of R-GMA incrementally - individual sites choose.
- or do we go through the certification testbed?
- this will be getting approval soon.

ROAD MAP
end of Feb: set up R-GMA server in cert. testbed
- GOC accounting app will be the initial app
  - tested by 15/3 at CERN
  - so R-GMA only set up at a few places.
  - the stuff Jason is working on.
- job monitoring for LCG (new) (later on)
  - on EIS testbed... then deploy new functionality on LCG-2 sites
- net mon - set up in 2 steps.
  - doesn't need R-GMA, but set up basic architecture
  - extended monitoring, needs R-GMA (use same one as accounting and monitoring)

What shall we give them?
- no GIN/GOUT in plan, but would like this.
- which version shall we give them?
- Can we give them a subset that doesn't include GIN/GOUT?

LCG2 will be with R-GMA and no MDS!
- as we move into EGEE, the existing app testbed will morph into the LCG testbed,
  and will start up with LCG2 plus R-GMA.

Need to fix HEAD asap:
----------------------
- try out new pieces on existing CVS
- make sure HEAD is great for LCG
- getting on with new design in CVS.

GOUT does get problems... 10 restarts a day, and this needs to be fixed.
Remedy has been to restart GOUT.
Pursue deadlocks in existing code... don't believe it is caused by networks (machine is down).

===================================================
ARDA DOCUMENT discussion
===================================================
- to be approved by GAG body - "this is what we want in ARDA prototype".
  It contains API definition. Easy to add, but very hard to remove (need to mark as deprecated, etc).
- API defn is in C
  - user operations + admin operations + middleware API

Chosen C because "what users are familiar with"
(but actually physicists have now learnt C++ after ditching Fortran) - but ask GAG?
C is wrong answer for us... but if they agree we are stuffed!

Every call in C should return an error code!
(so it alters the input objects). Yuk!
- multithreading => problem that each error has the same error number.
- then call function with error number and get error string back.
  (but passing a pointer without saying the size of the buffer).

Peter K has suggested a different API for the middleware
- however this is probably best seen as an API for our use - e.g. write to the registry.

---------------------------
WSDL - still need APIs, to handle connectionId.
Could (in principle) write code to generate wrappers.
---------------------------
NOW - we need a best approximation to what we expect to want in 2 years' time.

==============================================
API Design - Steve H
==============================================
Use interfaces instead of classes.
- so interchangeable implementations
- specify factory impl. at run time (passing in a parameter as a string).

TimeInterval only has one long field...
- it should have more constructors to make it useful
- including specification of units

QueryType
- new Consumer("...", CONTINUOUS)
- where CONTINUOUS is a constant QueryType object.

Cache in API stubs (e.g. isTableInSchema())

Are table/cols case-sensitive? Need to address this.

Insert
- tuple object or tuple list.
- (need to avoid excessive method overloading, as hard in C)

Eliminate ProducerConnection and ResilientStreamProducer

Refactor message cat
- to get meaningful client error codes.
- split into client and server.
- should be able to retrieve error codes.
- use mnemonics rather than having to remember numbers for error codes

RGMAException: ditch support for type, source.
- chained exceptions
- subclasses of RGMAException
  - e.g. api, networking and servlet problems...

Hierarchy.
- ditch APIBase?
- Resource (with termInt) should be top level.

Do we need Registry & Schema - not in first instance. Browser could use a private api.
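A minimal Java sketch of three of the API design points above: interfaces with a factory chosen at run time by a string, a TimeInterval that takes units, and QueryType as named constants. Every name here (ConsumerFactory, StubConsumer, Factories, the "stub" key, ONE_TIME) is an assumption made for illustration and not the agreed R-GMA API. An enum is used as one way of getting constant QueryType objects.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical names throughout; this is not the R-GMA API.

/** Query types as named constants, so callers write QueryType.CONTINUOUS. */
enum QueryType { CONTINUOUS, ONE_TIME }

/** A time interval built from a value plus explicit units. */
final class TimeInterval {
    private final long millis;

    TimeInterval(long value, TimeUnit unit) {
        if (value < 0) {
            throw new IllegalArgumentException("interval must not be negative");
        }
        this.millis = unit.toMillis(value);
    }

    long toMillis() {
        return millis;
    }
}

/** The API is an interface, so implementations are interchangeable. */
interface Consumer {
    String query();
    QueryType type();
    TimeInterval timeout();
}

/** Creates consumers; one factory per implementation. */
interface ConsumerFactory {
    Consumer createConsumer(String query, QueryType type, TimeInterval timeout);
}

/** Trivial stand-in implementation that only records its arguments. */
final class StubConsumer implements Consumer {
    private final String query;
    private final QueryType type;
    private final TimeInterval timeout;

    StubConsumer(String query, QueryType type, TimeInterval timeout) {
        this.query = query;
        this.type = type;
        this.timeout = timeout;
    }

    public String query() { return query; }
    public QueryType type() { return type; }
    public TimeInterval timeout() { return timeout; }
}

/** Picks the factory implementation at run time from a string parameter. */
final class Factories {
    private Factories() {}

    static ConsumerFactory forName(String implName) {
        if ("stub".equals(implName)) {
            return StubConsumer::new;
        }
        throw new IllegalArgumentException("unknown factory implementation: " + implName);
    }
}

class ApiSketchDemo {
    public static void main(String[] args) {
        ConsumerFactory factory = Factories.forName("stub");
        Consumer c = factory.createConsumer(
                "SELECT * FROM someTable", QueryType.CONTINUOUS,
                new TimeInterval(5, TimeUnit.MINUTES));
        System.out.println(c.type() + " timeout ms = " + c.timeout().toMillis());
    }
}
```

Because the caller only sees Consumer and ConsumerFactory, swapping the implementation means changing the string passed to forName and nothing else.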
PublisherDescriptionList getPublishersForTable(tableName)
PublisherDescriptionList getPublishersForTable(tableName, queryType)

ResultSet: include RGMAWarnings, etc.

Schema:
  getSchemaTables()
  Vector getPrimaryKey(tableName)
  ResultSet getTableInfo(tableName) -> (columnName, columnType)
  Vector getColumnNames(tableName)
  Vector getColumnTypes(columnNames)
  createTable(desc)
  dropTable(tableName)
  getTableDesc(tableName)

Need to be able to add comments to tables and attributes and extract those comments.
Handle case sensitivity properly with SQL.

Consumer:
  ditch start()
  ditch get/set tuple checking
  if answer is empty set, should an empty ResultSet be returned? (then can add a warning!)
- ditch blockingPop?
- two different consumers?
- isExecuting() not useful for cont. consumers?
- start() difficult for one-time consumers.
- so always set a timeout for cont. and one-time queries?

Insertables (StreamProducer):
- setRetentionPeriod()? (i.e. no need for both memoryRP and DBRet.Period?)
- parameter should be set in constructor
- autoInsertTimestamp... is this needed - NO

Republisher:
  don't pass producer into constructor? use parameters instead?
  (then easier to set producer as a republisher in the registry).

ditch tupleChecking!

We will need the DataBaseProducer back for publishing from existing databases.

Insert should take a tuplelist which can hold tuples

================================================
Steve F's API thoughts
================================================
Support for multiple VOs?

Producers
- publish to one or many VOs as you want.
  - e.g. net mon publish to all.
  - e.g. ATLAS only publish to ATLAS VO.
- (in principle, users can belong to > 1 VO).
  - VOMS supports this, but apps don't at the moment.
- users will have to specify VOs in constructors
  - explicitly state which VOs they want to publish to.
  - but don't want to have VOs hardwired into code.
  - so need to make a call to proxy: "what VOs am I in".
- each VO gets a separate logical registry.
Registry
- should all support all?
  - probably not (then more scalable).
- should consumers query > 1 VO?
  - probably yes.
- VOs decide which registry to use.

problem: do we define constructor for C/P now?
         Or deprecate constructors that don't take a list later?
problem: mediation!

========================================
Redesign Ideas - Steve H
========================================
split APIs into
- User api     insert()
- Factory api  createConsumer()...
- System api   startStreaming()...

user WSDL
system WSDL
is subset of user and factory (public), and system (private)

WS-Resource Framework
==================================
- help us create stateful services.
- it isn't standard WSDL... useful to have a tool that notes the connectionId

Stateful Services
- resources
- connectionId (put into SOAP header)
  (could pass in as method parameter, but...)

WS-ResourceLifetime
- destroy(), initialTerminationtime(), currentTime(), setTerminationTime()
  (instead of setting an interval, you set current time and termination time).

WS-Addressing
- Endpoint Reference (URL, id)
- replace ServletConnection.

Streaming
=========
transport layer (TCP/IP sockets)
data protocol (what consumer you send data to)
data format (XML)
app. logic (push set onto stack).

split into modules... then can plug in different impls for different jobs
(e.g. get rid of XML and replace with object serialization)

Which thread should stream data for producer?
- HTTPProcessor sends off data, before returning...
  - it blocks your insert call though.
- one IO thread

How should consumerId be sent?
- one consumer per connection just now...
- or wrap tupleList with consumerId? and have one connection?

closing sockets down if no data?
- useful, e.g. for service table which publishes once an hour.

Refactoring
===========
- packaging
- sql parsing...
- database access (executeUpdate dotted everywhere).
- refactor tests...
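To make the Streaming split above concrete, here is a minimal Java sketch in which the data format and the transport are separate pluggable modules, and the tuple list is wrapped with the consumerId so several consumers could share one connection. All names (TupleFormat, XmlTupleFormat, DelimitedTupleFormat, Transport, RecordingTransport, StreamSender) and the wire layouts are assumptions for illustration only; tuples are plain strings and the transport is an in-memory stand-in for a socket.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names and wire layout; nothing here is the actual R-GMA protocol.

/** Data format layer: how a tuple list addressed to one consumer is encoded. */
interface TupleFormat {
    String encode(int consumerId, List<String> tuples);
}

/** XML encoding that wraps the tuple list with the consumerId. */
final class XmlTupleFormat implements TupleFormat {
    public String encode(int consumerId, List<String> tuples) {
        StringBuilder sb = new StringBuilder();
        sb.append("<tupleList consumerId=\"").append(consumerId).append("\">");
        for (String t : tuples) {
            sb.append("<tuple>").append(escape(t)).append("</tuple>");
        }
        return sb.append("</tupleList>").toString();
    }

    // Escapes the characters that would break element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

/** Compact non-XML alternative; no escaping, illustration only. */
final class DelimitedTupleFormat implements TupleFormat {
    public String encode(int consumerId, List<String> tuples) {
        return consumerId + "|" + String.join(";", tuples);
    }
}

/** Transport layer: moves an encoded payload, e.g. over a TCP socket. */
interface Transport {
    void send(String payload);
}

/** In-memory transport so the sketch runs without a network. */
final class RecordingTransport implements Transport {
    final List<String> sent = new ArrayList<>();

    public void send(String payload) {
        sent.add(payload);
    }
}

/** Application logic: combines whichever format and transport it is given. */
final class StreamSender {
    private final TupleFormat format;
    private final Transport transport;

    StreamSender(TupleFormat format, Transport transport) {
        this.format = format;
        this.transport = transport;
    }

    void stream(int consumerId, List<String> tuples) {
        transport.send(format.encode(consumerId, tuples));
    }
}
```

Replacing XML with another encoding then only means passing a different TupleFormat to StreamSender; the transport and the calling code stay the same.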
Producers
-----------
define basic behaviour in interfaces:
- TupleStorer,
- TupleStreamer

then plug in different impls (at backend):
- MemoryTupleStorer, DBTupleStorer, TCPTupleStreamer, JMSTupleStreamer

API is one interface. Minor behavioural differences created in factory methods:
- need factory methods...
- or template class? to avoid lots of factory methods.

A parameter list object (to avoid lots of arguments)...

CanonicalProducer
- publish on demand...

Schema Replication - Rob
==================
This talk took place in the coffee lounge. Jason suggested a distributed
master approach. When attempting to declare a table to the schema, if the
table does not exist then a request would be sent to the master schema for
that table name, with the master chosen by some function of the table name.
The only problem to be solved is then how to add and remove schema
instances. Rob will ponder this.

Authorisation - Linda
=============
Linda had produced a presentation on authorisation in R-GMA in general
rather than concentrating on the API aspects. She will prepare new
material for discussion on the 25th Feb 2004.
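Going back to the schema replication idea above, a minimal Java sketch of choosing the master schema instance from the table name. The notes only say "some function of the table name"; the hash-modulo function used here, the class name SchemaMasterSelector and the instance names are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the notes only say "some function of the table name".

/** Maps each table name to the schema instance that acts as its master. */
final class SchemaMasterSelector {
    private final List<String> instances;

    SchemaMasterSelector(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("need at least one schema instance");
        }
        this.instances = new ArrayList<>(instances);
    }

    /** Returns the instance acting as master for this table name. */
    String masterFor(String tableName) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(tableName.hashCode(), instances.size());
        return instances.get(index);
    }
}
```

With this particular function, adding or removing an instance changes the modulus, so many table names move to a different master. That is one way of seeing the add/remove problem the notes leave open.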