Attendees:
  WP1: Fab   WP2: Leanne   WP3: Antony   WP4: Piotr   WP5: Jens
  WP6: Cal   WP7: Gareth   WP8: Jeff     WP9: Annalisa, Julian
  WP10: Johan   WP12: Erwin   SCG: -   External: Lee, Sergio

Discussion of EGEE:
Erwin reported on the latest developments in EGEE. The EGEE executive
committee has asked the ATF to provide input for question 2 of the
reviewers:
- slides produced and sent to Bob for the EGEE executive committee.

Open bugs and issues:
- Jeff explained the file access issue. Jeff will send a script; Julian
  will send his conversation with Peter.
  => action on Fab, Jens, and Leanne to look into it.
- long-running job use case: Akos sent a new diagram - Fab will check it.
- Leanne still has to read D9.3.
- long-running job (Fab): after a network interruption of 5-7 minutes the
  job is considered to have failed and will be re-submitted. At this point
  the old job is killed by Condor-G - so it will be killed once the network
  comes up again. If the job does something during the network outage, that
  cannot be prevented.
  => action on Fab: discuss with the Condor/Globus people whether better
     error messages could be obtained from them.
  => action on Fab: add a paragraph to the user guide mentioning the
     problem in the section discussing re-submission.

Architecture review of components to be added to 2.0:
-----------------------------------------------------
WP1: (slides on agenda page)
  Add DAGMan and the job partitioner.
  - DAGMan can only deal with temporal dependencies (job start/end), not
    with data dependencies or general events.
  - The JDL is nested; each single job is separately specified inside the
    overall JDL. Currently the single jobs have to be given explicitly,
    not just as a reference to another JDL - that would be useful.
  - Nested DAGs are possible in principle, but need testing; not sure
    whether they will be supported.
  - Parameterized job descriptions (e.g. 'submit 50 jobs with the same
    executable and parameterized input data') are being studied; not sure
    whether they can be supported.
  Job partitioning:
  - requires jobs to be checkpointable.
  - the semantics of job partitioning are not completely clear; the job
    needs to be specifically prepared to use the feature.
  - only applicable to specific use cases - it is not a general
    parallelization tool.
  - jobs with side-effects may not be partitionable - it is up to the user
    to check.
  - the binding of input data to job steps is not completely clear; it is
    definitely restricted.
  A document on job checkpointing and partitioning is in the WP1 EDMS -
  please read it and discuss on the mailing list.

WP2: (no slides)
  Leanne explained that the only addition to 2.0 will be the full
  deployment of RLS - the replica manager will be modified to work with it.
  Applications should use the replica manager to find out where their data
  is, and not the RLI/LRC interfaces directly. If you use those you will
  have to follow the topology. The following things will not be deployed:
  file pre-fetch, collections, RSS, proxy services. All services will be
  fully integrated with VOMS.
  Deployment: currently 1 LRC per VO. In 2.x: 1 LRC per SE, which should
  handle all VOs; 1 Tomcat container and 1 MySQL instance - the VO LRCs can
  be separate services inside that. Leanne is currently working on such a
  setup. RLI deployment is not yet clear - it depends on the size of the
  testbed (1 per VO, 1 per country, 1 per task, ...). For each new RLI, the
  LRCs have to be configured to send updates to it. LRC updates are
  soft-state (using bloom filters) with a configurable time interval (a
  sketch of this update mechanism follows below).
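To make the soft-state mechanism concrete, here is a minimal sketch in
Python of an LRC periodically pushing a bloom-filter summary of its GUIDs
to the configured RLIs. All class names, method names, filter sizes and the
hash choice are illustrative assumptions; the actual EDG RLS interfaces are
not specified in these minutes.

    import hashlib
    import time


    class BloomFilter:
        """Fixed-size bit array with k hash positions derived from SHA-1."""

        def __init__(self, size_bits=8192, num_hashes=4):
            self.size = size_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(size_bits // 8)

        def _positions(self, item):
            for i in range(self.num_hashes):
                digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:4], "big") % self.size

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def __contains__(self, item):
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(item))


    class LocalReplicaCatalog:
        """Toy LRC: maps GUIDs to the physical file names held on one SE."""

        def __init__(self):
            self.replicas = {}                      # guid -> list of PFNs

        def register(self, guid, pfn):
            self.replicas.setdefault(guid, []).append(pfn)

        def summary(self):
            """Compress the set of known GUIDs into a bloom filter."""
            bf = BloomFilter()
            for guid in self.replicas:
                bf.add(guid)
            return bf


    class ReplicaLocationIndex:
        """Toy RLI: keeps only the latest summary per LRC (soft state)."""

        def __init__(self):
            self.latest = {}                        # LRC name -> BloomFilter

        def receive_summary(self, lrc_name, bloom_filter):
            self.latest[lrc_name] = bloom_filter    # older state is replaced

        def might_hold(self, lrc_name, guid):
            bf = self.latest.get(lrc_name)
            return bf is not None and guid in bf


    def publish_loop(lrc_name, lrc, rlis, interval_seconds, rounds=1):
        """Soft-state publication: every interval, push a fresh summary to
        each configured RLI; deleted entries simply drop out next round."""
        for _ in range(rounds):
            summary = lrc.summary()
            for rli in rlis:
                rli.receive_summary(lrc_name, summary)
            time.sleep(interval_seconds)


    if __name__ == "__main__":
        lrc, rli = LocalReplicaCatalog(), ReplicaLocationIndex()
        lrc.register("guid-0001", "srm://se.example.org/file1")
        publish_loop("se.example.org", lrc, [rli], interval_seconds=0)
        print(rli.might_hold("se.example.org", "guid-0001"))   # True
        print(rli.might_hold("se.example.org", "guid-9999"))   # almost surely False

The interval argument of publish_loop corresponds to the configurable time
interval mentioned above, and it is the source of the latency concern
raised in the next paragraph: a dependent job may query the RLI before the
producing job's new entries have been pushed.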
  The time interval before new information is pushed to the RLI might be
  critical, in particular with DAGMan scheduling, when a dependent job
  relies on the data having been produced by the other job.
  What is the policy for assigning LFNs to concurrent, competing requests?
  First come, first served. LFNs are stored in the RMC.

WP3: (slides on agenda page)
  Mediator:
  - current: answers simple queries from one table.
  - next week(?): joins over tables which are within one archiver (storing
    full tables).
  - medium to long term: joins with archivers that do not archive whole
    tables; hierarchies of archivers.
  Registry replication:
  - set of registry instances, geographically dispersed.
  - each registry will have all the information about all the producers and
    consumers within a VO; there is no 'master' registry.
  - replication is invoked periodically (the interval might be adjusted
    dynamically depending on the workload inside the system).

WP4: (slides on agenda page)
  LCAS server: does not require root; uses a policy description language
  (PDL). Plug-ins:
  - allowed users (grid-mapfile or allowed_users.db)
  - banned users (ban_user.db)
  - timeslots
  - VOMS authorization based on user certificate and job specification.
    What exactly is the job specification used for? There are open
    questions between WP4 and WP1.
    => action on Piotr and Fab to work that out.
  LCMAPS:
  - provides local credentials for jobs: UNIX credentials, AFS tokens, Krb5.
  - backwards compatible with existing systems (gridmapfile, k5cert).
  - needs to run in privileged mode.
  - has to run in the process space of incoming connections.
  RMS, Monitoring and Fault Tolerance.

WP5: (slides on agenda page)
  High-priority issues:
  - migration tool
  - asynchronous requests
  - SRM v1 interface
  - additional SRM functionality (exists, delete) - these depend on disk
    cache management if done properly
  - collaboration with WP9 and WP10 (the WP10 issues have been discussed in
    a break-out session)
  - support for EDG 2.0
  - documentation
  Jens also explained what will not be in the SE as currently planned.

Use case changes:
-----------------
=> action on Lee: put the diagrams into CVS.
- WP1: the interaction of the job-wrapper with the local logger needs to be
  added.
- WP2: RLS changes are internal; the attribute type now also needs to be
  specified when a new attribute is created.
- WP3: check R-GMA (gin, gout).
- WP4: all internal.
- WP5: check asynchronous calls; update to SRM notation.

Baseline API:
-------------
Leanne received all the interface RPMs needed. A new version is hoped for
by the end of June.

ETT Discussion:
---------------
Cal summarized his document (see text attached to agenda).
Jeff presented work done by a master's student in Leiden (see slides on the
agenda page): in addition to what is listed on the slides, it is also
important to take into account the different priorities assigned to VOs at
specific sites. He proposes to use the Maui simulator, but there are
concerns about the runtime of doing that. The approach uses statistics from
the information in the LRMS on historic jobs. With R-GMA the script could
be run in a canonical producer once it is queried - this might take too
long. It is better to run the script at fixed time intervals and publish
the results in the information system.
Julian: maybe we should think of a different architectural approach, e.g.
the CE actively polling for new jobs at the brokers, i.e. the broker
managing the queues of the CEs.
Question to Sergio: how can the ETT be published per VO in Glue? The ETT
could be multivalued - then the VO needs to be encoded in the value and the
broker (actually not the broker, but the user, who would have to specify an
appropriate expression in the JDL) would have to parse it (see the sketch
below).
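As an illustration of why the multivalued option is considered awkward,
here is a minimal sketch, assuming a hypothetical encoding in which each
value of the attribute is a "vo:seconds" string; the attribute name and
the encoding are illustrative only and are not part of the GLUE schema.

    # Minimal sketch of a per-VO ETT published as a multivalued attribute.
    # The "vo:seconds" encoding is an assumption made for illustration.
    def parse_per_vo_ett(values):
        """Turn ['atlas:1200', 'cms:300', ...] into {'atlas': 1200, 'cms': 300}."""
        ett = {}
        for value in values:
            vo, _, seconds = value.partition(":")
            ett[vo.strip()] = int(seconds)
        return ett


    def ett_for_vo(values, vo, default=float("inf")):
        """What the broker (or the user's JDL expression) would effectively
        have to compute: the ETT for one VO from the multivalued attribute."""
        return parse_per_vo_ett(values).get(vo, default)


    if __name__ == "__main__":
        published = ["atlas:1200", "cms:300", "lhcb:45"]   # hypothetical CE entry
        print(ett_for_vo(published, "cms"))                # -> 300

Even this toy version shows that the selection logic moves from a simple
numeric comparison to string parsing inside the requirements expression,
which is the complication noted in the conclusions below.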
Conclusions:
------------
- A complete architecture change (as discussed by Julian) is not taken into
  account for the moment.
- The simulator developed at NIKHEF looks promising and is worth further
  testing. The main concerns are: runtime, load on the CE, and support for
  all required batch systems. Jeff to report on these issues.
- Computation at the CE (e.g. via a canonical producer) for every job seems
  to be too computationally intensive - it is better to have a daemon that
  periodically produces the data and publishes it into the information
  system.
- It is not yet clear how to publish it in Glue - a multivalued attribute
  seems to be too complicated. Various possibilities should be proposed to
  the broker developers of WP1 to get their feedback.
  => Action on Fab & Sergio: discuss with the broker developers.

Outbound connectivity:
----------------------
WP1:
- job manager contacts the WMS to update the status: the job-wrapper logs
  to a local logger on the CE, which pushes the information forward to the
  LB - so no problem.
- transfer of the input/output sandbox: this currently requires outbound
  connectivity, but with GRAM 1.6 (included in GT 2.2, Condor-G v6.5) files
  could be staged the same way as the executable - this should solve the
  problem. The features are already distributed in the current EDG
  distribution but not turned on - needs testing.
- interactive jobs: these require outbound connectivity - it is not clear
  how this could be prevented. It would be interesting to get the opinion
  of the Condor people on this issue.
  => Action on Fab: ask the Condor people.

WP2:
- service discovery: the replica manager needs to contact the IS.
  Discussed in the WP3 section.
- output data registration: the replica manager client needs:
  - registration in the LRC - OK in 2.1, but not in 2.0.
  - registration of metadata and LFN - needs access to the RMC - not OK.
    - NAT would solve it, or
    - an RMC proxy service is needed, probably on the CE; it would probably
      need performance tuning.
- data lookup: the replica manager client:
  - needs to contact the RMC to resolve LFNs to GUIDs - not OK.
  - needs to contact the RLI to resolve GUIDs to LRCs:
    - scenario 1: one RLI per LRC - OK.
    - scenario 2: sites without RLIs exist - not OK.
      - NAT, or
      - RLS proxy service.
  - needs to contact all (or a subset of) the LRCs to find out the PFNs:
    - scenario 1: only local information is needed - go to the local LRC -
      OK.
    - scenario 2: all replicas are wanted - not OK.
      - NAT, or
      - the RLS proxy service will handle it.
- replicate data to the site:
  - replicate to the local SE:
    - 3rd-party gridFTP - not OK.
      - NAT, or
      - SRMcopy.
- replicate data from the site:
  - store in the local SE (either directly, or copy there from the WN) -
    OK; for registration see above.
  - copy to another SE - see replicate above.
  - SRMcopy: control goes to the source SE.
    - copy file out - OK.
    - copy file in - not OK - need to use SRMprepareToGet instead, and the
      destination SE will try to get the data.

WP3:
- R-GMA server at each site - all connections go via it - so it is OK.
- if the server goes down we lose all clients - fault tolerance could mean
  contacting another server - if it is on the same site - OK - otherwise a
  problem.

Glue Discussion (Sergio) (see slides on agenda page):
-----------------------------------------------------
Important: the upcoming ATF document should be synchronized with GLUE to
have a common terminology.
Sergio first gave a short overview presentation on GLUE. There is currently
no mechanism for ensuring consistency among the different implementations
of the reference UML model.
Then the CERN setup problem was discussed: have 1 batch system and multiple
gatekeepers pointing to it, since the gatekeeper does not scale as well as
the batch system.
VDT would like to integrate the LDAP schema and the EDG information
providers; that is fine from an EDG point of view, but the packaging must
be done in such a way that EDG could take VDT minus the information
providers, since EDG will most probably be ahead of VDT in that respect.
Sergio will find out with Alain Roy.

Sergio suggested a procedure for schema modifications:
- write a short document with the proposed modifications and the rationale
- send the document to the mailing list
- set up a phone conference
- proposals should come from a project rather than from individuals
WP3 should act as the permanent contact point to GLUE, taking part in these
discussions and forwarding change requests to the concerned WP and the ATF.
This needs further discussion inside EDG.

Things that are not of general interest, but only specific to a certain
domain, typically do not go into the schema but end up in extensions.
Extensions are dangerous since they prevent interoperability. EDG should
work to bring its current extensions into the GLUE schema.
=> We need somebody responsible for EDG extensions!!!

Discussion on the Service proposal: it is not clear where the protocol a
service speaks is specified. This needs clarification. Sergio would like
to get some examples. Cal will provide the past email exchange with Steve
Fisher to Sergio.

Authorization information: Sergio pointed out that the currently proposed
scheme has problems since no group information is published. WP1/WP4/SCG
need to work out a suitable format.

Are there attempts inside GLUE to assign exact semantics to the fields in
the schema? - Currently not.
Is the schema capable of expressing whether there is POSIX access between
a CE and an SE? - Yes, via the access point in the bind table.

Architecture document:
----------------------
It is agreed that such a document would be useful. Its structure should be
similar to the one compiled by Lee for the 2nd review. If possible, a
journal publication should be extracted from this document. The
decomposition should be component-based, not WP-wise. It should be finished
by the time of the final review. It will be written in LaTeX and stored in
the ATF CVS area.
Things to be done:
- Editor - Lee proposed. (timeframe: 1 week)
- Decide on a TOC - will start that via email (~2 weeks, starting now,
  based on the TOCs of the two documents mentioned).
- Assign sections to people - (worry about the list of authors).
- First draft by the time of the Heidelberg conference? At least a final
  TOC, a short description of the section contents, and people assigned.
Probably most of the work will be done in January. We will need plenary
ATF meetings to go over the individual contributions and harmonize them.

Next ATF:
Heidelberg Conference: 26th Sep. afternoon - 1st Oct. noon.
Tentative ATF: Saturday 27th Sep. (1/2 day).