Speaker
Description
Detailed analysis
This work exploits the gsiftp and gridftp logs, which provide six relevant fields: identifiers for the user and the requesting machine, the direction of the request (get or put), the file name, the transfer size, the outcome (success or failure), and timestamps. The planned tool will therefore be able to represent the output of any monitoring system that provides similar information. Two graphs capture the data-user relationship: the bipartite data access graph, in which users and files are nodes and an edge connects a user to each file she uses; and the data-sharing graph [1], in which nodes are users and edges connect users with similar interests in data. As in most social networks (SNs), our data require true multi-scale visualization, but with a strong requirement for smoothness and continuity: for instance, file popularity is much more heavy-tailed than a classical Zipf's law. Moreover, useful attributes can be extracted from the file names (e.g. output error files versus data files), which adds to the need for interactive selection of the representation. Finally, the system should allow easy manipulation of the time frame and create representations with the appropriate level of detail.
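As an illustration of the two graphs described above, the following minimal Python sketch builds them from already-parsed transfer records. It is not the planned tool: the `Transfer` record layout, the function names, and the `min_common` threshold are assumptions introduced here, and "similar interests" is approximated by the number of files two users have in common within the selected time frame.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Transfer(NamedTuple):
    """One parsed log record (hypothetical layout covering the six fields)."""
    user: str
    host: str
    direction: str   # "get" or "put"
    file_name: str
    size: int        # transfer size in bytes
    success: bool    # outcome
    timestamp: float


def access_graph(transfers, start, end):
    """Bipartite data access graph as a mapping user -> set of files.

    Only successful transfers with start <= timestamp < end are kept,
    which is one simple way to manipulate the time frame.
    """
    files_by_user = defaultdict(set)
    for t in transfers:
        if t.success and start <= t.timestamp < end:
            files_by_user[t.user].add(t.file_name)
    return dict(files_by_user)


def sharing_graph(files_by_user, min_common=1):
    """Data-sharing graph as a mapping (user_a, user_b) -> common file count.

    Two users are linked when they have at least `min_common` files in
    common (an assumed similarity criterion).
    """
    edges = {}
    for u, v in combinations(sorted(files_by_user), 2):
        common = len(files_by_user[u] & files_by_user[v])
        if common >= min_common:
            edges[(u, v)] = common
    return edges
```

Calling `sharing_graph(access_graph(records, t0, t1), min_common=2)` with different values of `t0`, `t1` and `min_common` gives coarser or finer views of the same logs. The pairwise loop is quadratic in the number of users, so it is suitable only for small samples.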
Impact
The Grid Observatory experience confirms the general and urgent need for information visualization. The challenges of visualization for SN exploratory analysis have recently been formalized from users' requirements [2,3]. One of the most important is that scientists do not want just to draw a graph, but 1) to build several representations according to different attributes and 2) to compare actor clusters across representations and identify a consensus among them. In order to build a state-of-the-art framework, this work is carried out as a collaboration between the INRIA AVIZ group and the Grid Observatory, with support from Paris-Sud University. The AVIZ group has developed the GraphDice [4] environment, a multivariate network visualization system for exploring the attribute space of edges and actors. This environment has proved to be both user-friendly and information-rich. Moreover, the capacity to deal with a large number of attributes is essential for future integration with the other sources of information provided by the Grid Observatory, for instance the actors' VOs, their usage of computing resources, and failure events.
Conclusions and Future Work
Resource sharing is the defining feature of Grids. Identifying and characterizing the sharing patterns amongst data and users gives a partial view of the more elusive e-science network, which is ultimately about the research the users are actually conducting with that data. Such characterization is also a requirement for addressing both long-term dimensioning and short-term allocation of resources. Building on state-of-the-art research, we propose to contribute to these goals through information visualization. The planned tool exploits only generic information and will thus be generally applicable.
[1] A. Iamnitchi, M. Ripeanu and I. Foster. Small-World File-Sharing Communities. Infocom 2004, Hong Kong, March 2004.
[2] A. Aris, B. Shneiderman. Designing semantic substrates for visual network exploration. Information Visualization 6, 4 (2007), 281–300.
[3] N. Henry and J-D Fekete. MatrixExplorer: a dual-representation system to explore social networks. IEEE Transactions on Visualization and Computer Graphics 12, 5 (2006), 677–684.
URL for further information | www.grid-observatory.org |
---|---|
Keywords | interactive visualization, social networks |