In the era of data-driven innovation, modern storage technologies are pivotal in addressing the exponential growth of information and the demand for higher performance, efficiency, and scalability. This presentation explores key advancements shaping the storage landscape, including on-premises/cloud storage tiering, computational storage, archival storage, and ultra-fast NVMe SSDs. In this...
On May 15, 2024, about 30 participants from our community joined an online workshop about JupyterHub. Among them was the lead of the JupyterHub project. Seven presentations were given, each followed by time for discussion and questions. The workshop concluded with a longer community discussion.
This presentation will highlight some key take-aways from...
SWAN (Service for Web-based Analysis) is CERN’s cloud-based platform that streamlines scientific data analysis and collaboration by offering users an integrated Jupyter-based environment with seamless access to resources such as EOS/CERNBox and CVMFS. In this talk, we will present the current state of the project and highlight the advancements made in 2024 to address new use cases. These...
Since 2017, the cosmo sim web portal has allowed access to and sharing of the output of large cosmological hydrodynamical simulations with a broad scientific community, and it continuously grows in the services and data made available. It is based on a multi-layer structure: a web portal, a job control layer, a computing cluster and an HPC storage system. The outer layer enables users to choose...
Does the idea of running and managing an “own” web server for every user, with the corresponding user/group IDs, and securing access to it sound crazy? Ten years ago it definitely did, but not in the current Cloud Native world. KubePie facilitates streamlined access to end-user data by harnessing Kubernetes' scalability and deployment capabilities to...
On Nov 20, 2024, about 12 participants and developers from our community joined an online workshop about Open Cloud Mesh. In addition to the three presentations, a lively technical discussion took place to address issues such as security and discovery in a federated cloud environment.
This presentation will highlight the main take-aways and serve as an introduction to the Campfire session about OCM.
In our talk at CS3 2025 we will discuss the sync & share service provided by the EOSC EU Node and its contribution to the horizon of sync & share services for research and academia in Europe.
EOSC EU Node is a set of data-centric cloud services supporting Open Science, Open Data and FAIR. The node is owned by the European Commission and implemented by a group of contractors. PSNC is the main...
The current implementation of federated sharing between OCM-compliant sync-and-share services requires end users to exchange Federated Cloud IDs. Although this approach works, it is not very user-friendly. During the EU-funded CS3MESH4EOSC project, a so-called invitation workflow was implemented in Go, based on REVA-based sync-and-share services like CERNBox....
In the GN5-2 project, the GÉANT Association will develop investment proposals for three service concepts with a large, long-term expected impact that may require significant future investment and commitment from the NREN community. Each service concept will only succeed if a concerted collective effort is made over time. The presentation will introduce each concept at a high level, explain the...
Learn how Host-Managed Shingled Magnetic Recording (HM-SMR) drives and selective write-grouping can transform software-defined storage (SDS) environments. Selective write-grouping and Popular Data Concentration (PDC) both work with Shingled Magnetic Recording (SMR) and Conventional Magnetic Recording (CMR) disks using erasure coding. By restricting write operations to fewer drives, selective...
We report on the new, CERNBox-based stack, incorporating both Samba and CephFS, which is now providing Windows storage at CERN.
For nearly 30 years, the Distributed File System (DFS) ecosystem served as the main data storage platform for the Windows operating system at CERN. With a new demand for collaboration and access from anywhere worldwide, CERNBox has become the natural candidate to...
OpenCloud comes with a never-before-seen level of integration with the underlying storage system. It allows transparent access to files on a POSIX filesystem. The new driver allows users to either work with OpenCloud or use external tools that directly access files on the storage. Local filesystems as well as enterprise network filesystems have already been integrated. OpenCloud picks up...
In today's fast-paced academic and research environments, efficiency is key. This presentation introduces n8n, an open-source workflow automation platform that can transform how educational institutions and researchers manage their daily tasks and data processes.
Key Points:
- Seamless Integration: n8n connects with over 350 applications, allowing for easy automation of tasks across various...
When proprietary products are discontinued or bought up by competitors, your own (decision-making) freedom can quickly become precarious. A discontinued product forces you to migrate, incurs costs, and imposes unwelcome decisions.
Not so with open source software: it offers me the freedom to develop the code further at any time, either on my own or with new partners, and to set up new...
In this contribution we will touch on the recent developments and the operational experience running Reva as part of CERNBox, the CERN cloud storage, and on our plans to support Reva and the CS3APIs for the community.
With the CS3APIs having reached a good level of maturity, over the past year we consolidated the Reva implementation and improved its dependability, in particular with respect to...
The Educloud service at the University of Oslo provides our researchers and collaborators with access to a suite of tools for collecting, storing and sharing data. It comprises several services, including a Nextcloud service.
After a survey of the different on-premises and cloud storage services and sync tools we currently offer, we have seen the need for a more strategic approach to end user...
Elettra is a multidisciplinary research center running two particle accelerators producing synchrotron light. In 2017 we looked into software to replace our aging Windows file share server, used for hosting non-scientific data. This research led us to Seafile. My presentation will introduce you to our initial reasoning and show you the evolution of our Seafile infrastructure...
Backed by 20 years of successful development and operation of the largest Italian research e-infrastructure through the Grid, the Italian National Institute for Nuclear Physics (INFN) has for the past four years been running INFN Cloud, a production-level, integrated and comprehensive cloud-based set of solutions, delivered through distributed and federated infrastructures.
INFN Cloud...
This talk will provide an overview of OpenCloud, including its team, core features, and future plans. We will describe the platform's architecture, its capabilities for data storage, processing, and collaboration, and present the roadmap for upcoming updates and enhancements. The session will offer insights into how OpenCloud aims to support scientific research and data-driven projects.
This talk will give an overview of the Nextcloud developments and improvements in the last 12 months. Several noteworthy things happened in the latest Nextcloud releases, from architectural improvements and changes to the APIs and the sync engine to usability and functionality. This talk will give a full overview.
In this session, we'll provide an in-depth look into the latest achievements and features of ONLYOFFICE, a leading open-source office software project with a focus on secure document processing.
We will cover the following:
• What is new in ONLYOFFICE over the past year, including a full-featured collaborative PDF Editor, integration news, etc.
• How to organize effective teamwork...
Join us to hear about the latest work from the world of Collabora Online (COOL). Hear about the new integration points and APIs that let us create a richer integration between storage and our security-focused, truly open-source, online office suite.
In this session we’ll show you why File Sync & Share and LMS provisions are integrating Collabora Online into their products. Hear how EFFS can...
SeaTable is the world-leading self-hosted no-code platform. SeaTable enables you to develop and build efficient business processes in the shortest possible time. You can easily design your database structure, store any kind of data, define access rights for your team or externals and visualize your data with various charts. Automations help to streamline your work. Digitalization or creation of...
In an era where web search serves as a cornerstone driving the global digital economy, an open, impartial and transparently produced web index is a key opportunity for Europe and beyond. Currently, the landscape is dominated by a select few gatekeepers who provide their web search services with minimal scrutiny from the general public. Moreover, web data has emerged as a pivotal element in the...
In today's research landscape, managing and processing a high volume of data has become crucial in many fields. Many researchers make use of remote computing resources to process large data volumes. High-volume data transfers between research institutions and High-Performance Computing Clusters (HPCC) have thus increased in importance, as large data sets can require hours or days to...
PSDI is a nationally funded UK programme that analyses the needs of the physical sciences for a common data infrastructure and develops guidance, training and technology to address these needs. PSDI's main objective is to serve research use cases originating in experimental “bench science” and simulations with applications in physics, chemistry, materials research or engineering. The main challenge to...
A significant amount of research data remains underutilized due to being unpublished or poorly described, leading to a loss of funding and scientific potential. To recover lost data and prevent further waste, researchers must be encouraged to use FAIR data sharing repositories and adopt good publishing practices, such as providing descriptive metadata and utilizing datasets from the community....
In the recent atmospheric and oceanic measurement campaigns (EUREC4A and ORCESTRA), we used the InterPlanetary File System (IPFS) to store, use, synchronize and share measurement data. IPFS uses content addressing (instead of location-based addressing) and provides an easy-to-set-up peer-to-peer network for sharing data on servers and portable devices.
Because of these features, we were able...
From the perspectives of different data (re-)use cases (from University Library, Research Funding Support, Super Computing Centre and LMU Physics Department), this talk will focus on the many aspects of FAIR data. In practice, data should be handled in accordance with the FAIR (Findable, Accessible, Interoperable, Reusable) principles – but what does this mean in scientific day-to-day work?...
We have been running sync-and-share services for about 11 years now, and a recurring request from our users is the ability to "park" data from finished projects somewhere else. "Somewhere else" often also means a place where others can find it. For this reason we have developed the SURF Research Data Connector (SRDC). This is a service sitting between the sync-and-share service and...
Data repositories play an essential role in Research Data Management according to the “FAIR principles” (Wilkinson et al. 2016) and leave less and less to be desired. However, for technical, financial or organisational reasons, they usually cannot accommodate huge datasets approaching the PB range, such as those that come from supercomputing. In fact, for such datasets even movement to an external...
In the era of data-intensive research and data science, the challenge of managing research data effectively while ensuring FAIR principles isn't just a technical problem—it's a collaborative one. This presentation explores how the needs of researchers, research software providers, and research IT can be addressed together by a commitment to vertical interoperability between research tools and...
As research communities increasingly rely on cloud-native tools and workflows, integrating High-Performance Computing (HPC) environments with cloud storage has become ever more important. From the perspective of an IT support team, this presentation outlines our initial approach towards a lightweight, easily deployable S3 layer on existing POSIX-compliant storage, enabling scientists to...
Onedata [1] is a high-performance, distributed data management system designed for global infrastructures. It provides seamless access to heterogeneous storage resources and supports diverse use cases ranging from personal data management to large-scale scientific computations. Leveraging a fully distributed architecture, Onedata facilitates the creation of hybrid cloud environments that...
Working on highly sensitive research data is challenging and is often hindered by both legal and technical obstacles. It requires not just secure data storage, but also secure data processing, and a secure collaborative platform. The purpose of this talk is to show how the University of Oslo (UiO), the University of Bergen (UiB) and the Norwegian University of Science and Technology (NTNU) in...
In the early days of artificial intelligence (AI) during the 1950s, two primary approaches emerged. One was engineering-oriented, while the other focused on computational modeling of human decision-making processes, later termed "computational intelligence", and is strongly determined by three fundamental time-constrained limitations: data, computation, and communication. Modern AI development...
Imagine a world where your documents organize themselves, where finding the right file is as simple as asking a question, and where AI does the heavy lifting of managing your digital information.
From legal firms to healthcare, from small startups to global enterprises, AI is reshaping how we interact with our digital information.
In this presentation, I will unveil how artificial...
More information: https://wiki.geant.org/display/CISS/15th+SIG-CISS+meeting