- Guido Aben (AARNet)
- Jakub Moscicki (CERN)
We'll present two recent innovative file syncing technologies in Seafile: the Drive client and the real-time backup server.
Seafile Drive Client
First introduced by Dropbox around 2007, file syncing has become an increasingly common technology over the last few years. Services like Dropbox, OneDrive and Google Drive are broadly similar to each other: syncing/replicating files across...
Sync and share services like ownCloud often rely on database backends for storing metadata. These databases must offer high availability and good performance; with a clustered database, both requirements can be fulfilled.
One method of running a database cluster is master-master Galera replication. In practice, the database performs more robustly if write requests are sent to a...
Research on the performance of cloud synchronization services like ownCloud, Seafile and Dropbox has shown that on-premise services exhibit better performance characteristics than public clouds when syncing big files (higher transfer rates in both upload and download can be obtained, thanks to the simpler implementation and lower user activity per unit of bandwidth) and are very competitive syncing...
A study of using a distributed microservices architecture for synchronization and sharing
One of the main challenges of on-premise file sync and share solutions is scalability. It is essential that solutions scale from small to very big installations with millions of users and petabytes of files. This talk will present the current approaches to scaling a system, including a case study of how to scale to millions of users. It also presents new approaches to bring the scalability of on...
ownCloud uses the filecache table to propagate file changes through the file hierarchy.
Under heavy usage this causes the table to become a bottleneck, because multiple UPDATEs
might wait for a lock on the same tuple. By storing the metadata directly in a
filesystem's extended attributes and ACLs, we can completely get rid of the filecache table.
Using existing filesystem capabilities we can scale...
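The idea can be sketched as follows: instead of cascading UPDATEs through a filecache table, each change writes an etag into an extended attribute on the file and on every ancestor directory. This is a minimal illustration, assuming Linux `user.*` extended attributes; the attribute name `user.sync.etag` is a hypothetical choice, not ownCloud's actual schema.

```python
import os

ETAG_ATTR = b"user.sync.etag"  # hypothetical attribute name for illustration


def parents(path, root):
    """Yield path and each ancestor directory up to (and including) root."""
    path = os.path.normpath(path)
    root = os.path.normpath(root)
    while True:
        yield path
        if path == root:
            return
        path = os.path.dirname(path)


def propagate_change(path, etag, root="/"):
    """Store the new etag on the file and refresh every ancestor directory,
    replacing the UPDATE cascade on the filecache table with per-inode
    extended-attribute writes (os.setxattr is Linux-only)."""
    for p in parents(path, root):
        os.setxattr(p, ETAG_ATTR, etag)
```

Because each write touches only the inodes on one path, concurrent changes in different subtrees no longer contend for the same database tuple.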
Scientific exploration and exploitation of data is undergoing a
revolution as communities explore new ways of analysing their data.
One solution that is being used increasingly is sync-and-share, where
data, presentations, graphs and code are shared in an ad hoc fashion.
This allows communities to explore data in new and innovative ways.
Sites that have already invested in dCache to solve...
We started CERNBox in 2013 as a small prototype based on simple NFS storage and one of the initial versions of the ownCloud server. Some 3 months and 300 users later, we had received enough enthusiastic feedback to consider opening the sync&share service at CERN. Since then we have witnessed a rapidly growing service in terms of the number of accounts, files, transfers and daily accesses. At the same...
Seafile is a scalable and reliable sync&share solution. Its synchronisation engine and data model are based on Git concepts, adapted to dealing with large files and datasets. Seafile synchronises data based on filespace snapshots rather than per-file or per-data-object versioning, and performs deduplication with a content-defined chunking algorithm. The architecture and implementation introduce...
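The deduplication step can be illustrated with a toy content-defined chunker: a rolling hash over a fixed window decides chunk boundaries, so identical content produces identical chunks regardless of its offset in the file. This is a minimal sketch; Seafile's actual window size, hash function and chunk-size parameters differ.

```python
WINDOW = 48                    # bytes covered by the rolling hash
BOUNDARY_MASK = (1 << 11) - 1  # ~2 KiB average chunk size for this demo
PRIME = 263
MOD = 1 << 32


def chunks(data: bytes):
    """Split data wherever the rolling hash of the last WINDOW bytes hits
    the mask, so boundaries depend only on local content, not on offsets."""
    out, start, h = [], 0, 0
    pw = pow(PRIME, WINDOW, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        h = (h * PRIME + byte) % MOD
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * pw) % MOD  # drop the outgoing byte
        if (h & BOUNDARY_MASK) == BOUNDARY_MASK:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out
```

Because boundaries are content-defined, inserting a few bytes at the start of a file shifts only the first chunk or two; later chunks re-synchronise and deduplicate against the previous version.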
Nowadays, due to the data deluge and the need for high availability of data, online file-based data stores have gained an unprecedented role in facilitating data storage, backup and sharing. To date, the role of these file storage systems has been largely passive, i.e. they host files and serve them to clients upon request.
The simplistic approach of these file data stores...
At Dropbox, with thousands of MySQL servers, failures like hardware errors are normal, not exceptional. Not a day passes without replacing at least one server with some kind of hardware error. Our on-call engineers are not alerted for these; they are alerted only if the automation is not working properly.
This kind of automation is harder with stateful systems, so we wrote a general framework...
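The "alert only when automation fails" pattern described above can be sketched as a remediation loop. All names here (`is_healthy`, `replace`, `page_oncall`) are hypothetical stand-ins for Dropbox-internal tooling, not its actual framework.

```python
def remediate(hosts, is_healthy, replace, page_oncall):
    """Walk the fleet, replace unhealthy hosts automatically, and page a
    human only when the automation itself fails."""
    for host in hosts:
        if is_healthy(host):
            continue
        try:
            # e.g. provision a spare, restore state, repoint replicas --
            # the expensive, stateful part that makes this automation hard
            replace(host)
        except Exception as exc:
            page_oncall(f"automation failed for {host}: {exc}")
```

The key design choice is that a routine hardware failure produces no alert at all; humans see only the cases the machinery could not handle.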
Experiences of containerising a traditional software stack
This presentation covers AARNet's experience of converting a traditional monolithic software stack to run inside a fully containerised, dynamically provisioned Docker-based container system.
The entire stack, from the front end TLS proxies to the backend scale out storage, the metrics,...
The IT Storage group at CERN develops the software responsible for archiving to tape the custodial copy of the physics data generated by the LHC experiments. This software is codenamed CTA (the CERN Tape Archive). It needs to be seamlessly integrated with EOS, which has become the de facto disk storage system provided by the IT Storage group for physics data.