Speaker
Description
Yet Another Rsync (yarsync) is a Python wrapper around rsync, a well-established Linux tool, with a simple and familiar git-like interface. Python allows us to build a higher-level tool that is safer and sometimes more efficient than the original binary.
While many data analysts today rely heavily on databases and cloud computing, other approaches also have their benefits. Many kinds of data are difficult or time-consuming to represent in relational databases. Files in a user-defined format are then a simpler and more general solution, often less expensive and less error-prone. Linux runs on a considerable share of servers today, and many data analysts also use it as a good programming environment. Our approach is inspired by the data analysis workflow in high-energy physics (HEP). We will discuss how to create data repositories with yarsync, the relevant rsync features, and how the tool helps guard against possible problems during data synchronization.