Description
In today's research landscape, managing and processing large volumes
of data has become crucial in many fields, and many researchers rely
on remote computing resources to do so.
High-volume data transfers between research institutions and
High-Performance Computing Clusters (HPCC) have thus increased in importance,
as large data sets can require hours or days to transmit in
non-optimized settings. Moreover, security compliance requires the use
of firewalls, which are often costly and/or slow down data transfers.
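To make the "hours or days" figure concrete, the following is a minimal back-of-the-envelope sketch. The 100 TB data set size, the link rates, and the function name `transfer_time_hours` are illustrative assumptions, not measurements from any deployment, and the calculation ignores protocol overhead and firewall slowdowns.

```python
def transfer_time_hours(size_bytes: float, rate_bps: float,
                        efficiency: float = 1.0) -> float:
    """Idealized time in hours to move size_bytes over a link of rate_bps.

    efficiency is the fraction of the nominal link rate actually achieved
    (1.0 means a perfectly utilized link, which is an assumption).
    """
    if rate_bps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("rate_bps must be > 0 and efficiency in (0, 1]")
    seconds = size_bytes * 8 / (rate_bps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    dataset_bytes = 100e12  # hypothetical 100 TB data set
    for label, rate in [("1 Gbps", 1e9), ("10 Gbps", 10e9), ("100 Gbps", 100e9)]:
        print(f"{label}: {transfer_time_hours(dataset_bytes, rate):.1f} h")
```

Under these assumptions the same data set needs roughly 222 hours at 1 Gbps, 22 hours at 10 Gbps, and a little over 2 hours at 100 Gbps, which is why sustained line-rate transfers matter.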
To resolve these issues, we have developed Hercules and
LightningFilter, which make use of the SCION next-generation Internet
to achieve security and efficiency. The Hercules data transfer
application offers sustained transmission and reception speeds of
around 100 Gbps, including reliable delivery as well as congestion
control. LightningFilter is an open-source firewall implementation
that can process minimum-sized packets at rates in excess of 100 Gbps
on a standard mid-range server.
LightningFilter can satisfy firewall compliance rules and enables
autonomous systems (ASes) to cryptographically verify, restrict, and
police incoming connections, whether from other ASes or from specific
hosts. This allows the HPCC to implement distinct rate limits for
different universities while ensuring a guaranteed throughput for
particular hosts. The
open-source implementation of these tools facilitates a low-cost yet
high-performance file transfer service.
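The idea of distinct rate limits per source AS can be illustrated with a token bucket per AS. This is only a conceptual sketch and not LightningFilter's actual implementation: the class names, the AS identifiers, and the rates are hypothetical, and the cryptographic source verification that would precede such a check is omitted.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_bps: float      # sustained rate in bits per second
    burst_bits: float    # bucket capacity in bits
    tokens: float = 0.0  # currently available bits
    last: float = 0.0    # timestamp of the previous update, in seconds

    def __post_init__(self) -> None:
        self.tokens = self.burst_bits  # start with a full bucket

    def allow(self, packet_bits: int, now: float) -> bool:
        """Refill according to elapsed time, then try to spend packet_bits."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False


class PerASLimiter:
    """Keeps one token bucket per source AS (hypothetical helper)."""

    def __init__(self, limits: dict[str, tuple[float, float]],
                 default: tuple[float, float]) -> None:
        self._limits = limits    # AS id -> (rate_bps, burst_bits)
        self._default = default  # limit applied to ASes without an entry
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, src_as: str, packet_bits: int, now: float) -> bool:
        bucket = self._buckets.get(src_as)
        if bucket is None:
            rate, burst = self._limits.get(src_as, self._default)
            bucket = TokenBucket(rate, burst, last=now)
            self._buckets[src_as] = bucket
        return bucket.allow(packet_bits, now)


if __name__ == "__main__":
    # Placeholder AS identifiers and limits, chosen only for illustration.
    limiter = PerASLimiter(
        limits={"1-ff00:0:110": (40e9, 1e6), "1-ff00:0:111": (10e9, 1e6)},
        default=(1e9, 1e5),
    )
    print(limiter.allow("1-ff00:0:110", 12_000, now=0.0))
```

Passing the timestamp explicitly keeps the sketch deterministic; a real data-plane filter would read a clock and operate on packet batches.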
A key component of this architecture is the deployment of data transmission
nodes, which play a crucial role in optimizing data flow. These nodes,
strategically positioned within the network, facilitate high-speed data
transfers and ensure reliable connectivity between the HPCC and researchers.
A new development in this infrastructure is the integration of Hercules with
the File Transfer Service (FTS) through the gfal2 library. This integration
streamlines the data transfer process, enhancing efficiency and reliability. By
leveraging the gfal2 library, data transfers can be integrated into existing
data processing pipelines, bridging the gap between diverse systems and
technologies, and allowing for more flexible and robust data handling
capabilities.
We are excited to present the latest advancements in SCION-based Science DMZs
and share insights from further deployments and proofs of concept, highlighting
the tangible benefits this infrastructure offers to the research community.