Speaker
Description
Traditional server network monitoring relies on specialized tools and complex queries, which demand significant domain expertise and are time-consuming to use. We propose a Digital Twin (DT) framework that provides a real-time, unified model of network behavior and supports intuitive natural-language interaction powered by large language models (LLMs).
The DT fuses live telemetry from monitoring systems with structured knowledge encoded in network ontologies developed for this purpose, and uses retrieval-augmented generation (RAG) to query server network documentation precisely.
This integration gives administrators a comprehensive view of current operational states and trends, so that they can analyze network status and potentially optimize server network architectures with greater precision.
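As a rough illustration of the fusion described above, the following Python sketch combines telemetry samples, ontology facts, and a retrieved documentation snippet into a single prompt for an LLM. It is not the framework's implementation: the `Reading` schema, the triple format, the host names, and the sample documentation are all hypothetical, and simple word overlap stands in for the embedding-based retrieval a real RAG pipeline would likely use.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One live telemetry sample (hypothetical schema)."""
    host: str
    metric: str
    value: float
    unit: str


def tokens(text: str) -> set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    words = (w.strip(".,?()").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documentation snippets by word overlap with the question.

    Assumption: overlap scoring replaces embedding similarity so that
    the sketch stays dependency-free.
    """
    q = tokens(question)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]


def build_prompt(question, readings, triples, docs) -> str:
    """Fuse telemetry, ontology facts, and retrieved docs into one prompt."""
    hosts = {r.host for r in readings}
    # Keep only ontology triples whose subject has live telemetry.
    facts = [f"{s} {p} {o}" for s, p, o in triples if s in hosts]
    live = [f"{r.host} {r.metric}={r.value}{r.unit}" for r in readings]
    parts = [
        "Live telemetry:", *live,
        "Ontology facts:", *facts,
        "Documentation:", *retrieve(question, docs),
        "Question: " + question,
    ]
    return "\n".join(parts)


# Hypothetical sample data, invented for illustration only.
READINGS = [Reading("sw01", "port_utilization", 93.0, "%")]
TRIPLES = [("sw01", "uplinks_to", "core01"), ("sw02", "uplinks_to", "core01")]
DOCS = [
    "Uplink ports on access switches should stay below 80% utilization.",
    "Firmware updates are scheduled during technical stops.",
]
QUESTION = "Is the sw01 uplink utilization too high?"
PROMPT = build_prompt(QUESTION, READINGS, TRIPLES, DOCS)
```

The resulting `PROMPT` string would then be passed to an LLM of choice; that call is deliberately left out, since the abstract does not name a model or client library.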
Our approach substantially reduces human effort, improves system reliability, and facilitates innovative network designs. Notably, it targets server network operations, whose challenges are distinct from those of data center infrastructure management (DCIM).
We plan to present prototype demonstrations and lessons learned during development, which are applicable to a range of data-taking networks in high-energy physics (HEP) experiments.