Speakers
Description
Xrootd has many concepts and configuration options, and its website hosts a comprehensive set of reference documents that explain them. Yet Google's AI-based search engine seems to prefer other Xrootd-related Wiki pages and how-to documents. We want to understand why, since the answer matters if we ever build an "AskXrootd" chatbot. In this work, we developed our own Retrieval-Augmented Generation (RAG) system and fed it those documents. We learned various techniques for converting the documents into formats preferred by Large Language Models (LLMs) and RAG systems. These include format conversion using tools or Python libraries, and format improvement using the LLMs themselves. We also learned the limitations of LLM/RAG systems: they are good at providing answers by following examples, rather than at producing answers based on a comprehensive understanding of the documents. We will share our experience and the lessons learned in this exercise. Finally, we make our RAG system available to the public via an MCP (Model Context Protocol) server, and provide a simple example configuration for accessing the RAG/MCP server using Google's free Gemini CLI.
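
For readers unfamiliar with RAG, the retrieval step can be illustrated with a minimal sketch. This is not the system described in this talk: it assumes a toy bag-of-words cosine similarity in place of a real embedding model, and the sample passages in DOCS, as well as the helper names chunk, retrieve, and build_prompt, are invented placeholders rather than text or APIs from the Xrootd documentation.

```python
import math
import re
from collections import Counter

# Placeholder passages, invented for illustration only.
# They are NOT taken from the Xrootd reference documents.
DOCS = [
    "To enable debug logging, set the log level option to debug in the config file.",
    "The cache size option controls how much disk space the proxy may use.",
    "Authentication is configured by listing the accepted security protocols.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def chunk(text, max_words=40):
    """Split a long document into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble the text that would be sent to an LLM (the LLM call is omitted)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "How do I enable debug logging?"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

In this sketch the quality of the answer depends entirely on which passages are retrieved, which is one reason the format and wording of the source documents matter to a RAG system.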