Mar 25 – 29, 2019
SDSC Auditorium
America/Los_Angeles timezone

Text Classification via Supervised Machine Learning for an Issue Tracking System

Mar 25, 2019, 2:00 PM
E-B 212 (SDSC Auditorium)

10100 Hopkins Drive La Jolla, CA 92093-0505
End-User IT Services & Operating Systems


Martin Kandes (Univ. of California San Diego (US))


Comet is SDSC’s newest supercomputer. The result of a $27M National Science Foundation (NSF) award, Comet delivers over 2.7 petaFLOPS of computing power to scientists, engineers, and researchers all around the world. In fact, within its first 18 months of operation, Comet served over 10,000 unique users across a range of scientific disciplines, making it one of the most widely used supercomputers in the history of NSF’s Extreme Science and Engineering Discovery Environment (XSEDE) program.

The High-Performance Computing (HPC) User Services Group at SDSC helps manage user support for Comet. This includes, but is not limited to, managing user accounts, answering general user inquiries, debugging technical problems reported by users, and making best-practice recommendations on how users can achieve high performance when running their scientific workloads on Comet. These interactions between Comet’s user community and the User Services Group are largely managed through email exchanges tracked by XSEDE’s internal issue tracking system. However, while Comet is expected to maintain 24x7x365 uptime, user support is generally provided only during normal business hours. With such a large user community spread across nearly every timezone, a number of user support tickets submitted outside business hours wait anywhere from 12 hours to several days for a response from the User Services Group.

The aim of this research project is to use supervised machine learning techniques to perform text classification on Comet’s user support tickets. If an efficient classification scheme can be developed, the User Services Group may eventually be able to provide automated email responses to some of the more common user issues reported during non-business hours.
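To make the idea of supervised ticket classification concrete, the following is a minimal sketch only. The abstract does not say which algorithm, features, or ticket categories the project uses; a multinomial naive Bayes classifier over word counts is assumed here purely for illustration, and the category labels and sample tickets are hypothetical, not drawn from XSEDE's issue tracking system.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTicketClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    An assumed stand-in technique; the abstract names no specific model.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # tickets per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen during training carry no evidence and are dropped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled tickets, invented for this sketch.
examples = [
    ("I forgot my password and cannot log in to my account", "account"),
    ("Please reset the password for my user account", "account"),
    ("My batch job failed with a segmentation fault", "job_failure"),
    ("The job crashed after running out of memory", "job_failure"),
    ("How can I improve the performance of my MPI code", "performance"),
    ("My simulation runs slowly on many cores", "performance"),
]
texts, labels = zip(*examples)
classifier = NaiveBayesTicketClassifier().fit(texts, labels)

print(classifier.predict("I cannot log in, please reset my password"))
```

In a setting like the one described, the predicted category could be used to select a canned response for tickets that arrive outside business hours. A real system would need far more labeled tickets, a held-out evaluation set, and a confidence threshold below which the ticket is left for a human to answer.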

Primary author

Martin Kandes (Univ. of California San Diego (US))
