Speaker
Description
Many machine learning algorithms produce computationally complex models, and further gains in model quality usually come at the cost of longer inference times. Yet such high-quality models often need to run under limited resources (memory or CPU time).
This article discusses how to trade model quality for inference speed in CatBoost, a novel boosted trees algorithm. The idea is to combine two approaches: training fewer trees and uniting trees into huge cubes. The proposed method yields a Pareto-optimal reduction in the computational complexity of a decision tree model with respect to its quality. In the example considered, the number of lookups decreased from 5000 to only 6 (a speedup factor of about 1000), while the model's AUC score dropped by less than one per mille.
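
To make the idea of uniting trees into a cube more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the implementation presented in the talk. It assumes that the trees are oblivious (each level tests a single binary feature, so a tree is just a table of leaf values), and the names `ObliviousTree` and `merge_trees` are hypothetical. Under these assumptions, several trees can be folded into one table indexed by the union of their binary features, so a single lookup replaces one lookup per tree.

```python
from itertools import product


class ObliviousTree:
    """Hypothetical oblivious tree: one binary feature per level, 2**depth leaves."""

    def __init__(self, features, leaves):
        assert len(leaves) == 2 ** len(features)
        self.features = list(features)  # binary feature index tested at each level
        self.leaves = list(leaves)      # leaf values, indexed by the bits read in order

    def predict(self, bits):
        # Build the leaf index bit by bit; the first feature is the most significant bit.
        idx = 0
        for f in self.features:
            idx = (idx << 1) | bits[f]
        return self.leaves[idx]


def merge_trees(trees):
    """Fold several oblivious trees into one table ("cube") over the union of their features."""
    features = sorted({f for t in trees for f in t.features})
    table = []
    # product() enumerates bit combinations in the same order predict() indexes them.
    for combo in product((0, 1), repeat=len(features)):
        bits = dict(zip(features, combo))
        table.append(sum(t.predict(bits) for t in trees))
    return ObliviousTree(features, table)


# Three small trees over three binary features (values chosen arbitrarily).
trees = [
    ObliviousTree([0, 1], [0.1, 0.2, 0.3, 0.4]),
    ObliviousTree([1, 2], [1.0, 2.0, 3.0, 4.0]),
    ObliviousTree([0, 2], [-1.0, 0.5, 0.0, 2.5]),
]
cube = merge_trees(trees)

sample = [1, 0, 1]                                 # one binarized input
ensemble_score = sum(t.predict(sample) for t in trees)  # three lookups
cube_score = cube.predict(sample)                        # one lookup
```

The merged table has 2^m entries, where m is the number of distinct binary features across the merged trees, so the number of lookups is traded for memory. This is why the choice of which trees to unite matters for staying within a resource budget.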