Deep learning architectures in particle physics often depend strongly on the order of their input variables. We present a two-stage deep learning architecture consisting of a network for sorting input objects and a subsequent network for data analysis. The sorting network (agent) is trained through reinforcement learning using feedback from the analysis network (environment). A tree search algorithm is used to explore the large space of possible orderings.
The optimal order depends on the environment and is learned by the agent in an unsupervised approach. Thus, the two-stage system can choose an optimal solution which is not known to the physicist in advance.
We present the new approach and its application to various classification tasks.
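
To make the idea of searching the space of input orderings concrete, the following is a minimal sketch rather than the architecture itself. It swaps in a simple beam search as the tree search and a fixed toy function as the feedback signal; the trained sorting network and the analysis network are not modelled. The names `toy_reward` and `beam_search_order`, the `beam_width` parameter, and the position weights are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def toy_reward(ordered: Sequence[float]) -> float:
    """Hypothetical stand-in for the environment's feedback.

    Position i contributes value / (i + 1), so placing large values
    early yields a higher score. A real environment would instead
    report the performance of the analysis network.
    """
    return sum(x / (i + 1) for i, x in enumerate(ordered))


def beam_search_order(
    objects: Sequence[float],
    reward: Callable[[Sequence[float]], float],
    beam_width: int = 2,
) -> Tuple[int, ...]:
    """Build an ordering one position at a time, keeping the best prefixes."""
    beams: List[Tuple[int, ...]] = [()]
    for _ in range(len(objects)):
        candidates: List[Tuple[int, ...]] = []
        for prefix in beams:
            # Extend each surviving prefix by every object not yet placed.
            for idx in range(len(objects)):
                if idx not in prefix:
                    candidates.append(prefix + (idx,))
        # Score each partial ordering and keep only the top beam_width.
        candidates.sort(
            key=lambda order: reward([objects[i] for i in order]),
            reverse=True,
        )
        beams = candidates[:beam_width]
    return beams[0]


if __name__ == "__main__":
    values = [3.0, 1.0, 2.0]
    order = beam_search_order(values, toy_reward, beam_width=2)
    print(order, [values[i] for i in order])
```

With this toy reward the search recovers a descending order of the input values. In the approach described above, the ordering is instead proposed by a trained agent, and the feedback comes from the analysis network.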