Variational Autoencoders for Anomalous Jet Tagging
We present a detailed study of Variational Autoencoders (VAEs) for anomalous jet tagging at the Large Hadron Collider. Taking low-level information on jet constituents as input, and trained only on background QCD jets in an unsupervised manner, the VAE encodes the information needed to reconstruct jets while learning an expressive posterior distribution in the latent space. When using the VAE as an anomaly detector, we present two approaches to detecting anomalies: comparing directly in the input space or, instead, working in the latent space. Different anomaly metrics are examined. A comprehensive series of test sets is generated to fully examine the anomaly tagging performance across different jet types. To facilitate general search approaches such as bump hunts, mass-decorrelated VAEs based on distance correlation regularization are also studied. Confronted with the problem of higher probability being mis-assigned to out-of-distribution samples, we explore one potential solution, Outlier Exposure (OE), in which outlier samples are used to guide the learning heuristics. In the context of jet tagging, OE is employed to achieve two goals at once: increasing the sensitivity of outlier detection and decorrelating the jet mass from the anomaly score. We observe excellent results on both fronts. The code implementation can be found at https://github.com/taolicheng/VAE-Jet
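As a rough illustration of the distance correlation regularization mentioned above, the following minimal sketch computes the sample distance correlation between a per-jet anomaly score and the jet mass, then adds it to a loss as a penalty. It is not taken from the linked repository: the function names, the weight `lam`, and the form `vae_loss + lam * dCor` are assumptions made for illustration, and a real training loop would need the same computation in a differentiable framework rather than NumPy.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise |x_i - x_j| matrix with row, column and grand means removed."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation of two 1-D arrays of equal length.

    The value lies in [0, 1]; at the population level it is zero only
    when the two variables are statistically independent.
    """
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2_xy = (a * b).mean()  # squared sample distance covariance
    dvar2_x = (a * a).mean()   # squared sample distance variance of x
    dvar2_y = (b * b).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # one of the inputs is constant
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))


def regularized_loss(vae_loss, anomaly_score, jet_mass, lam=1.0):
    """Hypothetical objective: VAE loss plus a mass-decorrelation penalty.

    `lam` is an assumed hyperparameter that trades reconstruction quality
    against independence of the anomaly score from the jet mass.
    """
    return vae_loss + lam * distance_correlation(anomaly_score, jet_mass)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mass = rng.uniform(50.0, 300.0, size=500)  # toy jet masses
    correlated_score = 0.01 * mass + 0.1 * rng.normal(size=500)
    independent_score = rng.normal(size=500)
    print("mass-correlated score :", distance_correlation(correlated_score, mass))
    print("mass-independent score:", distance_correlation(independent_score, mass))
```

A score that tracks the jet mass yields a distance correlation close to one, while an independent score yields a value near zero, so minimizing the penalty pushes the anomaly score toward mass independence and avoids sculpting the background mass spectrum in a bump hunt.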