
Learning to increase matching efficiency in identifying additional b-jets in the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process

The European Physical Journal Plus (EPJ Plus), 2021
Abstract

The $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process is an essential channel for revealing the properties of the Higgs boson, but it has an irreducible background from the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, which produces a top quark pair in association with a b quark pair. Understanding the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process is therefore crucial for improving the sensitivity of a search for the $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process. To this end, when measuring the differential cross-section of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, we need to distinguish the b-jets originating from top quark decays from the additional b-jets originating from gluon splitting. Since there are no simple identification rules, we adopt deep learning methods that learn from data to identify the additional b-jets in $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ events. Specifically, by exploiting the special structure of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data, we propose several loss functions that can be minimized to directly increase the matching efficiency, that is, the accuracy of identifying additional b-jets. Using synthetic data, we discuss the difference between our method and another deep learning approach based on binary classification (arXiv:1910.14535). We then verify, using simulated $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data in the lepton+jets channel from pp collisions at $\sqrt{s} = 13$ TeV, that additional b-jets can be identified more accurately by directly increasing the matching efficiency rather than the binary classification accuracy.
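The contrast between matching efficiency and binary classification accuracy can be illustrated with a small sketch. It is not taken from the paper: it assumes, for illustration only, that each event has exactly two additional b-jets, that a model assigns one score per jet, and that an event counts as matched when its two highest-scoring jets are the two labeled additional b-jets. The function names, the 0.5 threshold, and the toy scores are hypothetical.

```python
# Illustrative sketch only: the definitions below are assumptions,
# not the paper's exact formulation.

def matching_efficiency(scores, labels):
    """Fraction of events whose two highest-scoring jets are both
    labeled as additional b-jets (label 1)."""
    n_matched = 0
    for event_scores, event_labels in zip(scores, labels):
        # Rank jet indices by descending score and keep the top two.
        ranked = sorted(range(len(event_scores)),
                        key=lambda j: event_scores[j], reverse=True)
        top_two = ranked[:2]
        if all(event_labels[j] == 1 for j in top_two):
            n_matched += 1
    return n_matched / len(scores)


def per_jet_accuracy(scores, labels, threshold=0.5):
    """Binary classification accuracy over all jets, treating each jet
    independently (hypothetical threshold of 0.5)."""
    correct = 0
    total = 0
    for event_scores, event_labels in zip(scores, labels):
        for s, y in zip(event_scores, event_labels):
            correct += int((s > threshold) == bool(y))
            total += 1
    return correct / total


# Toy data: two events with four jets each (made-up scores).
scores = [[0.9, 0.8, 0.1, 0.2],
          [0.6, 0.7, 0.1, 0.65]]
labels = [[1, 1, 0, 0],
          [1, 0, 0, 1]]

print(matching_efficiency(scores, labels))  # 0.5
print(per_jet_accuracy(scores, labels))     # 0.875
```

In this toy example only one jet out of eight is misclassified, yet the second event is not matched because a top-quark b-jet outranks one of the additional b-jets. Under these assumptions, a high per-jet accuracy does not guarantee a high matching efficiency, which is the motivation for optimizing the latter directly.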
