Analysis Tutorial #7: Event selection 2

In the previous example we implemented two simple boolean cuts to select events with same-flavour or different-flavour final state. Now we will implement cut on a continuous variable.

When tau leptons decay leptonically, the decay is always accompanied by neutrinos which are invisible to the detector. Therefore, we only reconstruct the two light leptons, electrons or muons. We can calculate the invariant mass of this visible lepton pair, however, it will not  correspond to the mass of the mother Higgs boson because part of the momentum escaped in the form of neutrinos. However, this variable is can still be useful and we use it to illustrate how to implement cut on a continuous variable.

We want the cut to behave in a similar way as in case of the “cut_selectSF/DF” cuts. We want the cut to only be applied if it is explicitly set in the Python class, otherwise we want the cut skipped. One way to do this would be to create a c++ class Algorithm attribute of type “double” which would have a default value that is so large or small that the event is always accepted. However, for some variables such a value is impossible to find. Also, one can easily make a mistake with the implementation. So instead, we use classes and operator-overloading technique to implement such cuts.

Firstly, let’s make a new class called “Cut”. The header file “Cut.h”:

#ifndef CUT_H
#define CUT_H

class Cut {
 public:
  /**
  * @brief Construct a new uninitialized cut
  */
  Cut();

  /**
  * @brief Sets the cut value
  */
  void set(double value);

  /**
  * @brief Returns true if value of this cut was set
  */
  bool enabled();
 
  /**
  * @brief Overloaded comparison operators
  */
  bool operator>(const double& rhs);
  bool operator<(const double& rhs);
  bool operator>=(const double& rhs);
  bool operator<=(const double& rhs);
  bool operator==(const double& rhs);
  bool operator!=(const double& rhs);

 protected:
  /**
  * @brief Indicates whether cut was initialized
  */
  bool m_enabled;

  /**
  * @brief The cut value
  */
  double m_value;

};

#endif
  • The class has protected attribute “m_value” that holds the cut value.
  • The class has overloaded comparison operators. So we can compare the instance of the Cut variables with numbers like if it was a simple number:  cut > 23.; cut <=20; etc…
  • When the Cut instance is created using the default constructor, it’s attribute “m_enabled” is set to false and the value is set to 0.
  • We can set the cut value using the “set” method, its value is initialised and the “m_enabled” attribute is set to true.
  • In the c++ code, we can now use instances of the Cut class like this:
if(cutMinValue.enabled() && cutMinValue > some_variable) return false;

The logic is the same same as when we implemented the simple cut_selectSF/DF cuts. When the cut was not initialised in the Python class, its “enabled” method returns false and the selection is not applied. If we set the value of the cut, “enabled” returns true and we evaluate the cut expression.

The source file “Cut.cpp” will look like this:

#include "Cut.h"

Cut::Cut() : m_enabled(false), m_value(0) {
 // the cut is uninitialized
}

bool Cut::enabled() {
 return m_enabled;
}

void Cut::set(double value) {
 // set the cut 
 m_value = value;

 // enable the cut. Now it is safe to use this cut.
 m_enabled = true;
}

bool Cut::operator>(const double& rhs) {
 return m_value > rhs;
}

bool Cut::operator<(const double& rhs) {
 return m_value < rhs;
}

bool Cut::operator>=(const double& rhs) {
 return m_value >= rhs;
}

bool Cut::operator<=(const double& rhs) {
 return m_value <= rhs;
}

bool Cut::operator==(const double& rhs) {
 return m_value == rhs;
}

bool Cut::operator!=(const double& rhs) {
 return m_value != rhs;
}

Now we also have to update the c++ class Algorithm and add some new cut(s). “Algorithm.h”:

#ifndef ALGORITHM_H
#define ALGORITHM_H

#include "TString.h"
#include "TH1D.h"
#include "Data.h"
#include "Cut.h"

class Algorithm {
  public: 
  /**
  * @brief Construct a new Event Loop object
  */
  Algorithm();

  /**
  * @brief Initialize the algorithm
  * 
  * @param data - pointer to the data-access class instance
  */
  void initialize(Data* data);

  /**
  * @brief Execute. Here the stuff happens
  */
  void execute();

  /**
  * @brief Pointer to the histogram class instance defined as public attribute
  */
  TH1D* h_ditau_m = 0;
  TH1D* h_ditau_visM = 0;

  /**
  * @brief Selection cuts
  */
  bool cut_selectSF = false;
  bool cut_selectDF = false;

  /**
  * @brief Selection cuts implemented using the "Cut" class
  */
  Cut cut_maxVisMass;


 protected:

  /**
  * @brief Apply the selection
  * 
  * @return true when event passed the selection.
  */
  bool passedSelection();

  /**
  * @brief Fill the histograms
  */
  void fillPlots();

  /**
  * @brief Instance of the data-access class
  */
  Data* m_data = 0;

};

#endif
  • We have included “Cut.h” header file and added a new cut “Cut cut_maxVisMass”. This is a cut on the maximum value of the invariant mass of the visible decay products.
  • We have also added a new histogram called “h_ditau_visM” so that we can see if the cut took an effect. Generally, it is always a good idea to plot variables one uses for event selection to be able to check that everything works as expected.

In source file we implement the cut expression and fill the new histogram. “Algorithm.cpp”:

#include "Algorithm.h"

Algorithm::Algorithm() {
}

void Algorithm::initialize(Data* data) {
 m_data = data;
}

void Algorithm::execute() {

 // apply selection. Exit if didn't pass
 if(!passedSelection()) return;

 // fill the plots
 fillPlots();

}

bool Algorithm::passedSelection() {
 // select specific flavor combinations
 if(cut_selectSF && m_data->tau_0!=m_data->tau_1) return false;
 if(cut_selectDF && m_data->tau_0==m_data->tau_1) return false;

 // select invariant mass of the tau lepton visible decay products
 if(cut_maxVisMass.enabled() && cut_maxVisMass < (*(m_data->tau_0_p4) + *(m_data->tau_1_p4)).M()) return false;

 // passed all the cuts
 return true;
}

void Algorithm::fillPlots() {
 // here we fill the histograms. We protect the code against the empty pointers.
 if(h_ditau_m) h_ditau_m->Fill( m_data->ditau_mmc_mlm_m );
 if(h_ditau_visM) h_ditau_visM->Fill( (*(m_data->tau_0_p4) + *(m_data->tau_1_p4)).M() );
}
  • The invariant mass of the visible decay products is calculated using 4-vector variables “tau_0_p4” and “tau_1_p4” we have implemented in our “Data” class earlier.
    In special relativity the invariant mass is calculated as
    M^2 = P^2 = (P1 + P2)^2
    where P, P1, and  P2 are 4-vectors. In ROOT, 4-momenta are implemented using a TLorentzVector class which offers all the needed operations. So we simply need to add “tau_0_p4” and “tau_1_p4” variables and then call “M()” method on the result. However, since these variables are of type “pointer to TLorentzVector” (i.e. TLorentzVector*) they have to be dereferenced first, i.e. we have to go from pointer to the value. This is in c++ done using the “*” operator:
    *(m_data->tau_0_p4) removes pointer from the first lepton 4-momentum variable
    *(m_data->tau_1_p4) removes pointer from the first lepton 4-momentum variable
    Then they can be added and “M()” called:

     (*(m_data->tau_0_p4) + *(m_data->tau_1_p4)).M()
  • The same thing is performed in the “fillPlots” method. This is a bit sloppy, it would have been better to calculate the value once and then keep it as an attribute of the class to be used again later, but I was lazy :-)

Now we have everything needed in c++. Do not forget to add this class into the Makefile and Linkdef.h files and recompile the c++ code:

Finally, we add the cut into the Python configuration class. Modify “Algorithms.py”:

from Algorithm import Algorithm
from ROOT import TH1D

# -----------------------------------
class AlgDefault(Algorithm):
 """ Default set of histograms
 """
 def __init__(self, name = "AlgDefault"):
  # call inherited constructor
  Algorithm.__init__(self, name)

  # create histograms
  self.alg.h_ditau_m = TH1D("ditau_m", ";#it{m}_{#tau#tau} [GeV];Events", 30, 0, 300)
  self.alg.h_ditau_visM = TH1D("ditau_visM", ";#it{m}_{ll} [GeV];Events", 30, 0, 150)

# -----------------------------------
class AlgSF(AlgDefault):
 """ Select only same-flavor leptons
 """
 def __init__(self, name = "AlgSF"):
  # call inherited constructor
  AlgDefault.__init__(self, name)

  # apply selection for the same-flavor lepton combination
  self.alg.cut_selectSF = True
 
  # only select events with the invatiant mass 
  # of the visible tau decay products up to 75 GeV
  self.alg.cut_maxVisMass.set(75.)

# -----------------------------------
class AlgDF(AlgDefault):
 """ Select only different-flavor leptons
 """
 def __init__(self, name = "AlgDF"):
  # call inherited constructor
  AlgDefault.__init__(self, name)

  # apply selection for the different-flavor lepton combination
  self.alg.cut_selectDF = True
  • We have added the cut into the “AlgSF” class and set it to 75.
  • Although this implementation is meant as an exercise, there is a physics reason for implementing this cut in this way. An important background to the “H->tautau->ll” process comes from the leptonic decays of the Z boson “Z -> ll”. This decay does not have neutrinos, so when you calculate the invariant mass in this case you will get a number close to the nominal mass of Z boson, 90 GeV. Therefore, removing events with visible mass above 75GeV suppresses greatly this type of background. Since Z boson can only decay into leptons of the same flavour, ee or μμ, it only makes sense to apply this cut in the same-flavour algorithm.

Now we can run the code and check the result. When we plot the “ditau_visM” histogram from AlgDefault (where cut was not applied) we see the full mass spectrum:

Screenshot 2021-07-15 at 16.52.52

However, the plot from “AlgSF” has the upper tail above 75 GeV cut off as we expect:

Screenshot 2021-07-15 at 16.53.31

Next section →