Mēnsūra
Dataset Class Reference

A class to store information about a dataset.

#include <Dataset.hpp>

Classes

struct  File
 Aggregates the path to a file, its cross section, the number of events in the parent dataset, and the mean generator-level weight.
 

Public Types

enum  Type { Data, MC }
 A type to distinguish collision data and simulation.
 
enum  Generator {
  Undefined, Nature, Pythia, MadGraph,
  aMCatNLO, POWHEG, CompHEP, SHERPA
}
 Defines the supported generators for the hard process.
 
enum  ShowerGenerator { Undefined, Nature, Pythia, Herwig }
 Parton shower and hadronization generator.
 
enum  Process {
  Undefined, ppData, pp7TeV, pp8TeV,
  pp13TeV, BSM, tHq, tHqExotic,
  tHqSM, ZPrime, WPrime, ttbar,
  ttInclusive, ttSemilep, ttDilep, ttHad,
  SingleTop, ttchan, tschan, ttWchan,
  ttH, EWK, Wjets, Diboson,
  DrellYan, QCD, Photon
}
 Code describing the physics process represented by the dataset.
 

Public Member Functions

 Dataset () noexcept
 Constructor with no parameters.
 
 Dataset (Type type, std::string sourceDatasetID="")
 Create an empty dataset with the given type and source dataset ID.
 
 Dataset (std::list< Process > &&processCodes, Generator generator=Generator::Undefined, ShowerGenerator showerGenerator=ShowerGenerator::Undefined) noexcept
 Constructor from a list of process codes given as an rvalue.
 
 Dataset (std::list< Process > const &processCodes, Generator generator=Generator::Undefined, ShowerGenerator showerGenerator=ShowerGenerator::Undefined) noexcept
 Constructor from a list of process codes given by const reference.
 
 Dataset (Process process, Generator generator=Generator::Undefined, ShowerGenerator showerGenerator=ShowerGenerator::Undefined) noexcept
 Constructor from a single process code.
 
 Dataset (Dataset const &)=default
 Default copy constructor.
 
 Dataset (Dataset &&) noexcept=default
 Default move constructor.
 
Dataset & operator= (Dataset const &)=default
 Default copy assignment operator.
 
void AddFile (std::string const &path, double xSec, unsigned long nEvents, double meanWeight=1.)
 Adds a new file to the list.
 
void AddFile (std::string const &path)
 A shortcut for the above method, to be used with data.
 
void AddFile (File const &file) noexcept
 Adds a new file to the list.
 
std::list< File > const & GetFiles () const
 Returns the list of the files.
 
std::string const & GetSourceDatasetID () const
 Returns the label that uniquely identifies the source dataset.
 
Generator GetGenerator () const
 Returns the hard-process generator.
 
ShowerGenerator GetShowerGenerator () const
 Returns the parton-shower and hadronization generator.
 
Process GetProcess () const
 Returns the most specialised process code.
 
std::list< Process > const & GetProcessCodes () const
 Returns a list of all assigned process codes.
 
bool TestProcess (Process code) const
 Tests if the given process code is assigned to the dataset.
 
bool IsMC () const
 Checks if the dataset corresponds to simulation rather than real data.
 
Dataset CopyParameters () const
 Creates a clone of this dataset with an empty file list.
 
void SetFlag (std::string const &flagName)
 Sets a flag with the given name.
 
void UnsetFlag (std::string const &flagName)
 Unsets the flag with the given name.
 
bool TestFlag (std::string const &flagName) const
 Tests if the flag with the given name is set.
 

Detailed Description

A class to store information about a dataset.

This class aggregates all basic properties of a dataset. Most notably, it contains a list of input ROOT files together with the information needed to normalize simulated datasets (cross sections, total numbers of processed events, and mean weights). A flag distinguishing experimental data from simulation is also stored.
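
As an illustration of how the per-file normalization inputs might be combined, the sketch below computes a per-event weight as the cross section times the integrated luminosity, divided by the effective number of generated events. The FileInfo struct, the NormalizationWeight function, the units, and the formula itself are assumptions made for this example. They mirror the fields listed for Dataset::File but are not taken from the library.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical stand-in for Dataset::File; the field names are assumptions
struct FileInfo
{
    std::string path;
    double xSec;            // cross section, pb (assumed unit)
    unsigned long nEvents;  // number of events in the parent dataset
    double meanWeight;      // mean generator-level weight
};

// Weight that scales one simulated event to the given integrated luminosity (1/pb)
double NormalizationWeight(FileInfo const &file, double luminosity)
{
    return file.xSec * luminosity / (double(file.nEvents) * file.meanWeight);
}

void ExampleNormalization()
{
    FileInfo const file{"ttbar.root", 800., 1000000ul, 1.};
    double const weight = NormalizationWeight(file, 1000.);  // 800 * 1000 / 1e6
    assert(std::abs(weight - 0.8) < 1e-9);
}
```

Dividing by the mean weight accounts for generators whose event weights do not average to one.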

Each dataset must be assigned an arbitrary label that uniquely identifies the source dataset from which its files were produced. Files within the same dataset, and files in different datasets with identical labels, have the same physics content. If the user does not specify this label, referred to as the source dataset ID, it is deduced automatically from the name of the first input file.

Users can add arbitrary boolean flags to provide additional information about the dataset to custom plugins.

Currently, a number of properties are described with the enumerations Process, Generator, and ShowerGenerator. This functionality is deprecated and will be removed in the future.

Member Enumeration Documentation

enum class Dataset::Process

Code describing the physics process represented by the dataset.

Specialised process codes must always be declared after the corresponding more general codes, so that when the enumeration values are converted to integers, more general categories get smaller numbers.
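
The ordering rule can be checked by converting enumerators to integers. The sketch below uses ProcessSketch, a shortened and hypothetical copy of the enumeration, and a helper IsDeclaredBefore that is not part of the class.

```cpp
#include <cassert>

// Shortened, hypothetical copy of the enumeration; the real one has more codes
enum class ProcessSketch
{
    Undefined,
    ttbar,        // generic category
    ttInclusive,  // specialisations follow the generic code
    ttSemilep
};

// Returns true if code a is declared before code b in the enumeration
constexpr bool IsDeclaredBefore(ProcessSketch a, ProcessSketch b)
{
    return static_cast<int>(a) < static_cast<int>(b);
}

// Compile-time check that the generic code precedes its specialisation
static_assert(IsDeclaredBefore(ProcessSketch::ttbar, ProcessSketch::ttSemilep),
  "generic code must precede its specialisation");
```

A scoped enumeration does not convert to an integer implicitly, hence the explicit static_cast.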

Enumerator
ppData 

Generic category for any real pp collisions.

BSM 

Common category for BSM signals.

tHq 

tHq with any couplings.

tHqExotic 

tHq with kappa_t = -1.

tHqSM 

Standard-Model tHq (kappa_t = +1).

ttbar 

Generic category for any ttbar dataset.

ttInclusive 

Inclusive ttbar, i.e. the dataset contains any decays.

ttSemilep 

Exclusive semileptonic ttbar.

ttHad 

Exclusive hadronic ttbar.

SingleTop 

Generic category to describe any single-top process.

EWK 

Generic category for production of W and Z bosons.

Diboson 

WW, WZ, or ZZ.

Photon 

Prompt-photon production.

Constructor & Destructor Documentation

Dataset::Dataset ( Dataset::Type  type,
std::string  sourceDatasetID = "" 
)

Create an empty dataset with the given type and source dataset ID.

If the source dataset ID is given as an empty string, it will be deduced from the name of the first file added.

Dataset::Dataset ( std::list< Process > &&  processCodes,
Generator  generator = Generator::Undefined,
ShowerGenerator  showerGenerator = ShowerGenerator::Undefined 
) noexcept

Constructor from a list of process codes given as an rvalue.

A dataset can be assigned more than one process code. In this case the codes must be logically compatible (for example, a dataset cannot be both Wjets and DrellYan). If a specialised code is used, the most general code of its category must also be specified. For example, a semileptonic ttbar dataset must also be described as (generic) ttbar.

By default, generator and shower generator are set to Undefined. However, if the process is real data, they are changed to Nature.
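
The behaviour described above can be sketched with a reduced stand-in class. DatasetSketch and the shortened enumerations are hypothetical. The rule used to recognise real data (ppData as the first code) and the unconditional switch to Nature are assumptions about the implementation, and only the hard-process generator is modelled.

```cpp
#include <cassert>
#include <list>
#include <utility>

// Reduced, hypothetical stand-ins for the enumerations of Dataset
enum class Process { Undefined, ppData, pp13TeV, ttbar, ttSemilep };
enum class Generator { Undefined, Nature, POWHEG };

class DatasetSketch
{
public:
    DatasetSketch(std::list<Process> processCodes_,
      Generator generator_ = Generator::Undefined):
        processCodes(std::move(processCodes_)), generator(generator_)
    {
        // Assumption: real data is recognised by ppData being the first code
        if (!processCodes.empty() && processCodes.front() == Process::ppData)
            generator = Generator::Nature;
    }

    Generator GetGenerator() const
    {
        return generator;
    }

private:
    std::list<Process> processCodes;
    Generator generator;
};

void ExampleConstruction()
{
    // A specialised code is accompanied by the generic code of its category
    DatasetSketch const mc(std::list<Process>{Process::ttbar, Process::ttSemilep},
      Generator::POWHEG);
    assert(mc.GetGenerator() == Generator::POWHEG);

    // The generator is left at its default and is replaced by Nature
    DatasetSketch const data(std::list<Process>{Process::ppData, Process::pp13TeV});
    assert(data.GetGenerator() == Generator::Nature);
}
```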

Dataset::Dataset ( std::list< Process > const &  processCodes,
Generator  generator = Generator::Undefined,
ShowerGenerator  showerGenerator = ShowerGenerator::Undefined 
) noexcept

Constructor from a list of process codes given by const reference.

A specialised version used when the list of process codes cannot be given as an rvalue. Refer to the documentation of the first version of the constructor for details.

Dataset::Dataset ( Dataset::Process  process,
Dataset::Generator  generator_ = Generator::Undefined,
Dataset::ShowerGenerator  showerGenerator_ = ShowerGenerator::Undefined 
) noexcept

Constructor from a single process code.

This version is intended for backward compatibility and as syntactic sugar for the case when a dataset is assigned a single process code. Refer to the documentation of the first version of the constructor for details.

Member Function Documentation

void Dataset::AddFile ( std::string const &  path,
double  xSec,
unsigned long  nEvents,
double  meanWeight = 1. 
)

Adds a new file to the list.

The file name part of the given path can contain the wildcards '*' and '?'. Consult the documentation of ExpandPathMask for details.
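
To illustrate what such a mask could mean, the sketch below matches a single file name against a mask, assuming the usual shell-like semantics in which '*' stands for any sequence of characters and '?' for exactly one character. MatchesMask is a hypothetical helper written for this example. It is not ExpandPathMask, whose actual behaviour is described in its own documentation.

```cpp
#include <cassert>
#include <string>

// Returns true if the name matches the mask, where '*' stands for any sequence of
// characters (including an empty one) and '?' stands for exactly one character
bool MatchesMask(std::string const &mask, std::string const &name)
{
    std::size_t m = 0, n = 0;
    std::size_t starPos = std::string::npos, resumePos = 0;

    while (n < name.size())
    {
        if (m < mask.size() && mask[m] == '*')
        {
            // Remember the star and first try to match it to an empty sequence
            starPos = m++;
            resumePos = n;
        }
        else if (m < mask.size() && (mask[m] == '?' || mask[m] == name[n]))
        {
            ++m;
            ++n;
        }
        else if (starPos != std::string::npos)
        {
            // Mismatch: let the last star absorb one more character and retry
            m = starPos + 1;
            n = ++resumePos;
        }
        else
            return false;
    }

    // Only trailing stars may remain in the mask
    while (m < mask.size() && mask[m] == '*')
        ++m;

    return m == mask.size();
}

void ExampleMask()
{
    assert(MatchesMask("ttbar_*.root", "ttbar_1.root"));
    assert(!MatchesMask("data_?.root", "data_12.root"));
}
```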

Dataset::Process Dataset::GetProcess ( ) const

Returns the most specialised process code.

Technically, the last process code is returned because the list is ordered from the most general to the most specialised code.
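
A minimal sketch of this lookup, together with a membership test in the spirit of TestProcess, is given below. The free functions MostSpecialised and HasCode and the shortened enumeration are hypothetical. Returning Undefined for an empty list and using a linear search are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Reduced, hypothetical stand-in for Dataset::Process
enum class Process { Undefined, ttbar, ttSemilep, Wjets };

// The last code in the list; Undefined for an empty list (assumption)
Process MostSpecialised(std::list<Process> const &codes)
{
    return codes.empty() ? Process::Undefined : codes.back();
}

// Linear search for the given code among the assigned ones
bool HasCode(std::list<Process> const &codes, Process code)
{
    return std::find(codes.begin(), codes.end(), code) != codes.end();
}

void ExampleProcessCodes()
{
    std::list<Process> const codes{Process::ttbar, Process::ttSemilep};
    assert(MostSpecialised(codes) == Process::ttSemilep);
    assert(HasCode(codes, Process::ttbar));
    assert(!HasCode(codes, Process::Wjets));
}
```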

bool Dataset::IsMC ( ) const

Checks if the dataset corresponds to simulation rather than real data.

Note that the discrimination is based on the value of the first process code; if that code is undefined, the method returns true.
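
The consequence of this rule is that a dataset with an undefined process is reported as simulation. The sketch below reproduces the rule with a hypothetical free function IsMCSketch. It assumes that real data is recognised by ppData being the first code and that an empty list is treated like an undefined first code.

```cpp
#include <cassert>
#include <list>

// Reduced, hypothetical stand-in for Dataset::Process
enum class Process { Undefined, ppData, pp13TeV, ttbar };

bool IsMCSketch(std::list<Process> const &codes)
{
    // An empty list is treated like an undefined first code (assumption)
    if (codes.empty())
        return true;

    // Only the first code is inspected
    return codes.front() != Process::ppData;
}

void ExampleIsMC()
{
    assert(!IsMCSketch({Process::ppData, Process::pp13TeV}));
    assert(IsMCSketch({Process::ttbar}));
    assert(IsMCSketch({Process::Undefined}));  // undefined counts as simulation
}
```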

void Dataset::SetFlag ( std::string const &  flagName)

Sets a flag with the given name.

If a flag with the given name has already been set, an exception is thrown. Note that flag names are stored internally in an unordered_set and are therefore looked up with the help of std::hash<std::string>, which is not guaranteed to produce different hash values for two strings that a human reader would consider clearly different.

void Dataset::UnsetFlag ( std::string const &  flagName)

Unsets the flag with the given name.

If the flag is not found, the method has no effect.
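
The documented semantics of SetFlag, UnsetFlag, and TestFlag can be sketched with a small stand-in class built on std::unordered_set. FlagStore is hypothetical, and the exception type (std::logic_error) is an assumption, since the reference does not say which exception is thrown.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the flag-related part of Dataset
class FlagStore
{
public:
    void SetFlag(std::string const &flagName)
    {
        // insert() reports through .second whether the name was new
        if (!flags.insert(flagName).second)
            throw std::logic_error("Flag \"" + flagName + "\" is already set.");
    }

    void UnsetFlag(std::string const &flagName)
    {
        flags.erase(flagName);  // erasing a missing key does nothing
    }

    bool TestFlag(std::string const &flagName) const
    {
        return flags.count(flagName) > 0;
    }

private:
    std::unordered_set<std::string> flags;
};

void ExampleFlags()
{
    FlagStore store;
    store.SetFlag("isSignal");
    assert(store.TestFlag("isSignal"));

    store.UnsetFlag("notSet");  // no effect, no exception
    store.UnsetFlag("isSignal");
    assert(!store.TestFlag("isSignal"));
}
```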

