Code & Data

(see Disclaimer)

This document contains a description of the iniData class, a description of NLD digitization, a brief description of the GHEISHA package in GISMO, and HEP data formats in our NLD project.  I also include some comments concerning the general state of the project and some tricks we've learned along the way over the past 8 months.

iniData

The functionality of this class actually involves two classes: Specs and iniData.  The business end of iniData parses a user defined resource file which must contain certain running specifications for the NLD application of GISMO. 

Specs is a storage class representing the abstraction of user-defined data.  Each Spec is added to an STL container of type list.  So each iniData::list contains n Specs objects whose member data (m_qualifier, m_mat, m_field) are assigned user-defined values.  Specs::m_qualifier is used as an anchor when parsing the file and is expected by the code (see iniData::ProcessFile()).  For Specs::m_mat, iniData::ProcessFile() distinguishes between sensitive and radiative materials by switching the bool Specs::m_sense from false to true, a distinction used in the calorimeter subsystems.

iniData objects store a list of Specs and expect to be themselves contained by the objects that call them (see Upgrade).  iniData::m_name is a reserved string variable that iniData::ProcessFile() expects to "head up" certain data field entries. 
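As a rough sketch of the relationship (only m_qualifier, m_mat, m_field, m_sense, m_name, and AddSpec() are named in this document; everything else below is an assumption for illustration):

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical sketch of the Specs/iniData relationship described above.
// Only m_qualifier, m_mat, m_field, m_sense, m_name, and AddSpec() come
// from the text; the rest is assumed.
struct Specs {
    std::string m_qualifier;  // anchor used when parsing the file
    std::string m_mat;        // material name
    float       m_field;      // user-defined value
    bool        m_sense;      // true for sensitive (vs radiative) material
};

class iniData {
public:
    void AddSpec(const Specs& s) { m_specs.push_back(s); }
    std::size_t size() const { return m_specs.size(); }
private:
    std::string      m_name;   // expected to "head up" data field entries
    std::list<Specs> m_specs;  // n Specs objects, in user-defined order
};
```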

constructors:

- the default constructor gives access to all interface functions, which doesn't really mean much without having processed a file (the state is the default!).

- iniData(string SSname, int SStype =0) is essentially a call to ProcessFile(), setting iniData::m_name along the way.  SStype is cast to an enumerated type which the code switches on when seeking data of a certain format.  Consider iniData mydata("EM_Barrel").

Since SStype= 0, iniData will process the file as SPECS, "a Common Acquisition" (see iniData::CommonAcq()).  Provided iniData finds that header, it will expect file entries of the form:

[m_qualifier] = [m_field] (1)

until it encounters the end.
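Purely as an illustration, here is a minimal parse loop in the spirit of ProcessFile() for entries of form (1); the entry names in the usage example are invented, and the real implementation lives in iniData.cxx:

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Illustration only: read "[m_qualifier] = [m_field]" entries, skipping
// anything else (headers, blank lines). Not the real iniData::ProcessFile().
std::map<std::string, float> parseSpecs(std::istream& in) {
    std::map<std::string, float> specs;
    std::string line;
    while (std::getline(in, line)) {
        std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;      // not a spec entry
        std::string q = line.substr(0, eq);
        q.erase(q.find_last_not_of(' ') + 1);       // trim trailing spaces
        q.erase(0, q.find_first_not_of(' '));       // trim leading spaces
        specs[q] = std::stof(line.substr(eq + 1));  // m_field as a float
    }
    return specs;
}
```

Feeding it a header line followed by two invented entries, e.g. "EM_Barrel", "thickness = 0.5", "radius = 125", yields a map with those two qualifiers.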

interfaces:

- string CalName() const: unused for the most part; just returns iniData::m_name.

- const SpecList element(): returns iniData::SpecList, mostly for the benefit of the CalSystem::BarrelLayers()/EcLayers(). 

- float GetSpec(string qualifier, int region=0): call for a generic spec, which in the ini file has the form (1) above.  region allows the SubSystems to distinguish between Barrel, NEC, and SEC.

- int phi_segmentation(int region) (likewise tan_lambda_segmentation(), max_tan_lambda()): essentially a call to a segmentation VECTOR, where region sets the iterator advance() in scope.  These functions are redundant, of course; one might call iniData::GetSpec() directly, but for those reading the CalId classes I thought this might be a little nicer to look at.

ProcessFile() and Find() are worth mentioning in a little more detail:

void ProcessFile(int SStype): the working end of iniData.  The implementation understands 3 formats: SPECS, VECTOR, and MATERIAL; all other TypeSwitches fall through (see iniData.cxx).  iniData::AddSpec() is called here except for VECTOR, where iniData::VectorSplit() is used instead.  It uses getline() to buffer the file stream.  A "line" is whatever is separated by escape sequences (\r, \n, etc.).

list<Specs>::iterator Find(string key): might seem redundant given that set<Specs>::find() could do the same thing, but the CalSystem *Layers() functions depend on the default (insertion) ordering of a list for their recursive build of a radiator/sense cell.  <set> uses the functional less<Specs> to order its elements, whereas it is the user-defined order that matters here.  A set is probably a better choice of container, but I couldn't work out an appropriate predicate, so there you have it.
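The point in miniature, why a list and not a set:

```cpp
#include <list>
#include <set>

// A list keeps the user-defined (insertion) order that the recursive
// *Layers() build depends on, while a set silently re-orders its
// elements through less<>.
inline int firstOfList() {
    std::list<int> l;
    for (int v : {3, 1, 2}) l.push_back(v);
    return l.front();   // insertion order kept: still 3
}

inline int firstOfSet() {
    std::set<int> s;
    for (int v : {3, 1, 2}) s.insert(v);
    return *s.begin();  // ordered by less<int>: now 1
}
```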

exception handling: I tried to incorporate some friendly error conditions, but the current version of g++ does not support them.  I tried the -fhandle_exception option and was able to compile try and catch blocks, but ld failed and the project wouldn't build.  I left the error handling commented out to illustrate where exceptions should occur, to ensure proper acquisition and correct data object states.

I've also left Errors.h/cxx in both NT and AIX projects, which define the Fail and Warn classes.  They are trivial at the moment but could become very sophisticated and informative to the user.  On NT, they could replace the static Arve level functions FAIL() and WARN() and mimic their functionality exactly.

Digitization

Digitization is the mechanism used in GISMO to store the energy lost in each tower of a calorimeter.  This process involves three modules (primarily): nld, gismo, and geometry.  There is a lot more overhead involved in making things go, but I'm going to concentrate on these three.  Before I go on, let's agree that when I mention any of the *Cal* classes I am referring to either EM or HAD or both, since they are clones.  So when I say "EMCALID does so-and-so" I am referring to both EMCALID and HADCALID.  I'll do my best to use *CAL*, but if I goof up and use one or the other, be assured that EM goes as HAD (or vice versa) for all intents and purposes.

Might as well clean up a few more terms right now as well.  When I say layer I mean a radiator-sense-radiator cone/cylinder (endcap/barrel) sandwich as defined recursively in the *CAL*::*Layers() methods.  A tower is created by segmenting a layer and is sort of like a 3-D cube sandwich.  Imagine a segmented layer as graph paper piled radiator-sense-radiator high, then wrapped as a cylinder or a cone.  That's the analogy I use, for what it is worth.  Keep in mind that Digitization takes place only when Specs::sensitive() = true for the current Specs::m_material; everything else is a radiator in this model.

The players in this Digitization game are grouped here in their respective modules:

nld: EMCALSYSTEM, EMCALID, HADCALID, HADCALSYSTEM, CELLARRAY

gismo: MCPARTICLE, GPARTICLE, DETECTOR, DIGITIZERINTERFACE

geometry: DIGITIZER, COORDSYSTEM, GEOMOBJECT, RAY, POINT, VECTOR

where the abstract base classes are: DETECTOR, DIGITIZER, GEOMOBJECT.  I'm going to cover this process in 2 parts: initialization of players and state change mechanism of the map. 

The subset of players involved in initialization is: *CAL*, CELLARRAY, DIGITIZER, DIGITIZERINTERFACE, DETECTOR.  During initialization a signal is issued to Arve from the NLD module to create a new Digitizer, a template class CellArray parameterized by a map<EMCalId, float, less<EMCalId>> and a layer number.  In EMCalSystem::BarrelLayers() the instruction is:

Digitizer* digi = new CellArray<EMCalId, float>(*(this->cells()),layer);

(see EMCalSystem.cxx).  A new DigitizerInterface is then constructed given this digitizer, which establishes the interface (naturally) between the Arve Medium object and the application Event Source.  I'm not going to go into the detail of just what a Medium and Event Source are at this point; I'll only say that a Digitizer (hence a Detector) is stored in the Medium object and accessed at some later date.

If you look at the figure, DigitizerInterface is clearly an agent of the player modules.  The *CalSystem object orchestrates allocation of CellArray objects, Digitizers in the Arve abstraction, and through DigitizerInterface marries digitized Mediums to GISMO detectors.

The important thing to notice is that we have initialized a map that can contain EMCalId objects.  If you consider my conceptual diagram, it looks as though EMCalId will contain all the data from the reconstructed hit (position given segmentation, layers) and act as a key used by the analysis to account for MC particle energy loss.  Position info is then folded into EMCalId::m_id, which is in turn "unpacked" by NTUPLER into a more accessible data structure for analysis.  Note also that the map itself is contained by the CalSystem object and that Medium objects are passed the address of this map.  When called upon to act as a sensitive medium (a Digitizer), the hit is registered in a *CalSystem map.  Until a hit is registered, the map is empty.
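To make the packing idea concrete, here is an illustrative bit layout for an m_id-style identifier; the actual *CalId::m_id encoding lives in the CalId classes and is not reproduced here (the 10-bit fields below are an assumption):

```cpp
#include <cstdint>

// Illustration only: pack a layer number and (phi, tan-lambda) segmentation
// indices into one id and unpack them again. The real *CalId::m_id layout
// is defined in the CalId classes and is NOT reproduced here.
inline std::uint32_t packId(std::uint32_t layer, std::uint32_t iphi,
                            std::uint32_t ilam) {
    return (layer << 20) | (iphi << 10) | ilam;  // 10 bits per field (assumed)
}
inline std::uint32_t layerOf(std::uint32_t id) { return id >> 20; }
inline std::uint32_t phiOf(std::uint32_t id)   { return (id >> 10) & 0x3ffu; }
inline std::uint32_t lamOf(std::uint32_t id)   { return id & 0x3ffu; }
```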

So how is a hit "registered" anyway?  I'm calling this the state change mechanism for our map and have illustrated what I think happens.  The players in this mechanism are: MCPARTICLE, DETECTOR, DIGITIZERINTERFACE, DIGITIZER, CELLARRAY, and *CALID.  Mostly repeat performers; only *CALSYSTEM is dropped, replaced by MCPARTICLE.

An important aspect of this process is how our map is actually updated by the Digitized Medium.  My sketch demonstrates how the Digitizer CellArray does this.  Through CellArray::addHit() an EMCalId object is constructed that contains an m_id data member calculated from the layer number and the user-defined segmentation.  This object becomes an element in the controlled sequence, a process conducted n times, whenever MCParticle requests it.

Take a look at MCParticle::propagate().  Once the Status of the particle is non-ALIVE, a Detector object's score() is called, provided the currentMedium is a Detector.  The DigitizerInterface implements score() for whichever Medium Detector is in scope and calls Digitizer::addHit().  The addHit() is resolved by an NLD CellArray which updates the map as suggested in my sketch.  That's it!  The MCParticle static member data of type Ray provides a check for validating the hit (verifying the dip angle), the float MCParticle::startingEnergy is divided nstep ways, and the resulting ratio is stored along with the appropriate *CalId type.  I think my sketch of this process sums it up quite well; have a look.
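Boiled down, I believe the map update amounts to something like this sketch (key type and names simplified, not the real CellArray):

```cpp
#include <cstdint>
#include <map>

// Stripped-down sketch of the map update behind CellArray::addHit(): the
// starting energy is divided nstep ways and the fraction is accumulated
// under the cell's id. Key type and names are simplified, not the real code.
using HitMap = std::map<std::uint32_t, float>;

inline void addHit(HitMap& hits, std::uint32_t cellId,
                   float startingEnergy, int nstep) {
    hits[cellId] += startingEnergy / nstep;  // map stays empty until a hit
}

// Helper for the test: register nHits identical hits and read the total back.
inline float totalAfter(int nHits, float energy, int nstep) {
    HitMap hits;
    for (int i = 0; i < nHits; ++i) addHit(hits, 42u, energy, nstep);
    return hits[42u];
}
```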

GHEISHA

GISMO contains a Gheisha class and a Monte Carlo package named GHEISHA.  The Gheisha object gives one perspective on a more global process which leads to the interaction of particles in the GISMO package.  The players are: GPARTICLE, MCPARTICLE, GHEISHA, PDATA, and INTERACTOR.  I'm going to focus on GParticle and MCParticle here and sort of gloss over the others, citing Gheisha as an implementation.

You might think of MCParticle as a GParticle slave where GParticle is sort of the brains of the operation.  GParticle initialization consists of a message, sent via MCParticle, identifying a TestBeam particle (constructors which ultimately lead to a PDT::lookUp(), see the code).  Depending on the mode (wingui/motif or console/batch) an event is started and MCParticle::propagate() is used to pass Medium objects to the generator.

This is a good place to stop and specify the relationship between MCParticle and GParticle.  MCParticle implements the GParticle virtuals propagate(), draw(), printOn(), and addChild().  This is grunt work for the most part, and whenever an MCParticle object needs information about just what a particle is, it consults GParticle.  It does this through inherited functions like momentum(), mass(), interacted() and numChildren().  The coupling for GParticle lies in functions like setMomentum(), where MCParticle informs GParticle that something has happened on the Medium side of things (Status is STOPPED, or MCParticle::stepBy() has been called in a propagation).

"..on the Medium side of things.." should be explained in a little more detail.  I see GParticle as the nitty-gritty physics side of a propagating particle.  It contains PData objects, which in turn contain Interactors (like Gheisha and EGS).  GParticle inherits functionality from the Class Library for HEP (the CLHEP module) and actually uses it on behalf of MCParticle if necessary.  MCParticle, on the other hand, contains geometry objects like Ray; it communicates with the Medium object directly, thereby creating an interface with the graphics and nld modules.  MCParticle brings physics (GParticle) to the user-defined world of Material.

Back to the Medium object that has just been passed to MCParticle.  Assume a TestBeam π has been specified for GParticle: say a 5 GeV π+.  MCParticle::propagate() propagates this guy, stepping it by nuclei, calculating energy loss until a GParticle interaction length has been reached or the particle's time has otherwise expired.  Let's say this guy interacts (Status = INTERACTED).  Through the inherited function interacted(), MCParticle asks GParticle what interactor it has designated for the task of calculating shower particles.  Since we have assumed a pion, the Gheisha object (stored by the PDT object as a vector element) will implement Interactor::interact().

Gheisha drives the GHEISHA package via gheish(), which results in 1 or more shower particles.  The original particle would be referred to as the parent at this point, the resulting shower particles as children.  Gheisha informs GParticle of its work calculating the children:

in->addChild(particleType)
        ->setMomentum(Vector(pProd[3*i], pProd[3*i+1], pProd[3*i+2]));

where in = a GParticle*, particleType = GParticle::thePDT()->lookUp(type = PDG particle code), and pProd[] = an array of floats from which the child shower particle's momentum components are taken.  MCParticle::addChild() defers to GParticle::addChild(), which returns a child GParticle object.  The child then sets its momentum from the GHEISHA pProd[] elements.

Each GParticle object is either a parent or a child.  A child object refers to its parent but not to other children.  A parent object contains a list of children (GParticle** children) and a pointer to its own parent.  Children become parents when they are propagated in their turn.  This recursion results in a tree of parent nodes and child leaves, each subtree a shower of its relative root node.
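A sketch of that tree, using a vector where the real GParticle stores GParticle** children (everything here is illustrative, not the GISMO class):

```cpp
#include <vector>

// Sketch of the parent/child shower tree described above. The real
// GParticle stores GParticle** children; a vector keeps this short.
struct Node {
    Node* parent = nullptr;
    std::vector<Node*> children;  // a child refers only to its parent

    Node* addChild() {
        Node* c = new Node;       // never freed in this throwaway sketch
        c->parent = this;
        children.push_back(c);
        return c;
    }
};

// Count every particle in the shower rooted at n (the node plus descendants).
inline int showerSize(const Node* n) {
    int total = 1;
    for (const Node* c : n->children) total += showerSize(c);
    return total;
}

// Build parent -> two children, one of which becomes a parent in its turn.
inline int demoShower() {
    Node root;
    Node* c1 = root.addChild();
    root.addChild();
    c1->addChild();
    return showerSize(&root);  // root, two children, one grandchild
}
```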

Data Status

What is a *.hbook, a PAW Ntuple?  I believe an Ntuple is a repeated data structure, perhaps composed of many nested data structures, one for each HEP event.  What does that mean for GISMO?  Have a look at ntuple.f; there you will see the keyword COMMON.  What is COMMON /NTCOM/?  Various integers and real numbers with several thought-provoking names such as EETOT, NECAL, IHLAYER, etc.  NTCOM is the repeating data structure here, a very simple "set" of integers and floats representing a GISMO reconstructed HEP event.  The *.hbook file NTUPLER creates contains an NTCOM for each event (IEVT = 1, 2, ... n).
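In C++ terms the repeating structure is roughly this (only the member names EETOT, NECAL, and IHLAYER appear above; their types, meanings, and the rest of the layout are guesses):

```cpp
// Rough C++ rendering of the Fortran COMMON /NTCOM/ block: one instance per
// reconstructed event. Only the names EETOT, NECAL, and IHLAYER come from
// ntuple.f; types, meanings, and further members are assumptions.
struct NTCOM {
    float EETOT;    // total EM energy? (assumed meaning)
    int   NECAL;    // number of EM calorimeter cells/hits? (assumed meaning)
    int   IHLAYER;  // hadronic layer index? (assumed meaning)
    // ... the real block holds several more integers and reals
};
```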

So NTUPLER does a series of calculations on the left-hand column of a GISMO *.dat file (*CalId::m_id entries in fact) and creates this Ntuple format which h2root understands.  Rather than go through this process, it has been suggested that we output a *.root file directly from GISMO, thereby taking advantage of ROOT types as the data is being reconstructed.  At this time that is still only a proposal.

We have been working on interfacing GISMO with a full physics event generator.  The StdHep libraries provide routines to convert different Monte Carlo event formats into a PDG standard event format.  The manual for this package is on my user disk (S:/chadc/generators/manuals) and the directory structure of the package is on the NLD site space.  There are conversion routines specific to ISAJET and others (see Sec 10).

State of NLD

On AIX, the Makefile paths are stored in MAKEINFO: header info for all GISMO project Makefiles (libraries and NLD).  As mentioned in the iniData section, Errors.cxx/.o is part of the build even though the classes are never used.  You might want to yank them; I don't know.

There's also a README on ~chadc/gisdev/gismo/nld which outlines how to use make to build 2 different executables: nld_bat and nld_win where nld_bat is batch mode, nld_win Motif.

The order of the command line arguments passed to GISMO, in the NT console project as well as AIX batch, is very important.  It is noted in the code but I'll reiterate here:

nld_bat [user *.ini file name] [particle name] [cos θ] [φ] [# of events] [output file name]

on AIX, where φ is in radians.  Replace nld_bat with conld on NT.  It's important that these don't get mixed up, as unexpected data may result and might be difficult to detect (experience talking here).  When building nld_win, make sure processArgs() has a valid *.ini file path, or I'm not sure what will happen.

There's a README on ~chadc/ntupler which outlines a little protocol of mine to keep NTUPLER versions straight.  We needed to make some minor changes to NTUPLER to convert SLD data and to increase the number of events converted.

Disclaimer: What you will find here is simply my understanding of the topics.  Ignore everything that confuses or contradicts what you find.  I think that I have learned a little bit during my time here, but then again, I may have learned nothing and leave that up to you to decide.


C.Colgur  Last Updated: 01/13/04 13:01