Root IO for Java
Using JAS to Analyze Root Data
Contents
|
|
|
|
|
Root IO for Java |
|
Goals and Motivation |
|
Using the Package |
|
Demo apps |
|
Accessing Root Files from your own Java
programs |
|
Possible uses |
|
Using JAS to analyze Root Data |
|
Future enhancements to JAS |
|
Root IO Implementation |
|
Current Implementation |
|
Limitations of current implementation |
|
Future Plans for ROOT IO |
|
Conclusions |
|
|
Root IO
|
|
|
A Pure Java Implementation |
|
Part of the FreeHEP Java Library |
Goals
|
|
|
|
|
|
|
Provide a pure-Java package for reading
Root files. |
|
Could be extended to writing later |
|
Should work with any Root file |
|
Should not need to know about objects
ahead of reading (no need for DLLs or .so files) |
|
Provides access to data from Root file,
not methods of C++ objects stored in file. |
|
Suitable for data-centric objects |
|
Mini DSTs, NTuples, Raw Data |
|
User can provide own implementation of
object |
|
Easy to use |
|
Efficient (at least in later
implementations) |
|
Goal is to be as efficient as native
Root IO |
Why Bother? Root Already Exists
|
|
|
|
|
|
Philosophical Reasons |
|
If we are committing a large amount of
HEP data to Root, it is good to know it can be read back even without Root |
|
Java package is currently only 5000
lines of code. |
|
We need it for JAS, Wired etc. |
|
Calling C++ or Fortran code through an interface
to Java is a bigger overhead (cf. PAW, StdHEP). |
|
Brings back all the problems we got away
from by using Java |
|
Porting issues (E.g. MacOS) |
|
Crashes |
|
Java Applets etc. |
|
Security considerations may not allow
native code |
|
|
Demo: Root Object Browser
Demo: Root Histogram Browser
Interface Builder
|
|
|
|
Package can read Root file with no
a-priori knowledge of contents |
|
Great for systems which use scripting
or reflection to get information about objects read. |
|
If you want to compile code against
user-defined objects in a Root file, you must use InterfaceBuilder |
|
java hep.io.root.util.InterfaceBuilder
<rootfile> |
|
Builds Java interfaces for all
user-defined objects found in file |
Example of Generated
Interface
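The example from the original slide is not reproduced in this text. As a stand-in, the following is a minimal sketch of the idea, assuming a hypothetical user-defined class `Track` with data members `fPx` and `fCharge`. The `Track` interface, the `asInterface` helper, and the rule that maps `getPx()` to the member `fPx` are all assumptions made for illustration; the real InterfaceBuilder output may look different. The sketch shows how code can be compiled against a typed interface while the data itself sits in a generic name-to-value store.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class GeneratedInterfaceSketch {
    /** Hypothetical interface of the kind a generator could emit for a user class "Track". */
    public interface Track {
        double getPx();
        int getCharge();
    }

    /** Wraps a name-to-value map so that it answers calls made through the given interface. */
    public static <T> T asInterface(Class<T> type, Map<String, Object> members) {
        InvocationHandler handler = (proxy, method, args) -> {
            // toString, hashCode and equals are answered by the underlying map.
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(members, args);
            }
            String name = method.getName();
            if (name.startsWith("get") && (args == null || args.length == 0)) {
                // Assumed naming rule: getPx() reads the member "fPx".
                String member = "f" + name.substring(3);
                if (!members.containsKey(member)) {
                    throw new IllegalStateException("No member " + member);
                }
                return members.get(member);
            }
            throw new UnsupportedOperationException(name);
        };
        Object proxy = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(proxy);
    }

    public static void main(String[] args) {
        Map<String, Object> members = new HashMap<>();
        members.put("fPx", 1.5);
        members.put("fCharge", -1);
        Track track = asInterface(Track.class, members);
        System.out.println(track.getPx() + " " + track.getCharge());
    }
}
```

Analysis code written against `Track` does not need to know whether a map, a hand-written class, or something generated at run time sits behind the interface.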
Example of Reading Root File
in Java
Possible Uses
|
|
|
|
Java based online monitoring |
|
Java based Event Display |
|
e.g. WIRED, an experiment-independent
event display toolkit written in Java |
|
Web based histogram browser |
|
Applet based (Java runs in browser) |
|
Servlet based (Java runs in server) |
|
Java based data analysis (e.g. JAS) |
|
Script based data analysis |
|
Jython, BeanShell, DynamicJava, etc. |
Using JAS to Analyze Root
Files
|
|
|
|
|
Brief Overview of JAS |
|
Why use JAS for Root Analysis |
|
Analyzing Root file with JAS |
|
Demo |
|
Plugin Manager: download Root
extensions |
|
Opening a Root File |
|
Using the Object Browser |
|
Creating and Filling Histograms |
Introduction to JAS
|
|
|
|
Pure Java Analysis Environment |
|
Data Format Independent |
|
Modular/Extensible via Plugins/Data
Interface Modules |
|
Rich, easy-to-use GUI |
|
Built in editor/compiler for writing
analysis code |
|
Local and Client-Server Operation |
|
Originally targeted at offline analysis
but also used extensively for online monitoring |
|
Written entirely in Java |
JAS GUI
JAS Plotter
JAS Editor/Compiler
Extensible via Plugins
|
|
|
|
Plugins can: |
|
Define experiment specific utilities
(event display, analysis utilities, specialized tables). |
|
Define data interfaces to handle new
types of data. |
|
Define new plotting routines (e.g. for
specialized displays). |
|
Add menus, create control areas,
consoles, and output pages. |
|
Plugins will be more flexible in JAS
3.0 (see discussion of FreeHEP application framework, later). |
|
|
Examples of Plugins
JAS+Wired
Data Format Independent
|
|
|
|
|
Unlike |
|
Root (requires Root files) |
|
PAW (requires PAW files) |
|
JAS is data format independent |
|
Special type of Plugin, a Data
Interface Module (DIM), reads data and makes it available for analysis in JAS |
|
DIMs exist for |
|
PAW, StdHEP, FlatFiles, SQL database
(JDBC), Objectivity (HepTuples), Root |
|
Several experiment specific data
formats |
|
You can write your own DIM for your
data format |
|
|
Remote Data Access
|
|
|
|
Rather than transporting petabytes of
data to the physicist |
|
Transport the physics analysis code to
the data |
|
Transparently - so that it feels just
like local data access |
|
Just ship histogram contents back to
the physicist's desktop (on demand) |
|
Allows remote analysis with modest
network bandwidth |
|
Allows user to feel as if using local
machine even when accessing remote data. |
Why use JAS for Root
Analysis?
|
|
|
|
|
|
Root already has great analysis tools! |
|
Why use JAS? |
|
If you (and your users) are 100% happy
with Root |
|
No reason to change or try alternatives |
|
Java is a good alternative to C++ |
|
Java is simpler to learn and use than
C++ |
|
Not everyone who wants to do data
analysis is a C++ guru or wants to become one |
|
The robustness of a scripting language |
|
Impossible to crash the program using Java
(or Python etc.) |
|
The performance of a compiled language |
|
JAS is still newer than Root, but more
plugins are coming |
Using JAS for Root Analysis
|
|
|
|
Demo |
|
See writeup at |
|
http://java.freehep.org/lib/freehep/doc/root/rootjas.shtml |
JAS Plans
|
|
|
|
|
|
Current release is 2.2.4 |
|
Expect to continue to release 2.2.5 etc
with incremental improvements. |
|
More plugins coming: |
|
Neural network plugin |
|
Multivariate analysis |
|
AIDA: Abstract Interfaces for Data
Analysis |
|
Also working on JAS 3 |
|
Larger overhaul of JAS
architecture/functionality |
|
Scripting support (Jython?) |
|
AIDA histogramming/ntuples/etc. |
|
Use FreeHEP application framework |
|
JAS and WIRED will be plugins into
the framework. |
|
NTuple explorer |
Root IO In Java:
Implementation
|
|
|
Current Implementation |
|
Limitations of Current Implementation |
|
Future Plans for Root IO |
Methodology
|
|
|
Very little documentation exists on
Root internals. |
|
Creating the IO package involves reading
Root code and reverse engineering |
|
Many features: a lot of trial and
error, need lots of test files |
|
Dual-track approach |
Anatomy of a Root File
|
|
|
|
|
|
|
Root File is a Random Access Object
Store |
|
Objects in file can be looked up by
Key |
|
Key is a String. |
|
Each key can correspond to a hierarchy
of linked objects |
|
TTree objects are special |
|
Can contain multiple branches |
|
Each branch contains |
|
More branches |
|
A set of objects (e.g. Events, Tracks
etc). |
|
TTree objects provide random access to
events, and allow reading only a subset of branches for efficiency. |
Anatomy of a Root File
|
|
|
|
|
Starting with Root 3.0 each file
contains a special key StreamerInfo |
|
Contains a collection of TStreamerInfo
objects, which contain information on the data members of all objects in the file.
Allows: |
|
Reading root files without the original
code |
|
Reading root files with older versions
of objects (schema evolution) |
|
Root files are now self describing |
|
This allows Java program to read files
without accessing compiled C++ code. |
Implementation
|
|
|
|
|
|
RootFileReader is used to open file. |
|
Understands how to find Keys and
Streamer info in file |
|
As objects are read from file |
|
Delegate to RootClassFactory to create
objects |
|
Normally use DefaultClassFactory |
|
Can be user provided (or extended) |
|
Each object is responsible for reading its
own data |
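The delegation described above can be sketched as follows. This is only an illustration of the pattern (a factory creates an empty object, then the object reads its own fields from the stream), not the actual hep.io.root code. The `SelfReading` interface, the `Point` class, the `FACTORY` map and the two-int payload are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class SelfReadingSketch {
    /** Anything the reader can materialise: the object pulls its own fields from the stream. */
    public interface SelfReading {
        void read(DataInput in) throws IOException;
    }

    /** Hypothetical stored object with two int members. */
    public static final class Point implements SelfReading {
        public int x;
        public int y;

        @Override
        public void read(DataInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    /** Stand-in for a class factory: stored class name -> way of creating an empty object. */
    static final Map<String, Supplier<SelfReading>> FACTORY = new HashMap<>();
    static {
        FACTORY.put("Point", Point::new);
    }

    public static SelfReading readObject(String className, DataInput in) throws IOException {
        Supplier<SelfReading> creator = FACTORY.get(className);
        if (creator == null) {
            throw new IOException("No factory entry for " + className);
        }
        SelfReading obj = creator.get(); // the factory creates the object
        obj.read(in);                    // the object reads its own data
        return obj;
    }

    /** Builds a small in-memory payload so the sketch runs without a real file. */
    public static byte[] encodePoint(int x, int y) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(x);
            out.writeInt(y);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(encodePoint(3, 4)));
        Point p = (Point) readObject("Point", in);
        System.out.println(p.x + ", " + p.y);
    }
}
```

Replacing or extending the factory map is the hook that corresponds to a user-provided class factory.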
RootClassFactory
Representations and
Interfaces
|
|
|
|
|
Representations are the internal
representation of the Root objects created by the RootClassFactory |
|
GenericRootObject is current
Representation |
|
Uses a Hashtable to store data: quite
inefficient |
|
Easy to debug and fix bugs, add new
functionality. |
|
Different objects are created depending
on how object is stored in file |
|
Objects stored in TTrees typically
create hollow objects |
|
No data is read from file until it is
requested by user |
|
Hence no need to say up-front which
branches will be read |
|
|
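The "hollow object" behaviour described above (nothing is read until the user asks for it) can be sketched with a lazily filled cache. This is a simplified illustration under assumed names: `HollowEntry`, the branch names, and the use of `Supplier` as a stand-in for "read this branch from the file" are all hypothetical and are not taken from the package.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class HollowObjectSketch {
    /** A "hollow" entry: it knows how to fetch each branch but fetches nothing until asked. */
    public static final class HollowEntry {
        private final Map<String, Supplier<Object>> readers;
        private final Map<String, Object> cache = new HashMap<>();
        private int reads = 0;

        public HollowEntry(Map<String, Supplier<Object>> readers) {
            this.readers = readers;
        }

        /** Returns the value of a branch, reading it on first access only. */
        public Object get(String branch) {
            if (!cache.containsKey(branch)) {
                Supplier<Object> reader = readers.get(branch);
                if (reader == null) {
                    throw new IllegalArgumentException("Unknown branch: " + branch);
                }
                cache.put(branch, reader.get()); // the simulated file read happens here
                reads++;
            }
            return cache.get(branch);
        }

        /** Number of simulated file reads performed so far. */
        public int reads() {
            return reads;
        }
    }

    public static void main(String[] args) {
        Map<String, Supplier<Object>> readers = new HashMap<>();
        readers.put("nTracks", () -> 12);
        readers.put("vertex", () -> "origin");
        HollowEntry event = new HollowEntry(readers);
        System.out.println("reads before access: " + event.reads());
        System.out.println("nTracks = " + event.get("nTracks"));
        System.out.println("reads after access: " + event.reads());
    }
}
```

Because only the requested branch is ever read, the caller never has to declare in advance which branches it needs.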
Root Class Factory
|
|
|
|
The DefaultClassFactory looks in the
following places to create classes: |
|
For a specific Java class in the
package hep.io.root.reps (a SpecificRootObject). |
|
StreamerInfo in the file being read
used to create a GenericRootObject |
|
Streamer info in the bootstrap file
StreamerInfo.properties |
|
Info in the file typedef.properties,
used to define the Java mapping for Int_t etc. |
|
|
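The lookup just listed can be pictured as a chain of fallbacks plus a separate typedef table. The sketch below assumes the sources are tried in the order given, which the slide does not state explicitly. The class `FactoryLookupSketch`, its sets of class names, and the returned description strings are hypothetical stand-ins rather than the DefaultClassFactory code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FactoryLookupSketch {
    // All three sets are made-up stand-ins for the real sources of class information.
    private final Set<String> specificClasses;
    private final Set<String> fileStreamerInfo;
    private final Set<String> bootstrapStreamerInfo;
    private final Map<String, Class<?>> typedefs = new HashMap<>();

    public FactoryLookupSketch(Set<String> specific, Set<String> fromFile, Set<String> bootstrap) {
        this.specificClasses = specific;
        this.fileStreamerInfo = fromFile;
        this.bootstrapStreamerInfo = bootstrap;
        // Typedef table: Root basic type name -> Java primitive type.
        typedefs.put("Int_t", int.class);
        typedefs.put("Double_t", double.class);
    }

    /** Says which source would be used to create an object of the given Root class. */
    public String resolve(String rootClassName) {
        if (specificClasses.contains(rootClassName)) {
            return "specific Java class";
        }
        if (fileStreamerInfo.contains(rootClassName)) {
            return "generic object from file StreamerInfo";
        }
        if (bootstrapStreamerInfo.contains(rootClassName)) {
            return "generic object from bootstrap StreamerInfo";
        }
        throw new IllegalArgumentException("Cannot create " + rootClassName);
    }

    /** Looks up the Java type used for a Root typedef such as Int_t. */
    public Class<?> javaTypeFor(String rootTypedef) {
        Class<?> type = typedefs.get(rootTypedef);
        if (type == null) {
            throw new IllegalArgumentException("Unknown typedef " + rootTypedef);
        }
        return type;
    }

    public static FactoryLookupSketch example() {
        return new FactoryLookupSketch(
            new HashSet<>(Arrays.asList("TH1F")),
            new HashSet<>(Arrays.asList("Track")),
            new HashSet<>(Arrays.asList("TObject", "TNamed")));
    }

    public static void main(String[] args) {
        FactoryLookupSketch factory = example();
        System.out.println("TH1F  -> " + factory.resolve("TH1F"));
        System.out.println("Track -> " + factory.resolve("Track"));
        System.out.println("Int_t -> " + factory.javaTypeFor("Int_t"));
    }
}
```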
Status/Limitations
|
|
|
|
|
Currently only supports Root 3.0 or
later |
|
Could support earlier files too, but is
it worth it? |
|
User-defined objects are supported so
long as they have StreamerInfo |
|
Small problem with TTree in Root 3.01,
fix coming soon. |
|
Aims to support all Root files,
including compression, splits, etc. |
|
No support yet for |
|
Chaining files, TTrees split across
files, friend TTrees. |
|
Performance |
|
Adequate for testing, event displays,
small datasets |
|
Analysis of large datasets will require
more efficient implementation of representations |
|
Need more test cases; it is much easier to
debug and add new functionality now rather than later. |
Future Plans for Root IO
|
|
|
|
|
Dynamically build representations |
|
StreamerInfo → Java bytecode → machine code |
|
Different objects depending on: |
|
How object was stored |
|
Version of object in file (schema
evolution) |
|
Expect to have this ready
October/November |
|
Use java.nio package in Java 1.4 (due
end of year) |
|
Provides more efficient IO for large
binary files |
|
Provides support for memory mapped IO |
|
Expect to get very good performance |
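As a rough illustration of what memory-mapped IO with java.nio looks like, the sketch below maps a small binary file and reads an int at a given byte offset without copying the file through a stream. It does not read Root data: the file contents, `readIntAt` and `writeInts` are made up for the example. It also uses conveniences from later Java versions (`java.nio.file`, try-with-resources) that were not part of Java 1.4.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedReadSketch {
    /** Maps the whole file read-only and returns the big-endian int stored at the byte offset. */
    public static int readIntAt(Path file, int offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            buffer.order(ByteOrder.BIG_ENDIAN); // already the default, stated explicitly
            return buffer.getInt(offset);       // absolute read, no seek needed
        }
    }

    /** Writes the given ints back to back (big-endian) so there is something to map. */
    public static void writeInts(Path file, int... values) throws IOException {
        ByteBuffer bytes = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int value : values) {
            bytes.putInt(value);
        }
        Files.write(file, bytes.array());
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        file.toFile().deleteOnExit();
        writeInts(file, 10, 20, 30);
        System.out.println("int at offset 4: " + readIntAt(file, 4));
    }
}
```

Random access by offset is the property that matters for an object store addressed by keys: the reader can jump straight to a record instead of streaming through the file.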
Common Reflection API for
C++?
|
|
|
|
|
One advantage Java has over C++ is
built-in reflection for all classes |
|
Given pointer to object can find out: |
|
What class of object it is |
|
All methods, members, constructors. |
|
Access member values, call methods and
constructors |
|
Recent Analysis Tools Meeting at CERN
attended by: |
|
Rene+Fons (Root), myself, Andreas
Pfeiffer (Anaphe), Lassi Tuura (Iguana), Guy Barrand (OnX) |
|
Identified common reflection API for
C++ as a possible collaborative project |
|
If this existed, and was adopted by
Root, would make access from Java to Root files and in-memory objects much
easier. |
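For readers less familiar with Java, the built-in reflection mentioned at the top of this slide looks roughly like this. The `Track` class and the `describe` helper are hypothetical and exist only for the illustration; the reflection calls themselves (`getClass`, `getFields`, `getMethod`, `invoke`) are standard `java.lang.reflect` API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class ReflectionSketch {
    /** Hypothetical data class used only for this illustration. */
    public static class Track {
        public double px = 1.5;

        public double momentum() {
            return Math.abs(px);
        }
    }

    /** Given any object, reports its class, its public fields, and the result of one method. */
    public static String describe(Object obj, String methodName) throws ReflectiveOperationException {
        Class<?> type = obj.getClass();                  // what class of object is it?
        StringBuilder out = new StringBuilder(type.getSimpleName());
        for (Field field : type.getFields()) {           // public members and their values
            out.append(" ").append(field.getName()).append("=").append(field.get(obj));
        }
        Method method = type.getMethod(methodName);      // look up a no-argument method by name
        out.append(" ").append(methodName).append("()=").append(method.invoke(obj));
        return out.toString();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(describe(new Track(), "momentum"));
    }
}
```

Nothing in `describe` is compiled against `Track`; this is the capability that a common reflection API would have to supply on the C++ side.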
FreeHEP Java Library
|
|
|
|
|
|
Root IO is just one component of the
open-source FreeHEP Java library. |
|
Non-HEP specific |
|
Application Framework: base for JAS 3
and Wired 2 |
|
JACO: Java Access to C++ Objects |
|
2D Vector Graphics: generates .eps, .svg, .pdf |
|
(E)PS viewer |
|
HEP specific |
|
hep.physics package |
|
3-vector, 4-vectors and utilities |
|
Jet Finding, Event Shape routines |
|
Generator Framework, Diagnostic Event
Generator |
|
hep.io: STDHEP, Root |
|
hep.aida: reference implementation of
AIDA classes |
|
Yappi: XML Particle Property Database |
|
HEP3D: some Java 3D utilities, 3D
Plotting, Geant4 shapes |
|
Check it out: http://java.freehep.org |
Conclusions
|
|
|
|
|
Java IO for Root exists as part of the
FreeHEP Java library |
|
Currently suitable for many tasks |
|
Event Display, Object Browser,
Histogram Browser, Web access to histograms |
|
JAS plugin makes analysis of Root files
possible |
|
Suitable for evaluation and analysis of
small data samples. |
|
Needs high performance Root IO for
large data volumes |
|
Much higher performance version of Root
IO coming before end of year |
|
Want feedback on what features are most
needed to make this useful. |
Links
|
|
|
|
Root IO package (hep.io.root) |
|
http://java.freehep.org/lib/freehep/doc/root/ |
|
JAS |
|
http://jas.freehep.org/ |
|
http://java.freehep.org/lib/freehep/doc/root/rootjas.shtml |
|
FreeHEP |
|
|