OO Design and Implementation
Java and Java Analysis Studio

Overview
Lecture 1
Overview of Java language, the Java Virtual Machine and associated libraries
How does Java differ from C++
OO Techniques in Java (compared to C++)
Java Performance and Performance tips
Some Code examples
Not a Java Tutorial – intended to whet your appetite
Lecture 2
Java Analysis Studio (JAS) – emphasize how features of Java introduced in first lecture are used in JAS.

Lecture 1 – Part 1
Introduction to Java

Introduction to Java
History
Originally designed (1991) as a small language for consumer electronics (cable boxes, toasters etc.)
Eventually someone wrote HotJava Browser, which could run Java Applets
Adopted by Netscape, Microsoft, Sun etc. as Web Programming Language.
More than just a Web Tool
Java is a fully functional, platform independent, programming language
Powerful set of machine independent libraries, including windowing (GUI) library.

Java Today
Today Java is used for:
Business Applications
E-commerce and web server applications
Graphical and GUI applications
Scientific, Financial and Engineering Applications
Web Applets
Java is now the dominant language in e-commerce and Web Server Applications
Java servlets, JavaServer Pages (JSP)
Enterprise JavaBeans (EJB)
Demand for Java programmers now exceeds demand for C++ programmers

Major Features of Java

Java libraries
Java has a large set of standard libraries
These greatly aid program development and code reusability.
Often define interfaces for which multiple implementations exist
Standard Libraries (some examples)
Collections framework
AWT + Swing – cross-platform GUI
JDBC – interface to SQL databases
Sockets, URL/HTTP, Remote Method Invocation (RMI), CORBA – networking support
Java 2D and Java 3D – advanced graphics
Other libraries (some examples)
XML, ODMG (object databases), Numerical.

Java versus C++
Both OO languages
syntax very similar (identical where appropriate)
C++ goals:
OO
Backward Compatibility (with C)
Efficiency
User has control over memory allocation (stack vs. heap) and object deletion
Possible to make program efficient, but also possible to shoot oneself (and one’s collaborators) in the foot.
Java’s goals:
OO
Platform Independence
For example - int has 32 bits on all platforms
Reliability
Runtime system (not the programmer) takes care of memory allocation/deallocation
As many errors as possible detected at compile time
Extensive run-time checks (array bounds, casting)

Java vs. C++ - Specifics
No pointers(!)
 Instead Java uses object “references”
very similar to pointers except
no pointer arithmetic
no * or & or -> operator ( no ** either)
All Objects created with the new operator.
No explicit delete operator.
Objects are freed when they are no longer reachable (via a process known as garbage collection).
No global variables or functions
Everything must be part of a class.
No operator overloading.

Java vs.  C++
Java classes are defined in a single file
No separation between .cpp and .hh files
Java classes are grouped into packages
“package” protection in addition to public, private, protected
No pre-processor
No need for #include
No need for #ifdef WIN32 etc
No Templates
All classes extend Object
Collections can contain any Object
Templates may be added to a future version of Java

A Sample Program
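A minimal sketch of a small Java program that illustrates several of the points above: everything lives inside a class, objects are created with new and handled through references, array bounds are checked at run time, and there is no delete. The names SampleProgram and Counter are illustrative assumptions, not taken from the original slide.

```java
// Hypothetical example; class and method names are illustrative only.
class SampleProgram {
    // A small class with one member variable; there are no global variables.
    static class Counter {
        private int count;              // int is 32 bits on every platform
        void increment() { count++; }
        int getCount() { return count; }
    }

    public static void main(String[] args) {
        Counter a = new Counter();      // objects are always created with new
        Counter b = a;                  // b is a second reference to the same object
        a.increment();
        b.increment();
        System.out.println("count = " + a.getCount());

        int[] values = new int[3];
        try {
            values[3] = 1;              // out of bounds: caught by a run-time check
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: array index out of bounds");
        }
        // No delete: the Counter and the array are reclaimed by the garbage
        // collector once they are no longer reachable.
    }
}
```

Because a and b refer to the same object, both increments act on one Counter.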

Lecture 1 - Part 2
OO Techniques in Java

OO Techniques in Java
The fundamentals of OO programming in Java are identical to C++
See Makoto’s talks at this School.
We will only discuss differences between Java and C++.
Java Interfaces and Adapters
Inner Classes and Anonymous Inner Classes
Dynamic class loading
Reflection
Threads

Java Interfaces
In C++ there is one type of object
Declared with the class statement
In Java there are two types of objects
Interfaces
Only defines public methods – no implementation
Supports multiple inheritance
Classes
Very similar to C++ classes, may define member variables, methods and their implementations.
Support only single (not multiple) inheritance from other classes, but may implement any number of interfaces.
Interfaces are equivalent to C++ classes containing only “pure virtual functions”
Emphasis in Java reflects their importance in good OO design by separating interface and implementation

Simple Interface Example
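A minimal sketch of an interface and one implementation, using the FourVector and SimpleFourVector names that appear later in this lecture. The accessor names px(), py(), pz(), e() and the Kinematics helper are assumptions made for illustration; the original slide may have used different ones.

```java
// Hypothetical sketch: method names and the Kinematics helper are assumptions.
class InterfaceDemo {
    public static void main(String[] args) {
        FourVector v = new SimpleFourVector(3.0, 4.0, 0.0, 13.0);
        System.out.println("mass = " + Kinematics.mass(v));
    }
}

// The interface declares public methods only; it has no implementation.
interface FourVector {
    double px();
    double py();
    double pz();
    double e();
}

// One possible implementation that simply stores the four components.
class SimpleFourVector implements FourVector {
    private final double px, py, pz, e;

    SimpleFourVector(double px, double py, double pz, double e) {
        this.px = px;
        this.py = py;
        this.pz = pz;
        this.e = e;
    }

    public double px() { return px; }
    public double py() { return py; }
    public double pz() { return pz; }
    public double e()  { return e; }
}

// Client code depends only on the interface, never on a concrete class.
class Kinematics {
    static double mass(FourVector v) {
        double m2 = v.e() * v.e()
                  - v.px() * v.px() - v.py() * v.py() - v.pz() * v.pz();
        return Math.sqrt(Math.max(m2, 0.0));
    }
}
```

Any other class that implements FourVector can be passed to Kinematics.mass() without changing it.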

More Interface Examples
Linear Collider Event Structure

Adapters
Adapters are an important design pattern
Adapters map one interface to another
Without copying the data
Emphasized in Java by extensive use of interfaces.
Simple Example:
We want to feed a set of tracks to our SimpleJetFinder
The person who implemented the Track class
Didn’t attend the CSC
Didn’t make the Track class implement FourVector
Two ways to proceed
Loop over all the tracks, and create a SimpleFourVector for each one, then feed the SimpleFourVectors to the jet finder
Or … write an adapter that makes the Track look like a FourVector

Adapter Example
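A minimal sketch of the adapter approach, assuming a hypothetical Track class that exposes its momentum through momentum(int) and does not implement FourVector. The FourVector method names are the same assumed ones as before, and the energy is computed under a massless-particle assumption purely to keep the example short.

```java
// Hypothetical sketch: Track's API and the FourVector methods are assumptions.
class AdapterDemo {
    public static void main(String[] args) {
        Track track = new Track(3.0, 4.0, 0.0);
        TrackAdapter adapter = new TrackAdapter(track);
        FourVector v = adapter;                   // usable wherever a FourVector is expected
        System.out.println("energy = " + v.e());
        System.out.println("same track: " + (adapter.getTrack() == track));
    }
}

interface FourVector {
    double px();
    double py();
    double pz();
    double e();
}

// Stand-in for a class written by someone else; it knows nothing about FourVector.
class Track {
    private final double[] momentum;

    Track(double px, double py, double pz) {
        momentum = new double[] { px, py, pz };
    }

    double momentum(int i) { return momentum[i]; }
}

// The adapter holds a reference to the Track; no data is copied.
class TrackAdapter implements FourVector {
    private final Track track;

    TrackAdapter(Track track) { this.track = track; }

    public double px() { return track.momentum(0); }
    public double py() { return track.momentum(1); }
    public double pz() { return track.momentum(2); }

    // Assumption: the particle is treated as massless, so E = |p|.
    public double e() {
        return Math.sqrt(px() * px() + py() * py() + pz() * pz());
    }

    // Lets client code recover the original Track from the adapter.
    Track getTrack() { return track; }
}
```

Since each TrackAdapter keeps its Track, the link between a jet constituent and the track it came from is never lost.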

Adapters vs. Converters
In this simple example there is not much difference between the two methods
In more realistic example Adapter has significant advantages
No copying of data
Copying large amounts of data can be time consuming
Especially if only a small subset is needed
Copy of data can get out of sync with original
Jet finder will give list of which particle was assigned to each jet
With method 1 we have lost the relationship between SimpleFourVector and Track
With method 2 each TrackAdapter knows which track it is associated with

Inner Classes
Java programs often have many small classes
For this reason use of Inner Classes (C++ calls them nested classes) is common in Java
Used when class is only used by container class
Inner Classes have access to
(private) Member variables of container class
(private) Methods of container class
Adapters are often implemented as Inner Classes

Inner Class Example
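A minimal sketch of an adapter written as an inner class, assuming a hypothetical Track class with private momentum fields. The inner class FourVectorView reads those private fields directly, which a separate top-level adapter could not do. All names here are illustrative.

```java
// Hypothetical sketch: class, field and method names are assumptions.
class InnerClassDemo {
    public static void main(String[] args) {
        Track t = new Track(3.0, 4.0, 0.0);
        FourVector v = t.asFourVector();
        System.out.println("e = " + v.e());
    }
}

interface FourVector {
    double px();
    double py();
    double pz();
    double e();
}

class Track {
    private final double px, py, pz;

    Track(double px, double py, double pz) {
        this.px = px;
        this.py = py;
        this.pz = pz;
    }

    FourVector asFourVector() { return new FourVectorView(); }

    // Inner class: only used by Track, and tied to one Track instance.
    private class FourVectorView implements FourVector {
        // Track.this.px is the private field of the enclosing Track object.
        public double px() { return Track.this.px; }
        public double py() { return Track.this.py; }
        public double pz() { return Track.this.pz; }

        // Assumption: massless particle, so E = |p|.
        public double e() {
            return Math.sqrt(Track.this.px * Track.this.px
                           + Track.this.py * Track.this.py
                           + Track.this.pz * Track.this.pz);
        }
    }
}
```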

Anonymous Inner Class
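A minimal sketch of an anonymous inner class: a class that is declared and instantiated in a single expression, at the point where it is needed. The Cut interface and the count() helper are hypothetical names invented for this example.

```java
// Hypothetical sketch: Cut and count() are illustrative names.
class AnonymousDemo {
    interface Cut {
        boolean accept(double value);
    }

    // Counts how many values pass the given cut.
    static int count(double[] values, Cut cut) {
        int n = 0;
        for (int i = 0; i < values.length; i++) {
            if (cut.accept(values[i])) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        final double threshold = 2.0;
        double[] energies = { 0.5, 2.5, 3.0, 1.0 };

        // The class has no name; it is defined right where the object is created
        // and can read the final local variable threshold.
        int n = count(energies, new Cut() {
            public boolean accept(double value) {
                return value > threshold;
            }
        });
        System.out.println(n + " values pass the cut");
    }
}
```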

Dynamic Code Loading
Java makes dynamic loading of code very easy.
Class c = Class.forName("hep.atlas.geometry.Detector");
Detector det = (Detector) c.newInstance();
Since dynamic class loading is so easy it can be used in a very fine grained way.
No complex build or configuration system needed
No need to load things that are not used.
You can define your own ClassLoader which:
Loads classes in an application specific way
From a Database, from a URL,
Can unload and reload classes
Defines a “namespace” for classes
Defines a security environment for classes (c.f. applets)
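A minimal runnable sketch of loading a class by name. It uses java.util.ArrayList as a stand-in because the Detector class from the slide is not available here; the helper name create() is an assumption. Note that Class.newInstance(), as written on the slide, is deprecated in recent Java releases, so the sketch goes through getDeclaredConstructor() instead.

```java
// Hypothetical sketch: create() is an illustrative helper, not a library call.
class DynamicLoadingDemo {
    // Loads the named class and creates an instance via its no-argument constructor.
    static Object create(String className) {
        try {
            Class<?> c = Class.forName(className);
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name is just a string, so it could come from a config file
        // or the command line instead of being fixed at compile time.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        Object o = create(name);
        System.out.println("Loaded " + o.getClass().getName());
    }
}
```

In real use the result would be cast to a known interface or base class, as in the Detector cast on the slide.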

Reflection
Similar to C++’s RTTI, but much more general.
At runtime can find out:
Names of classes, methods, members, superclasses
Can fetch value of any member by name
Can call any constructor or method by name
Adding scripting to any Java program is easy
Many scripting languages
Beanshell, JPython, etc.
Full access to all java objects with no extra programming
Many uses in Data Analysis
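A minimal sketch of calling a method by name with the java.lang.reflect API, which is the mechanism a scripting layer can build on. The helper name call() is an assumption; it only handles public methods that take no arguments.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: call() is an illustrative helper.
class ReflectionDemo {
    // Looks up a public no-argument method by name and invokes it on target.
    static Object call(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        String s = "jet";
        System.out.println(call(s, "length"));        // invokes String.length()
        System.out.println(call(s, "toUpperCase"));   // invokes String.toUpperCase()

        // Class-level information is available at run time too.
        System.out.println(s.getClass().getName());
        System.out.println(s.getClass().getSuperclass().getName());
    }
}
```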

Threads
Threads allow more than one “thread” of execution in a program at a time
Useful for GUIs – when you want the GUI to remain responsive to user input even when the program is busy doing something else
Handling slow IO – e.g. network IO
Threads are supported by most operating systems today
Java Supports threads by:
Direct support for Threads in language
Library routines that support multi-threading
But not all, in particular Swing, see documentation

Java Support for Threads
Java has built-in support for threads
Just derive a class from Thread
Override the run() method to perform task
Issue the start() method on the object
Java has direct language support for Threads
synchronized keyword
Can be applied to methods or blocks of code
Allows only one thread at a time to execute the method/block on a given object
wait() and notify() methods in Object
Allow for inter-thread signaling
Rules for using Threads
Don’t! (I’m serious!) – they are very hard to get right
If you must – read the book “Concurrent Programming in Java” by Doug Lea

Thread Example
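A minimal sketch of the steps listed on the previous slide: derive a class from Thread, override run(), call start(), and protect shared state with synchronized. The names Worker, Counter and runWorkers() are assumptions made for illustration.

```java
// Hypothetical sketch: Worker, Counter and runWorkers() are illustrative names.
class ThreadDemo {
    // Starts nThreads workers, waits for all of them, and returns the final count.
    static int runWorkers(int nThreads, int incrementsPerThread) {
        Counter counter = new Counter();
        Worker[] workers = new Worker[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Worker(counter, incrementsPerThread);
            workers[i].start();               // start() runs run() in a new thread
        }
        for (int i = 0; i < nThreads; i++) {
            try {
                workers[i].join();            // wait for the worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("total = " + runWorkers(2, 1000));
    }
}

// Shared state: synchronized lets only one thread at a time into these methods.
class Counter {
    private int value;
    synchronized void increment() { value++; }
    synchronized int get() { return value; }
}

class Worker extends Thread {
    private final Counter counter;
    private final int n;

    Worker(Counter counter, int n) {
        this.counter = counter;
        this.n = n;
    }

    public void run() {
        for (int i = 0; i < n; i++) {
            counter.increment();
        }
    }
}
```

Without synchronized on increment(), two threads could interleave their updates and some increments would be lost.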

Lecture 1 - Part 3
Java Performance and Performance Tips

Java Performance
Performance of Java programs has improved by several orders of magnitude since Java’s introduction in 1995.
Many advanced techniques have been developed for optimizing Java programs
Sun, IBM, Microsoft have research teams working on this
In particular “Dynamic Optimization” has proved a very powerful technique.

Static Optimization
For most languages optimization is performed by the compiler as part of compilation
Advantages
Compilation time is not normally of great importance, so compiler can carefully analyze code and perform time-consuming optimization.
Well understood techniques developed over decades.
Disadvantages
Compiler only analyzes one subroutine/class at a time.
At compile time little is known about how the program/class will actually be used
C/C++ pointers make optimization more difficult.

Dynamic Optimization
Program is optimized while it is running
Technique used by all modern Java VMs
Advantages
Code can be optimized for actual target platform
E.g. i386, i486, Pentium, PII, PIII, single/multi processor
Only code that is actually a bottleneck needs to be optimized
Optimizer monitors program to find “Hot-Spots”
Information about how program is actually being used is available to the compiler.
Optimization can change as the program runs.
E.g. when new classes are loaded or the usage pattern changes

Dynamic Optimization
Virtual method calls
In Java almost all methods can be overridden by subclasses
all non-static, non-private methods are virtual unless declared final
Dynamic optimizer knows if any class actually overrides method, and can eliminate most virtual call overhead
Inlining of code
Inlining of code can be done across classes, libraries
Thread support
Dynamic optimizer knows if multiple threads are actually being used, and can eliminate synchronization overhead in single threaded apps.

Memory Allocation
Dynamic Optimizers also do better memory optimization
Two stage memory allocation allows very efficient allocation of short-lived objects
Objects are initially allocated in a “nursery”
Allocated in contiguous stack-like fashion – very fast
Most objects in the nursery have been discarded before the nursery becomes full
Only long lived objects need to be moved to long-term storage
Java objects can be moved in memory (unlike C++)
Objects with many cross-references can be moved closer together
Memory fragmentation can be eliminated

Performance Tips
Optimizer can’t do everything – you can help
Be aware of overhead in some library routines
Use newer “collections” classes rather than older (Vector, Hashtable) classes.
Use profiler to test your program’s performance
Don’t unnecessarily create objects
Allocate temporary objects outside loop and reuse them
StringBuffer is much more efficient than String for string manipulation
Use buffered IO routines for reading data
BufferedReader, BufferedInputStream
Use the java.nio package when it becomes available
Be wary of lists like this one
Many of them are outdated by newer Java VMs
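The StringBuffer tip above can be sketched as follows. Both methods build the same string; the first creates a new String object on every pass through the loop, the second appends into one buffer. The method names are assumptions, and no timings are claimed here since they depend on the VM, so measure with a profiler before relying on the difference.

```java
// Hypothetical sketch: method names are illustrative; no performance numbers implied.
class StringBuildingDemo {
    // Repeated concatenation: each += creates a new String object.
    static String joinWithString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    // One StringBuffer is allocated outside the loop and reused for every append.
    static String joinWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithString(5));
        System.out.println(joinWithBuffer(5));
    }
}
```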

HEP Performance Tips
In HEP we often analyze events
Events can be large and involve many objects
We do not care about correlations between events
at the end of the event all of its objects can be discarded.
Allocating many objects and leaving the garbage collector to clean them up can be expensive.
Instead of using the new operator
Track myTrack  = new Track();
Use a Factory, e.g.
Track myTrack = TrackFactory.createTrack();
Implementation of factory can be initially trivial
Later on can be optimized
reuse tracks from previous event
Discard unused tracks only when memory becomes tight

Simple Track Factory
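A minimal sketch of the initially trivial factory described on the previous slide. TrackFactory.createTrack() simply calls new, but because callers never call new Track() themselves, the allocation strategy can be changed later in one place. The Track fields are assumptions made for illustration.

```java
// Hypothetical sketch: Track's fields are illustrative.
class SimpleFactoryDemo {
    public static void main(String[] args) {
        Track myTrack = TrackFactory.createTrack();
        myTrack.px = 1.5;
        System.out.println("px = " + myTrack.px);
    }
}

class Track {
    double px, py, pz;
}

class TrackFactory {
    // Trivial implementation: every call allocates a fresh Track.
    static Track createTrack() {
        return new Track();
    }
}
```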

Smarter Track Factory
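A sketch of one way the factory could reuse tracks from the previous event, as suggested on the HEP Performance Tips slide. The endOfEvent() hook, the reset() method and the growing-array pool are assumptions for illustration, not the original implementation. The sketch is not thread-safe, and it assumes callers drop all Track references at the end of each event.

```java
import java.util.Arrays;

// Hypothetical sketch: endOfEvent(), reset() and the pool layout are assumptions.
class SmarterFactoryDemo {
    public static void main(String[] args) {
        Track first = TrackFactory.createTrack();
        TrackFactory.endOfEvent();                    // all tracks become reusable
        Track second = TrackFactory.createTrack();
        System.out.println("reused: " + (first == second));
        TrackFactory.endOfEvent();
    }
}

class Track {
    double px, py, pz;

    // Clears the state left over from the previous event.
    void reset() { px = 0.0; py = 0.0; pz = 0.0; }
}

class TrackFactory {
    private static Track[] pool = new Track[16];
    private static int allocated = 0;   // Track objects created so far
    private static int used = 0;        // Track objects handed out in this event

    static Track createTrack() {
        if (used == allocated) {
            // Pool exhausted: grow the array if needed and allocate one more Track.
            if (allocated == pool.length) {
                pool = Arrays.copyOf(pool, 2 * pool.length);
            }
            pool[allocated++] = new Track();
        }
        Track t = pool[used++];
        t.reset();
        return t;
    }

    // Called once per event; nothing is freed, the tracks are just recycled.
    static void endOfEvent() {
        used = 0;
    }
}
```

Callers still write Track myTrack = TrackFactory.createTrack(); only the factory changed.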

Smartest Track Factory

Test Program

Test Program

Test Program Results
Java Test Results

Performance Caveats
"More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity."
W.A. Wulf
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
Donald Knuth

“Real Life” Performance

“Real Life” Performance

Conclusion
C++ forces user to take care of
Optimization hints (inline, virtual)
Memory allocation
Price to pay is:
crashes, dangling pointers, endless debugging
Memory leaks – objects that are never deleted
Corruption of clean OO style because of need for objects to co-operate on object ownership/destruction
Java VMs do an increasingly good job of:
Memory allocation
Code optimization
Without programmer having to worry about it
In a totally safe (crash resistant) way

References
Java Tutorial
“The Java Tutorial” (online)
http://java.sun.com/docs/books/tutorial/
Performance
“The Java Performance Report”
http://www.javalobby.org/features/jpr/
“The Java HotSpot™ Performance Engine”
http://java.sun.com/products/hotspot/whitepaper.html
IBM Systems Journal – “Java Performance”
http://www.research.ibm.com/journal/sj39-1.html

OO Design and Implementation
Java and Java Analysis Studio

Overview
Lecture 2
Introduction to Java Analysis Studio (JAS)
Implementation of JAS
How JAS leverages the features of the Java language
Advanced Features of JAS
Using individual modules
The Plot widget
Java Servlets and JAS
Using JAS histograms from C++

Lecture 2 – Part 1
Introduction to JAS

JAS Overview
Modular Java Toolkit for Analysis of HEP data
Data Format Independent
Experiment Independent
Supports arbitrarily complex analysis modules written in Java
Rich Graphical User Interface (GUI) with:
Data Explorer
Flexible Histogram + Scatterplot display
Histogram manipulation+fitting
Built-in Editor/Compiler (for writing analysis modules)
User extensible via Object-Oriented APIs (“Plugins”)

JAS GUI

Histogram Viewer

Editor/Compiler

User Extensible

Data Format Independent
JAS does not have “JAS Files”
Supports any data format via Data Interface Modules, implemented by Plugins
Currently Supported
PAW, Hippo, Flat-File, StdHep (MC generator output).
SQL database (using JDBC)
Objectivity HepTuples (thanks to CERN)
Experiment Specific Formats
LCD, SIO (LCD), Jazelle (SLD)
Experimenting with
Root – first release of DIM available
Future
CDF/HDF, AIDA (XML) format

Data Types
Supports n-tuples and/or Structured Data
n-tuples are fast and allow for many simplifications in GUI
 Simple Interactive cuts
 Simple plot generation
but n-tuples ultimately limiting
Arbitrarily Structured Data provides ultimate flexibility
Requires slightly more work from end-user
Complete Object Oriented Analysis Environment

Analysis Modules
User analysis modules written in Java
Java is an excellent language for physics analysis
Easy to learn yet very powerful, fully OO language
Fast (and getting faster)
Very fast code, test, fix cycle (dynamic loading)
JAS provides built-in editor, compiler, plus:
hep.analysis package
for creating/filling/manipulating histograms
hep.physics package
simple particle, track manipulation package

Example Analysis

Lecture 2 – Part 2
JAS Implementation

Implementation
JAS is written entirely in Java
Except for the interface to PAW, StdHEP
Uses the Java Swing toolkit for GUI
Works equally well on
Windows (95/98/NT/2000)
Linux, Solaris
Any machine with Java
Open Source
All code in CVS – see web site for details
Binaries and Source freely available
Highly Modularized
minimal interdependence between modules
Maximize reusability of modules

JAS Modules

Use Standard Libraries
Wherever possible use standard Java libraries
Swing for GUI
JDBC for database interface
Use freely available libraries
XML4J for XML parsing (switching to Xerces)
jEdit (GPLed Java editor) syntax colourizer
Where possible write reusable modules for HEP specific code
FreeHEP Java library
Developed in collaboration with WIRED, LCD, Atlas, Babar
Large collection of reusable components
Needed by HEP and not generally available
Specific to HEP

FreeHEP Library
Library of common HEP and non-HEP classes, provided by different projects (JAS, WIRED, LCD, Atlas …)
Contains:
Graphics2D: Pixel, PostScript, SVG and bitmap driver for 2D graphics.
XML: MenuBuilder
Util: CommandLine and CommandDispatcher
Swing extensions: SpinBox and TriStateBox
HEP IO: routines for Stdhep and Root

FreeHEP library
Soon:
Hep: 3/4 Vectors, Event Generators, Jet Finders
HepRep: Representations for HEP Event Display
Java3D extensions: HEP primitives – G4 Volumes
Swing extensions: Grid Desktop Manager
YaPPI: Particle Properties API
AIDA: Histogram Interfaces
JACO: Java Access to C++ Objects
Available from: java.freehep.org

Java Language Features
Dynamic Code Loading is used for
Loading/Unloading User Analysis Routines
Loading user extensions (Plugins -- see later)
Interfaces
Data Interface Module
By using simple interface it is easy to add support for new data formats, just need to provide
Interface for opening dataset (normally by filename)
Implementation of EventSource interface
Or subclass AnnotatedEventSource for n-tuples
DataSource interface
Used to provide interface to Plot Widget – see later

EventSource Interface

Lecture 2 – Part 3
Advanced Features

Remote Data Access
Rather than transporting petabytes of data to the physicist
Transport the physics analysis code to the data
Transparently - so that it feels just like local data access
Using Java-Agent Technology
Just ship histogram contents back to the physicist’s desktop

Distributed Computing

XML Support
XML specifies a generic syntax for a markup language but no tags
Users specify tags to use for a specific problem domain
HTML is roughly an XML instance for web pages
Tag set formally specified by a DTD.
Have defined tag set for markup of plots
Tried to make it generic so it could be used by other programs
JAS directly supports reading/writing PlotML
PlotML file can store
Display style + snapshot of current data
Display style + reference to (live) data
PlotML is ASCII file (like HTML) so can be hand edited

HTML Support
Using Swing JEditorPane JAS now supports HTML display
Supports most features of HTML 3.2
Nested tables and frames a bit dodgy
"Live" objects can be embedded within HTML page
Built in objects such as plots
User-defined objects (sliders etc.)
Multiple Objects on page can interact with each other
Useful for:
Tutorial information
Online monitoring
Presentations (perhaps?)

HTML Support

Java 3D
Andrey Kubarovsky and Joy Kyriakopulos at Fermilab
Using Java 3D API (standard Java extension)
Build lego plot, surface plot, 3D scatter plot
Built as standalone package
Designed to be compatible with the JASHist bean
Will fit into the same model-view-controller design

Java 3D

JAS Plugins
Standard Plugins
Paw
Stdhep
Hippo
FlatFile
Test
Fitting+Function
New Plugins
Root
Objectivity/HepTuple (Dino, Xavier)
Experiment Specific
LCD
LCD - Wired
Jazelle (SLD)
Under Development
Bean Shell
AIDA
Geant 4

BeanShell Scripting

Lecture 2 – Part 4
Using the JAS Plot Widget

JAS Plot Component
Can be used in other applications
Uses Model-View-Controller design

Adapter Example

Plot Component Features
1+2-D histograms and scatter plots
Scatter Plot display optimized for 1000’s of points
Overlaying of several histograms or scatter plots
Interactive function fitting for 1-D plots
Direct User Interaction by clicking and dragging
Numeric or time axes, plus axes with named bins
Many display styles that can be set interactively or programmatically
Dynamic creation and display of slices and projections of 2-D data.
Very efficient redrawing to support rapidly changing data (handles over 100 updates/second).
Printing using both Java 1 and Java 2 printing models. High quality print output is available when using Java 2.
Saving plots as GIF images or as XML. Support for encapsulated postscript and PDF is in progress.
Custom overlays which allow data to be displayed using user defined plot routines for specialized plots.

JASHist Example

JASHist Example

Servlet Support
JASHist can be used inside a Java Servlet
Servlet vaguely similar to Java Applet
Both could be used to put a "live" plot on a web page
Servlet runs on web server and sends GIF to browser
No need for java support in browser
No worries about browser version/functionality
No slow download of Java code
Just requires implementation of DataSource
Easy to interface to many different data sources
Many examples at: http://jas.freehep.org/

Lecture 2 – Part 5
Using JAS Histograms from C++

AIDA

C++ → Java with AIDA

Prototype Implementation

JAS at the CSC
Multiple versions of Java installed
IBM JDK 1.1.8 (default)
Fast, well debugged and tested
 /usr/local/bin/java
IBM JDK 1.3
Most recent
Needed for printing, Root Plugin
A little buggy – doesn’t work with fvwm2
You must use Gnome or KDE as your window manager
 /opt/IBMJava2-13/bin/java
JAS is installed in:
 /CSC/apps/staff/tjohnson/JAS/jas
Will use whichever java is in your PATH

JAS Tutorials at the CSC
Official tutorial covers
Creating histograms in Geant4 (C++) and viewing them in JAS
Controlling Geant4 from JAS
JAS also contains a built-in tutorial
Click “tutorial” on the welcome page
Covers opening and analyzing a PAW n-tuple from JAS
I will try to put a few simple analysis examples in
/CSC/homes/staff/tjohnson/www
http://192.168.2.1/lecturers/tjohnson/
I am happy to answer questions on official/unofficial tutorials, Java, JAS etc.

Where to get more info.
JAS
http://jas.freehep.org/
Downloads, Source Code, Mailing List, Documentation, etc.
FreeHEP Java library
http://java.freehep.org/
AIDA
http://aida.freehep.org

Conclusion
Java is:
Modern!
Fun!
Productive!
Lucrative!
Warning!!!!
Going back to C++ after programming in Java is:
 Not Fun