Java Mini-Tutorial

Introduction

In this chapter we provide a brief overview of the Java Programming Language. The tutorial presented here is intended to introduce readers to just enough Java to write physics analysis routines for Java Analysis Studio, and is far from a complete Java tutorial (for more detailed tutorials see the references at the end of this chapter). No prior knowledge of Java or object-oriented programming is assumed. Readers already familiar with C or C++ should find the contents very familiar.

The Java programming language is an object-oriented language developed by Sun, and popularized by its introduction as a way of programming "web applets" in Netscape and other browsers. Java code is compiled into a machine-independent pseudo-machine code called bytecode. By convention Java source code is kept in files with extension .java, and the bytecode generated by compiling the source is kept in files with extension .class. Since Java bytecode is machine independent it can be run on any machine that has a Java runtime, and can easily be moved from machine to machine over a network. Java Analysis Studio makes use of this feature to allow analysis code to be written and compiled on the desktop machine, and then either executed locally or moved to a data server to be executed. When Java bytecode is executed on a particular machine it is normally converted to native machine code at runtime, a process known as Just-In-Time (JIT) compilation. Thus analysis code written in Java can attain execution speeds not too far removed from those of natively compiled code, and considerably faster than the interpreted languages previously used for physics analysis, such as COMIS and IDA. Compiling and loading Java code is very fast, so the turnaround time for modifying, compiling, reloading and running analysis code in Java is also very good.

Classes, Objects, Methods and Constructors

Since Java is a pure object-oriented language all code is written as classes. A class typically represents a set of concrete or abstract items, such as Histograms, Cuts, Particles, or Event Analyzers. Objects represent a specific instance of an item contained within the set, thus a specific histogram may be represented by an object of class Histogram. (By convention classes are given capitalized names, while objects are spelt with an initial lowercase letter). Procedures within classes are called methods, thus a Histogram class would typically contain methods for filling the histogram as well as methods for extracting information about the histogram. Classes normally have one or more constructors, which are methods used to create new objects of that class. Constructors always have the same name as the class itself, and unlike other methods they cannot return a value (since they implicitly return a new object).

To create a very simple Histogram class in Java one could write:

public class Histogram
{
   Histogram(String name)
   {
      m_name = name;
   }
   public String getName()
   {
      return m_name;
   }
   public void fill(double value, double weight)
   {
      // fill bins here
   }
   private String m_name;
}

This code declares a class called Histogram. The class is declared public meaning that anyone can access the Histogram class. The histogram class contains one constructor, which by convention has the same name as the name of the class. In this case the constructor takes a single argument of type String, which is stored into the variable m_name. Note that m_name is declared at the level of the class itself, rather than inside any of the methods, which indicates that the variable is a member variable. Member variables have the same life-span as the objects which contain them, thus whenever an object of class Histogram is created one string variable m_name will be created within that object, and the variable will maintain its value until that object is destroyed. m_name is declared private, meaning that it can only be accessed directly from within the Histogram class itself.

Besides the constructor, our example Histogram class contains two ordinary methods: getName, which returns the name stored by the constructor, and fill, which takes two arguments, the value that selects the bin to fill (value) and the weight to add to that bin. For simplicity the body of fill is omitted (fortunately we don't really need to write our own Histogram class since Java Analysis Studio already contains a fully functional histogram class which we can just use!).

Note that all statements in Java must end with a semi-colon, and that squiggly-brackets ({}) are used to delimit the beginning and end of blocks of code, such as the bodies of functions. Double slashes are used to begin single line comments. (Anyone familiar with C or C++ will notice that most Java conventions are exactly the same as for those languages.)

In Java, objects of a particular class are created by using the new keyword. Thus to create and fill a histogram in Java one would write:

   Histogram myhist = new Histogram("My Histogram");
   myhist.fill(99.0,1.0);

In the first line above a local variable myhist is declared, which is of type Histogram, and which is assigned a new histogram object (named "My Histogram"). In the second line the fill method of the histogram object is invoked. In Java it is not necessary (or possible) to explicitly delete an object, rather the object will be automatically destroyed once there are no active references to it (a process known as garbage collection).

Inheritance

One of the most powerful features of object-oriented languages is the concept of inheritance, whereby one class may inherit all or a subset of the methods and member variables of a super-class. For example we could implement a BetterHistogram class as follows:

public class BetterHistogram extends Histogram
{
   BetterHistogram(String name)
   {
      super(name);
   }
   public void fill(double value, double weight)
   {
      m_nEntries++;
      super.fill(value,weight);
   }
   public int getNEntries()
   {
      return m_nEntries;
   } 
   private int m_nEntries = 0;
}

In this example the class BetterHistogram is declared as extending Histogram, which means that it inherits all of the methods and member variables of the Histogram class, but can, in addition, add its own methods and member variables. All classes implicitly inherit from the base class Object, which is built in to the Java language.

Java classes can only extend one super-class (i.e. Java does not support multiple-inheritance) although they can implement any number of interfaces (see below).

When using Java Analysis Studio the most common way in which you will encounter inheritance is when you write your own event analysis routines. Java Analysis Studio contains a built-in class EventAnalyzer, which can be thought of as an empty framework for performing event analysis. The class contains a method which is called to process each event (processEvent) and methods which are called at the beginning and end of each run (beforeFirstEvent and afterLastEvent), but each of these methods is empty, meaning that it does not actually perform any data analysis (or anything else). The purpose of a framework class such as EventAnalyzer is to allow you to extend it, for example to provide a MyAnalysis class which actually does something useful (your physics analysis). For example:

public class MyAnalysis extends EventAnalyzer
{
   public void processEvent(EventData d)
   {  
      // perform analysis and fill histograms here.
   }
}
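
To make the framework idea concrete, here is a small self-contained sketch of the same pattern. The classes SketchEvent, SketchAnalyzer and EnergySum are hypothetical stand-ins invented for this illustration; they are not the real Java Analysis Studio classes, whose methods may well differ.

```java
// Hypothetical stand-ins for illustration only; NOT the real JAS classes.
class FrameworkSketch
{
   // The "framework" only knows about SketchAnalyzer, yet the overridden
   // processEvent of whatever subclass is passed in is the one that runs.
   static void run(SketchAnalyzer analyzer, double[] energies)
   {
      analyzer.beforeFirstEvent();
      for (int i = 0; i < energies.length; i++)
         analyzer.processEvent(new SketchEvent(energies[i]));
      analyzer.afterLastEvent();
   }
   public static void main(String[] args)
   {
      EnergySum sum = new EnergySum();
      run(sum, new double[] { 1.5, 2.5, 4.0 });
      System.out.println(sum.getNEvents() + " events, total energy " + sum.getTotal());
   }
}

class SketchEvent
{
   private final double energy;
   SketchEvent(double energy) { this.energy = energy; }
   double getEnergy() { return energy; }
}

// An "empty framework": every method does nothing by default.
class SketchAnalyzer
{
   public void beforeFirstEvent() { }
   public void processEvent(SketchEvent e) { }
   public void afterLastEvent() { }
}

// Only the method we care about is overridden.
class EnergySum extends SketchAnalyzer
{
   private int nEvents = 0;
   private double total = 0;
   public void processEvent(SketchEvent e)
   {
      nEvents++;
      total += e.getEnergy();
   }
   public int getNEvents() { return nEvents; }
   public double getTotal() { return total; }
}
```

Because beforeFirstEvent and afterLastEvent are not overridden, the empty versions inherited from SketchAnalyzer are used for those calls.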

Classes vs. Interfaces

As mentioned above, classes in Java can only extend a single super-class. However, in addition to classes Java supports a second concept called an interface. Like classes, interfaces define a set of methods, but unlike classes they cannot contain any implementation of these methods. (Interfaces in Java are very similar to pure-virtual classes in C++.)

The following is an example of an interface.

public interface FourVector
{
   public double x();
   public double y();
   public double z();
   public double t();
   public double magnitude();
}

What the above means is that anything which calls itself a FourVector must provide implementations of each of the methods specified in the interface. A class can be declared to provide an implementation of an interface using the implements keyword. For example:

public class Particle implements FourVector
{
   private double px, py, pz, mass;
   public Particle(double px, double py, double pz, double mass)
   {
      this.px = px;
      this.py = py;
      this.pz = pz;
      this.mass = mass;
   }
   public double x() { return px; }
   public double y() { return py; }
   public double z() { return pz; }
   public double t() { return mass; }
   public double magnitude() 
   {
      return Math.sqrt(px*px + py*py + pz*pz + mass*mass);
   }
}

A single class can provide an implementation of any number of interfaces. As far as the user of an interface is concerned, they work exactly the same as classes, so with the above definitions you can write:

FourVector v = new Particle(1,2,3,0.5);
double e = v.magnitude(); 
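
As a rough sketch of one class implementing more than one interface, consider the following. The interfaces Named and Weighted and the class Track are hypothetical names made up for this example.

```java
class MultiInterfaceSketch
{
   // Each method only cares about one of the two interfaces.
   static String describe(Named n) { return "name=" + n.name(); }
   static double doubled(Weighted w) { return 2 * w.weight(); }

   public static void main(String[] args)
   {
      Track t = new Track("muon", 0.5);
      // The same object can be passed wherever either interface is expected.
      System.out.println(describe(t));
      System.out.println(doubled(t));
   }
}

// Hypothetical interfaces for illustration.
interface Named { String name(); }
interface Weighted { double weight(); }

// One class, two interfaces: both sets of methods must be implemented.
class Track implements Named, Weighted
{
   private final String name;
   private final double weight;
   Track(String name, double weight)
   {
      this.name = name;
      this.weight = weight;
   }
   public String name() { return name; }
   public double weight() { return weight; }
}
```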

Packages

Classes in Java are normally defined inside packages. Packages have two functions:

  1. They group sets of classes together. A package normally contains a group of classes that work together to achieve some well defined functionality.
  2. They define a hierarchical name-space, so that classes with the same name will not clash with each other so long as they are defined in different packages.

For example the full name of the String class is java.lang.String, indicating that it is in the java.lang package. By convention package names are always in lower case (a convention which is just about universally followed), and by convention should be named using the reversed domain name of the creating organization, for example edu.stanford.slac.jas.Histogram (a convention often ignored since it can lead to unwieldy package names - and not everyone has their own domain name). 

When defining a class you will normally start your .java file with a package statement:

package my.analysis;

This statement implies that any classes defined inside the file are considered to be in package my.analysis. If you do not put a package statement in your file your classes will be considered to be in the "unnamed package". In general the unnamed package is good for quick tests and experimenting, but for code that you expect to use longer term an explicit package statement is a good idea.

Wherever a class name appears you can use the full name including package name, however this leads to a lot of typing and can adversely affect the clarity of your code. As an alternative you can use import statements following the package statement at the top of your program. For example:

import java.lang.String;
import java.lang.*;

The first line imports a specific class, the second line imports all of the classes in package java.lang. Once you have imported a class you can refer to it by its short name (e.g. String). If you import two packages which contain a class with the same name you will still need to refer to it using its fully qualified name. Note that in reality you do not ever need to import package java.lang since it is unique in always being considered to be implicitly imported. The package statement, if it exists, must be the first statement in the file, and must be immediately followed by any import statements.

Java requires that a class whose full name is my.analysis.Histogram be defined in a file called Histogram.java which resides in a directory my/analysis, i.e. in the file my/analysis/Histogram.java.

Protection - public, protected, private

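Methods and member variables within Java classes can have access modifiers applied to them, which control where they can be used from. The allowed access modifiers are public, protected and private; a member declared with no modifier at all is accessible only from within its own package.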
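
The following sketch shows the usual behavior of the three keywords named in the heading: public members can be used from any class, protected members from subclasses (and from other classes in the same package), and private members only from inside the class that declares them. The classes Counter and LoudCounter are hypothetical examples.

```java
class AccessSketch
{
   public static void main(String[] args)
   {
      Counter c = new Counter();
      c.increment();            // public: callable from any class
      // c.count = 5;           // would not compile: count is private to Counter
      System.out.println(c.getCount());

      LoudCounter l = new LoudCounter();
      l.increment();
      l.increment();
      System.out.println(l.report());
   }
}

class Counter
{
   private int count = 0;                          // visible only inside Counter
   public void increment() { count++; }
   public int getCount() { return count; }
   protected String label() { return "count"; }    // visible to subclasses
}

class LoudCounter extends Counter
{
   public String report()
   {
      // label() is protected, so a subclass may call it. The private
      // field count must be read through the public accessor instead.
      return label() + "=" + getCount();
   }
}
```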

Variables

In Java there are only two types of variables, intrinsic and reference types. Intrinsic variables are those that refer to built-in simple types of variables, such as int, double, float, boolean. A complete list of built-in types is given in the following table:

Type      Description
byte      8-bit signed integer.
short     16-bit signed integer.
int       32-bit signed integer.
long      64-bit signed integer.
float     32-bit IEEE 754 floating-point.
double    64-bit IEEE 754 floating-point.
char      16-bit Unicode character. Unicode is an extension of ASCII that supports international character sets.
boolean   A true or false value, using the keywords true and false -- pretty clever. There is no conversion between booleans and other types, such as ints.

Note that Java completely defines the size and behavior of all built-in types, so they should behave identically on all platforms. When intrinsic variables are passed to functions they are always passed by value, thus the variable within the function is initially set to the value of the passed argument, but subsequent changes to the variable inside the function will have no effect on the value of the variable passed in.
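
A minimal sketch of pass-by-value for intrinsic types (the class and method names are made up for this example):

```java
class PassByValueSketch
{
   static int addTen(int n)
   {
      n += 10;      // changes only the method's own copy of the argument
      return n;
   }

   public static void main(String[] args)
   {
      int original = 5;
      int result = addTen(original);
      // original is untouched by the call
      System.out.println(original + " " + result);   // prints "5 15"
   }
}
```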

The only other type of variable in Java is a reference to an object. Reference variables either point to an object of a particular type or have the special value null. Objects are only created when the new operator is explicitly used; the assignment operator merely creates a second reference to the same object. Thus the statements:

Histogram a = new Histogram("my histogram");
Histogram b = a;

create one histogram object and set variables a and b to point to that same object. Therefore modifying the object pointed to by a will also modify the object pointed to by b (since they are the same object). This can be a little confusing until one gets used to it, for example (using the BetterHistogram class defined above, since it counts its entries):

BetterHistogram a = new BetterHistogram("my histogram");
BetterHistogram b = a;
b.fill(1.0,1.0);
System.out.println("a has "+a.getNEntries()+" entries");

will report 1 entry, not 0.
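
The same sharing happens when a reference is passed to a method: the reference itself is copied, but both copies point at the same object. The sketch below uses an array (arrays are objects too, as described later in this chapter); the method names are hypothetical.

```java
class ReferenceSketch
{
   // Modifies the object the caller's reference points to.
   static void clearFirst(int[] data) { data[0] = 0; }

   // Only re-points the method's local copy of the reference;
   // the caller's array is not affected.
   static void replace(int[] data) { data = new int[] { 9, 9, 9 }; }

   public static void main(String[] args)
   {
      int[] values = { 1, 2, 3 };
      clearFirst(values);
      replace(values);
      System.out.println(values[0] + " " + values[1]);   // prints "0 2"
   }
}
```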

Operators and Expressions

Arithmetic operators

The arithmetic operators in Java are almost identical to those in C or C++. These arithmetic operators can be used on any integer or floating point operands. The operands will be automatically promoted as necessary (thus adding an int and a double will produce a double).

Operator   Use          Description
+          op1 + op2    Adds op1 and op2
-          op1 - op2    Subtracts op2 from op1
*          op1 * op2    Multiplies op1 by op2
/          op1 / op2    Divides op1 by op2
%          op1 % op2    Computes the remainder of dividing op1 by op2
++         op++         Increments op by 1; evaluates to value before incrementing
++         ++op         Increments op by 1; evaluates to value after incrementing
--         op--         Decrements op by 1; evaluates to value before decrementing
--         --op         Decrements op by 1; evaluates to value after decrementing
+          +op          Promotes op to int if it's a byte, short, or char
-          -op          Arithmetically negates op

Java does not contain any operator like Fortran's ** operator; you must instead use the java.lang.Math.pow method described under Mathematical Functions below.

Note that the + operator can also be used to concatenate Strings. Other than this one special case, arithmetic operators can only be used on the built-in Java types, thus even if you define your own Complex type you will not be able to use the + operator to add Complex objects together, since Java does not support operator overloading.
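
A short sketch of these rules in action (the helper names are invented for this example). Note in particular that dividing one int by another gives an int, with any fractional part discarded; promotion to double only happens when one of the operands is a double.

```java
class ArithmeticSketch
{
   static int intHalf(int n) { return n / 2; }          // int / int: fraction discarded
   static double realHalf(int n) { return n / 2.0; }    // n is promoted to double
   static double cube(double x) { return Math.pow(x, 3); }   // no ** operator in Java

   public static void main(String[] args)
   {
      System.out.println(intHalf(7));              // prints 3
      System.out.println(realHalf(7));             // prints 3.5
      System.out.println(7 % 2);                   // prints 1
      System.out.println("cube = " + cube(2.0));   // prints cube = 8.0
   }
}
```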

Relational and Conditional Operators

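Relational operators (>, >=, <, <=, == and !=) compare two operands and produce a boolean result, while the conditional operators (&&, || and !) can only be used on boolean operands. Unlike C, Java will not automatically convert integers to booleans.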

Operator   Use           Return true if
>          op1 > op2     op1 is greater than op2
>=         op1 >= op2    op1 is greater than or equal to op2
<          op1 < op2     op1 is less than op2
<=         op1 <= op2    op1 is less than or equal to op2
==         op1 == op2    op1 and op2 are equal
!=         op1 != op2    op1 and op2 are not equal
&&         op1 && op2    op1 and op2 are both true, conditionally evaluates op2
||         op1 || op2    either op1 or op2 is true, conditionally evaluates op2
!          ! op          op is false
&          op1 & op2     op1 and op2 are both true, always evaluates op1 and op2
|          op1 | op2     either op1 or op2 is true, always evaluates op1 and op2
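
The difference between the conditionally evaluating && and the always evaluating & can be seen by counting how often an operand is evaluated. This is only a sketch; the check method and its call counter are invented for the purpose.

```java
class ShortCircuitSketch
{
   static int calls = 0;

   // Returns its argument, recording that it was evaluated.
   static boolean check(boolean value)
   {
      calls++;
      return value;
   }

   public static void main(String[] args)
   {
      calls = 0;
      boolean a = check(false) && check(true);   // second operand is skipped
      System.out.println(a + " after " + calls + " call(s)");   // false after 1 call(s)

      calls = 0;
      boolean b = check(false) & check(true);    // both operands are evaluated
      System.out.println(b + " after " + calls + " call(s)");   // false after 2 call(s)
   }
}
```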

One thing to be aware of is that the == operator will only consider two references to be equal if they point to the same object, thus:

String a = new String("xyz");
String b = new String("xyz");
boolean result = (a == b);

will set result to false, even though both strings have the same contents. You should use the equals method (Object.equals) to compare strings for equality:

String a = new String("xyz");
String b = new String("xyz");
boolean result = a.equals(b);

Java supports one other conditional operator--the ?: operator. This operator is a ternary operator and is basically short-hand for an if-else statement:

boolean-expression ? op1 : op2

The ?: operator evaluates boolean-expression and returns op1 if it's true and op2 if it's false.
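
Two small uses of the ?: operator, as a sketch (the method names are made up for this example):

```java
class TernarySketch
{
   // Equivalent to: if (x < 0) return -x; else return x;
   static double abs(double x) { return x < 0 ? -x : x; }

   // Chooses between two strings depending on a condition.
   static String label(int n) { return n == 1 ? "entry" : "entries"; }

   public static void main(String[] args)
   {
      int n = 3;
      System.out.println(n + " " + label(n));   // prints "3 entries"
      System.out.println(abs(-2.5));            // prints 2.5
   }
}
```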

Bitwise Operators

Bitwise operators can be used on integer operands.

Operator   Use            Operation
>>         op1 >> op2     shift bits of op1 right by distance op2
<<         op1 << op2     shift bits of op1 left by distance op2
>>>        op1 >>> op2    shift bits of op1 right by distance op2 (unsigned)
&          op1 & op2      bitwise and
|          op1 | op2      bitwise or
^          op1 ^ op2      bitwise xor
~          ~op            bitwise complement
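
A typical use of the bitwise operators is testing and setting individual bits in a status word. The sketch below (with invented helper names) also shows the difference between the signed shift >>, which preserves the sign of a negative number, and the unsigned shift >>>, which fills with zeros from the left.

```java
class BitwiseSketch
{
   // True if the given bit (0 = least significant) is set in word.
   static boolean isBitSet(int word, int bit) { return (word & (1 << bit)) != 0; }

   // Returns word with the given bit switched on.
   static int setBit(int word, int bit) { return word | (1 << bit); }

   public static void main(String[] args)
   {
      int flags = 5;                               // binary 101
      System.out.println(isBitSet(flags, 0));      // prints true
      System.out.println(isBitSet(flags, 1));      // prints false
      System.out.println(setBit(flags, 1));        // prints 7 (binary 111)
      System.out.println(-16 >> 2);                // prints -4
      System.out.println(-16 >>> 2);               // prints 1073741820
   }
}
```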

Assignment Operators

These assignment operators are just shorthand ways of performing common operations such as incrementing a variable by a given amount. They are normally clearer (and less prone to typos) than their longer counterparts.

Operator   Use             Equivalent to
+=         op1 += op2      op1 = op1 + op2
-=         op1 -= op2      op1 = op1 - op2
*=         op1 *= op2      op1 = op1 * op2
/=         op1 /= op2      op1 = op1 / op2
%=         op1 %= op2      op1 = op1 % op2
&=         op1 &= op2      op1 = op1 & op2
|=         op1 |= op2      op1 = op1 | op2
^=         op1 ^= op2      op1 = op1 ^ op2
<<=        op1 <<= op2     op1 = op1 << op2
>>=        op1 >>= op2     op1 = op1 >> op2
>>>=       op1 >>>= op2    op1 = op1 >>> op2

Conditional and Loop Statements

As you have already seen Java statements all end with ; and multiple statements may be grouped together into a block using curly braces {}.   In addition Java supports all of the loop and conditional statements of the C language (although long-term Fortran users may be dismayed by the lack of a goto statement).

Statement         Keyword
decision making   if-else, switch-case
loop              for, while, do-while
miscellaneous     break, continue, label: , return

The usage of these statements is fairly self-explanatory, as the examples below will hopefully demonstrate.

if-else statement

int testscore = 76;   // example value; a local variable must be assigned before it is read
char grade;

if (testscore >= 90) {
    grade = 'A';
} else if (testscore >= 80) {
    grade = 'B';
} else if (testscore >= 70) {
    grade = 'C';
} else if (testscore >= 60) {
    grade = 'D';
} else {
    grade = 'F';
}

switch Statement

int month;
. . .
switch (month) {
case 1:  System.out.println("January"); break;
case 2:  System.out.println("February"); break;
case 3:  System.out.println("March"); break;
case 4:  System.out.println("April"); break;
case 5:  System.out.println("May"); break;
case 6:  System.out.println("June"); break;
case 7:  System.out.println("July"); break;
case 8:  System.out.println("August"); break;
case 9:  System.out.println("September"); break;
case 10: System.out.println("October"); break;
case 11: System.out.println("November"); break;
case 12: System.out.println("December"); break;
default: System.out.println("Huh?????"); break;
}

The switch statement inherits C's behavior of "falling through" from one case to the following case unless an explicit break statement is inserted after each case as in the above example. Note also the use of the default statement to catch otherwise unmet cases.
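
Falling through is occasionally useful, for example to let several cases share one body. The following sketch (a made-up helper that deliberately ignores leap years) groups the 30-day months this way:

```java
class SwitchSketch
{
   // Number of days in a month (1 = January); leap years are ignored here.
   static int daysInMonth(int month)
   {
      switch (month) {
      case 4:
      case 6:
      case 9:
      case 11:
         // cases 4, 6 and 9 have no break, so they fall through to here
         return 30;
      case 2:
         return 28;
      default:
         // every other value, including invalid ones in this sketch
         return 31;
      }
   }

   public static void main(String[] args)
   {
      System.out.println(daysInMonth(6));    // prints 30
      System.out.println(daysInMonth(2));    // prints 28
      System.out.println(daysInMonth(12));   // prints 31
   }
}
```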

while and do while statements

There are two forms of the while loop, one which tests the condition at the top of the loop, and one which tests it at the end of the loop (and hence always executes the loop body at least once).

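// Condition tested at the top of the loop
int i = 0;
while (i<100)
{
   i++;
}

// Condition tested at the end of the loop, so the body runs at least once
i = 0;
do
{
   i++;
} while (i<100);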

for loop

The for loop perhaps requires some explanation for those not familiar with C. The for statement contains three clauses, separated by semi-colons. The first clause is executed once at the beginning of the loop, the second clause is evaluated before each iteration of the loop, and the third clause is executed at the end of each iteration of the loop. Any of the clauses can be omitted (although the semi-colons are still required). The first clause may contain a variable declaration, in which case the variable is only accessible from within the for loop. The second clause, if present, must be a boolean expression; if it evaluates to false the loop is exited.

for (int i=0; i<100; i++)
{
   System.out.println(i);
}

Miscellaneous statements

All loop constructs may contain a continue statement, meaning that execution should immediately skip to the next iteration of the loop, or the break statement, meaning that the loop should be immediately terminated and execution continued from after the loop. Continue and break statements normally operate on the innermost loop, although this can be modified by explicitly labeling the loop, and using a break or continue statement with a label.
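
As a sketch of a labeled break, the method below (an invented example) searches an array of arrays and leaves both loops as soon as a negative entry is found. The label name search is arbitrary.

```java
class LabelSketch
{
   // Returns the index of the first row containing a negative entry, or -1 if none.
   static int firstRowWithNegative(double[][] table)
   {
      int found = -1;
      search:
      for (int i = 0; i < table.length; i++)
      {
         for (int j = 0; j < table[i].length; j++)
         {
            if (table[i][j] < 0)
            {
               found = i;
               break search;   // leaves the outer loop, not just the inner one
            }
         }
      }
      return found;
   }

   public static void main(String[] args)
   {
      double[][] table = { { 1, 2 }, { 3, -4 }, { -5, 6 } };
      System.out.println(firstRowWithNegative(table));   // prints 1
   }
}
```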

Finally the return statement can be used to return from a method call. If the method's return type is anything but void the return statement must specify a return value.

Mathematical Functions

Java contains many common mathematical functions as part of the built-in class java.lang.Math. Unfortunately you must always prefix these methods with the class name (Math), making complicated expressions a bit unwieldy.

The Math class contains two useful constants, Math.E and Math.PI, as well as many methods including Math.pow(double,double) (raise to power), Math.sqrt(double), Math.log(double) (natural log) and the trigonometric functions Math.sin(double), Math.cos(double) etc.

The Math class also contains a simple random number generator, Math.random(), which returns a random number greater than or equal to 0 and less than 1. For a more complete random number generator, which also allows setting seeds and generating normally distributed random numbers, see the class java.util.Random.

Example:

double r = Math.random();
double phi = Math.random()*Math.PI*2;
double x = r*Math.sin(phi);
double y = Math.sqrt(r*r - x*x);
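
For reproducible results the java.util.Random class can be given an explicit seed. The sketch below (the gaussians helper is made up for this example) draws normally distributed numbers with nextGaussian and shows that the same seed yields the same sequence. The class name is written out in full here; an import statement would work equally well.

```java
class RandomSketch
{
   // Returns n normally distributed numbers (mean 0, standard deviation 1)
   // from a generator started with the given seed.
   static double[] gaussians(long seed, int n)
   {
      java.util.Random random = new java.util.Random(seed);
      double[] values = new double[n];
      for (int i = 0; i < n; i++)
         values[i] = random.nextGaussian();
      return values;
   }

   public static void main(String[] args)
   {
      double[] first = gaussians(12345L, 3);
      double[] second = gaussians(12345L, 3);
      for (int i = 0; i < first.length; i++)
      {
         // Same seed, same sequence: the comparison is true for every entry.
         System.out.println(first[i] + " " + (first[i] == second[i]));
      }
   }
}
```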

Strings

A sequence of character data is called a string and is implemented in the Java environment by the String class. The Java language contains a few special shortcuts for handling Strings, for example any occurrence of a quoted string constant will automatically be converted to a String, and the concatenation operator (+) can be used to concatenate two Strings together to produce a new String. Finally the concatenation operator (+) can be used to concatenate a String with any other object, in which case the object is first converted to a String (using the object's toString method).

String world = "World";
System.out.println("Hello "+world);
System.out.println("The time is now "+new java.util.Date());

String objects are immutable--that is, they cannot be changed once they've been created. Java provides a different class, StringBuffer, which you can use to create and manipulate character data on the fly.
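
A brief sketch of both points: methods such as toUpperCase return a new String rather than changing the original, while a StringBuffer can be appended to in place. The join helper is a made-up example.

```java
class StringBufferSketch
{
   // Builds "a, b, c" style text by appending to a single StringBuffer.
   static String join(String[] parts, String separator)
   {
      StringBuffer buffer = new StringBuffer();
      for (int i = 0; i < parts.length; i++)
      {
         if (i > 0) buffer.append(separator);
         buffer.append(parts[i]);
      }
      return buffer.toString();
   }

   public static void main(String[] args)
   {
      String s = "abc";
      String upper = s.toUpperCase();          // s itself is unchanged
      System.out.println(s + " " + upper);     // prints "abc ABC"
      System.out.println(join(new String[] { "px", "py", "pz" }, ", "));   // prints "px, py, pz"
   }
}
```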

Arrays

Arrays in Java are handled by array objects. In common with other objects they are created using the new operator, although the syntax is slightly modified. The statement:

int[] arrayOfInts = new int[100];

creates an array containing 100 ints, and assigns a reference to the array to the variable arrayOfInts. As in C and C++, array elements are numbered from 0, and are accessed as follows:

for (int i=0; i<arrayOfInts.length; i++) 
   arrayOfInts[i] = 0; 

The member variable length can be used to access the size of an array. As well as arrays of all the built-in types, Java also allows arrays of reference types, such as:

String[] arrayOfStrings = new String[10];
for (int i = 0; i < arrayOfStrings.length; i++) {
    arrayOfStrings[i] = new String("Hello " + i);
}

Like C and C++ Java does not directly support multi-dimensional arrays, but it does support arrays of arrays which give much the same functionality:

double[][] arrayOfArrayOfDoubles = new double[10][3];
for (int i=0; i<10; i++)
   for (int j=0; j<3; j++)
      arrayOfArrayOfDoubles[i][j] = 0;
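
Because each row of an array of arrays is an array object in its own right, the rows need not all have the same length. The sketch below (an invented example) allocates the rows one at a time:

```java
class RaggedSketch
{
   // Returns an array of n rows, where row i has i+1 elements.
   static double[][] triangle(int n)
   {
      double[][] rows = new double[n][];     // the rows themselves are not yet allocated
      for (int i = 0; i < n; i++)
         rows[i] = new double[i + 1];        // elements are initialized to 0
      return rows;
   }

   public static void main(String[] args)
   {
      double[][] t = triangle(3);
      for (int i = 0; i < t.length; i++)
         System.out.println("row " + i + " has length " + t[i].length);
   }
}
```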

Exceptions

The Java language has built-in support for handling errors, using a mechanism known as exception handling. To generate an exception in your code use the throw statement. For example:

if (x < 0) throw new IllegalArgumentException("x must be >= 0");

In Java exceptions are represented by instances of classes which extend Throwable. Exceptions fall into two categories, checked exceptions and unchecked exceptions. If a method throws a checked exception it must explicitly declare that the exception can be thrown, using a throws clause. Declaring unchecked exceptions using a throws clause is optional. For example:

public double MySqrt(double x) throws IllegalArgumentException
{
    if (x < 0) throw new IllegalArgumentException("x must be >= 0");
    return Math.sqrt(x);
}

(Note that the Math.sqrt() method does not throw an exception when given a negative number, instead it returns a special double value, Double.NaN, which represents an undefined number. This is the normal behavior for floating point operations in Java).

Unchecked exceptions are those that extend either Error or RuntimeException. All other exceptions are checked. In general checked exceptions are used for errors that could have been expected to happen in a well defined place (for example IO errors when reading a file), whereas unchecked exceptions are used for errors that could happen almost anywhere (for example running out of memory). These definitions are however rather vague, so it is often a matter of taste and style whether to use a checked or unchecked exception.
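
To illustrate the difference, the sketch below defines a checked exception of its own. BadInputException and parseEnergy are hypothetical names invented for this example. Because BadInputException extends Exception (and not RuntimeException), the compiler insists that parseEnergy declares it and that the caller deals with it.

```java
class CheckedSketch
{
   // The throws clause is mandatory here, since BadInputException is checked.
   static double parseEnergy(String text) throws BadInputException
   {
      try
      {
         return Double.parseDouble(text);
      }
      catch (NumberFormatException e)   // an unchecked exception from the library
      {
         throw new BadInputException("not a number: " + text);
      }
   }

   public static void main(String[] args)
   {
      try
      {
         System.out.println(parseEnergy("12.5"));   // prints 12.5
         System.out.println(parseEnergy("abc"));    // throws, so this println never runs
      }
      catch (BadInputException e)
      {
         System.out.println("Rejected: " + e.getMessage());   // Rejected: not a number: abc
      }
   }
}

// A checked exception: it extends Exception, not RuntimeException or Error.
class BadInputException extends Exception
{
   BadInputException(String message) { super(message); }
}
```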

You can deal with exceptions in your programs using a try ... catch statement. For example:

try
{
   for (int i=0; i<errors.length; i++)
   {
      errors[i] = MySqrt(errors[i]);
   }
}
catch (IllegalArgumentException x)
{
   System.err.println("Error calculating errors");
   x.printStackTrace();
}

If a call to MySqrt results in an exception being thrown, the loop will immediately be terminated and the body of the catch clause executed. If an exception is thrown inside a routine and is not caught using a try ... catch statement it is "bubbled up" to the caller of that method, and the caller of the caller etc., until either a catch clause is found, or the top level routine is reached in which case the exception is reported by Java, and the program terminated.
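
The "bubbling up" can be sketched with three methods, of which only the outermost contains a catch clause. The method names are invented for this example.

```java
class PropagationSketch
{
   static double inverse(double x)
   {
      if (x == 0) throw new IllegalArgumentException("x must be non-zero");
      return 1 / x;
   }

   // No try ... catch here: an exception from inverse passes straight through.
   static double middle(double x) { return 2 * inverse(x); }

   static String outer(double x)
   {
      try
      {
         return "ok: " + middle(x);
      }
      catch (IllegalArgumentException e)
      {
         // The exception thrown two levels down ends up here.
         return "caught: " + e.getMessage();
      }
   }

   public static void main(String[] args)
   {
      System.out.println(outer(4));   // prints "ok: 0.5"
      System.out.println(outer(0));   // prints "caught: x must be non-zero"
   }
}
```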

More Information

  1. Sun's Java Tutorial provides a much more complete (and generally better), as well as considerably longer, introduction to Java.

Last Modified: January 14, 2004