Java 101: Trash talk, Part 2

how-to
Jan 04, 2002 | 24 mins
Core Java, Development Tools, Java SE

The Reference Objects API allows programs to interact with the garbage collector

Java’s garbage collection features tend to confuse new developers. I wrote this two-part series on garbage collection to dispel that confusion. Part 1 introduced you to garbage collection, explored various garbage collection algorithms, showed you how to request that Java run the garbage collector, explained the purpose behind finalization, and mentioned resurrection — a technique for bringing objects back from the dead. Part 2 explores the Reference Objects API.

As you learned in Part 1, Java’s garbage collector destroys objects. Although you typically write programs that ignore the garbage collector, situations arise in which a program needs to interact with the garbage collector.

For example, suppose you plan to write a Java-based Web browser program similar to Netscape Navigator or Internet Explorer. When it comes to displaying Webpage images, your first thought is for the browser to always download all images before displaying them to a user. However, you soon realize that the user will spend too much time waiting for images to download. Although a user might be willing to wait when visiting a Webpage for the first time, the user would probably not tolerate waiting each time he revisits the Webpage. To decrease the user’s wait time, you can design the browser program to support an image cache, which allows the browser to save each image after the download completes in an object on the object heap. The next time the user visits the Webpage in the same browsing session, the browser can retrieve the corresponding image objects from the object heap and quickly display those images to the user.

Note
To keep the discussion simple, I don’t discuss a second-level disk-based image cache mechanism. Browsers like Netscape Navigator and Internet Explorer use this mechanism.

The image cache idea has a problem: insufficient heap memory. When the user visits numerous Webpages with different sized images, the browser must store all images in heap memory. At some point, free heap memory will drop to a level where no more room exists for images. What does the browser do? By taking advantage of the Reference Objects API, the browser allows the garbage collector to remove images when the JVM needs additional heap space. In turn, when the browser needs to redraw an image, it asks the corresponding reference object whether that image is still in memory. If the image is not in memory, the browser must first reload that image. The browser can then restore that image to the image cache, although the garbage collector might need to remove another image from the cache to make heap memory available for the reloaded image, assuming the object heap’s free memory is low.

In addition to teaching you how to use the Reference Objects API to manage an image cache, this article teaches you how to use that API to obtain notification when significant objects are no longer strongly reachable and perform post-finalization cleanup. But first, we must investigate object states and the Reference Objects API class hierarchy.

Object states and the Reference Objects API class hierarchy

Prior to the release of Java 2 Platform, Standard Edition (J2SE) 1.2, an object could be in only one of three states: reachable, resurrectable, or unreachable:

  • An object is reachable if the garbage collector can trace a path from a root-set variable to that object. When the JVM creates an object, that object stays reachable as long as a program maintains at least one reference to it. Assigning null to an object reference variable reduces the number of references to the object by one. For example:

    Employee e = new Employee ();
    Employee e2 = e;
    e = null;

    In the above code fragment, the Employee object is initially reachable through e. After the second statement, it is reachable through both e and e2. After null is assigned to e, the object is reachable only through e2.

  • An object is resurrectable if it is currently unreachable through root-set variables, but has the potential to be made reachable through a garbage collector call to that object’s overridden finalize() method. Because finalize()’s code can make the object reachable, the garbage collector must retrace all paths from root-set variables in an attempt to locate the object after finalize() returns. If the garbage collector cannot find a path to the object, it makes the object unreachable. If a path does exist, the garbage collector makes the object reachable. If the object is made reachable, the garbage collector will not run its finalize() method a second time when no more references to that object exist. Instead, the garbage collector makes that object unreachable.
  • An object is unreachable when no path from root-set variables to that object exists and when the garbage collector cannot call that object’s finalize() method. The garbage collector is free to reclaim the object’s memory from the heap.

With the release of J2SE 1.2, three new object states representing progressively weaker forms of reachability became available to Java: softly reachable, weakly reachable, and phantomly reachable. Subsequent sections explore each of those states.

Note
Also with the J2SE 1.2 release, the state previously known as reachable became known as strongly reachable. For example, in code fragment Employee e = new Employee ();, the Employee object is strongly reachable through root-set variable e (assuming e is a local variable).

The new object states became available to Java through reference objects. A reference object encapsulates a reference to another object, a referent. Furthermore, the reference object is a class instance that subclasses the abstract Reference class in the Reference Objects API — a class collection in package java.lang.ref. Figure 1 presents a hierarchy of reference object classes that constitute much of the Reference Objects API.

Figure 1. A hierarchy of reference object classes composes much of the Reference Objects API

Figure 1’s class hierarchy shows a class named Reference at the top and SoftReference, WeakReference, and PhantomReference classes below. The abstract Reference class defines those operations common to the other three classes. Those operations include:

  • Clear the current reference object
  • Add the current reference object to the currently registered reference queue
  • Return the current reference object’s referent
  • Determine if the garbage collector has placed the current reference object on a reference queue
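The four operations in the list above correspond to Reference’s clear(), enqueue(), get(), and isEnqueued() methods. The following is a minimal sketch of those calls, not one of this article’s listings: the class name ReferenceOpsSketch and its REFERENT field are assumptions made for illustration, and the program itself (rather than the garbage collector) clears and enqueues the reference so that the results are repeatable.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// ReferenceOpsSketch is a hypothetical name; the java.lang.ref calls are real.
class ReferenceOpsSketch
{
   // A static field keeps the referent strongly reachable, so the garbage
   // collector cannot clear or enqueue the reference behind our back.
   private static final Object REFERENT = new Object ();

   @SuppressWarnings ("deprecation") // isEnqueued() is deprecated in recent JDKs
   static String run ()
   {
      ReferenceQueue<Object> q = new ReferenceQueue<> ();
      Reference<Object> ref = new WeakReference<> (REFERENT, q);
      StringBuilder sb = new StringBuilder ();
      // Return the referent: the reference has not been cleared yet.
      sb.append (ref.get () == REFERENT).append (' ');
      // Nobody has placed the reference object on the queue yet.
      sb.append (ref.isEnqueued ()).append (' ');
      // Clear the reference object; get() now returns null.
      ref.clear ();
      sb.append (ref.get () == null).append (' ');
      // Add the reference object to its registered queue (true on success).
      sb.append (ref.enqueue ()).append (' ');
      sb.append (ref.isEnqueued ()).append (' ');
      // The queue hands back the same reference object.
      sb.append (q.poll () == ref);
      return sb.toString ();
   }

   public static void main (String [] args)
   {
      System.out.println (run ()); // true false true true true true
   }
}
```

The sketch uses a WeakReference only because Reference is abstract; the same four calls are available on SoftReference and PhantomReference objects.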

The aforementioned operations introduce a reference queue. What are reference queues, and why are they part of the Reference Objects API? I’ll answer both questions during our exploration of soft references.

Soft references

The softly reachable state manifests itself in Java through the SoftReference class. When you initialize a SoftReference object, you store a reference to a referent in that object. The object contains a soft reference to the referent, and the referent is softly reachable if there are no other references, apart from soft references, to that referent. If heap memory is running low, the garbage collector can find the oldest softly reachable objects and clear their soft references. (The garbage collector clears a soft reference directly; a program can do the same by calling SoftReference’s inherited clear() method.) Assuming there are no other references to those referents, the referents enter the resurrectable state (if they contain overridden finalize() methods) or the unreachable state (if they lack overridden finalize() methods). Assuming the referents enter the resurrectable state, the garbage collector calls their finalize() methods. If those methods do not make the referents reachable, the referents become unreachable. The garbage collector can then reclaim their memory.

To create a SoftReference object, pass a reference to a referent in one of two constructors. For example, the following code fragment uses the SoftReference(Object referent) constructor to create a SoftReference object, which encapsulates an Employee referent:

SoftReference sr = new SoftReference (new Employee ());

Figure 2 shows the resulting object structure.

Figure 2. A SoftReference object and its Employee referent

According to Figure 2, the SoftReference object is strongly reachable through root-set variable sr. Also, the Employee object is softly reachable from the soft reference field inside SoftReference.

You often use soft references to implement image and other memory-sensitive caches. You can create an image cache by using the SoftReference and java.awt.Image classes. Image subclass objects allow images to load into memory. As you probably know, images can consume lots of memory, especially if they have large horizontal and vertical pixel dimensions and many colors. If you kept all such images in memory, the object heap would quickly fill up, and your program would grind to a halt. However, if you maintain soft references to Image subclass objects, your program can arrange for the garbage collector to notify you when it clears an Image subclass object’s soft reference and moves it to the resurrectable state — assuming no other references to Image exist. Eventually, assuming the Image subclass object lacks a finalize() method with code that resurrects the image, Image will transition to the unreachable state, and the garbage collector will reclaim its memory.

By calling SoftReference’s inherited get() method, you can determine if an Image subclass object is still softly referenced or if the garbage collector has cleared that reference. get() returns null once the soft reference has been cleared. Given the preceding knowledge, the following code fragment shows how to implement an image cache for a single Image subclass object:

SoftReference sr = null;
// ... Sometime later in a drawing method.
Image im = (sr == null) ? null : (Image) (sr.get());
if (im == null) 
{
    im = getImage (getCodeBase(), "truck1.gif");
    sr = new SoftReference (im);
}
// Draw the image.
// Later, clear the strong reference to the Image subclass object.
// Once that is done (and assuming no other strong reference exists),
// the only reference to the Image subclass object is a soft
// reference. Eventually, when the garbage collector notes that it
// is running out of heap memory, it can clear that soft reference
// (and eventually remove the object).
im = null;

The code fragment’s caching mechanism works as follows: To begin, there is no SoftReference object, so null is assigned to im. Because im contains null, control passes to the getImage() method, which loads truck1.gif. Next, the code creates a SoftReference object. As a result, there is both a strong reference (via im) and a soft reference (via sr) to the Image subclass object. After the code draws the image, null is assigned to im. Now there is only a single soft reference to Image. If the garbage collector notices that free memory is low, it can clear Image’s soft reference in the SoftReference object that sr strongly references.

Suppose the garbage collector clears the soft reference. The next time the code fragment must draw the image, it discovers that sr is not null and calls sr.get () to retrieve a strong reference to Image, the referent. Because the soft reference has been cleared, get() returns null, and null is assigned to im. We can now reload the image by calling getImage(), a relatively slow process. However, if the garbage collector did not clear the soft reference, sr.get() would return a reference to the Image subclass object. Then we could immediately draw that image without first loading it. And that is how a soft reference allows us to cache an image.
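The same check-then-reload pattern extends from one image to many. The sketch below is one possible generalization, not code from this article: SoftCacheSketch, its loader function (standing in for getImage()), the loads counter, and simulateClear() are all hypothetical names. Because the garbage collector clears soft references on its own schedule, simulateClear() calls clear() to imitate that event on demand.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A hypothetical multi-entry version of the single-image cache.
class SoftCacheSketch<K, V>
{
   private final Map<K, SoftReference<V>> map = new HashMap<> ();
   private final Function<K, V> loader; // stands in for getImage()
   int loads;                           // counts slow loads, for illustration

   SoftCacheSketch (Function<K, V> loader)
   {
      this.loader = loader;
   }

   V get (K key)
   {
      SoftReference<V> sr = map.get (key);
      // Same test as in the fragment: no reference object yet, or cleared.
      V value = (sr == null) ? null : sr.get ();
      if (value == null)
      {
         value = loader.apply (key); // the relatively slow path
         loads++;
         map.put (key, new SoftReference<> (value));
      }
      return value; // the caller now holds a strong reference
   }

   // Imitates the garbage collector clearing one entry's soft reference.
   void simulateClear (K key)
   {
      SoftReference<V> sr = map.get (key);
      if (sr != null)
         sr.clear ();
   }
}
```

A caller might write SoftCacheSketch<String, byte[]> cache = new SoftCacheSketch<> (name -> new byte [16]); and then call cache.get ("truck1.gif") each time it needs the data. Note that this sketch never removes a cleared SoftReference object from the map until the key is requested again.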

The previous code fragment called sr.get() to learn whether or not the garbage collector cleared the sr-referenced object’s internal soft reference to an Image subclass object. However, a program can also request notification by using a reference queue — a data structure that holds references to Reference subclass objects. Under garbage collector (or even program) control, Reference subclass object references arrive at the end of the reference queue and exit from that queue’s front. As a reference exits from the front, the following reference moves to the front, and other references move forward. Think of a reference queue as a line of people waiting to see a bank teller.

To use a reference queue, a program first creates an object from the ReferenceQueue class (located in the java.lang.ref package). The program then calls the SoftReference(Object referent, ReferenceQueue q) constructor to associate a SoftReference object with the ReferenceQueue object that q references, as the following code fragment demonstrates:

ReferenceQueue q = new ReferenceQueue ();
SoftReference sr = new SoftReference (new Employee (), q);

After the garbage collector clears the referent’s soft reference, it appends a reference to the SoftReference object to the reference queue that q references. The garbage collector performs that enqueuing directly, behind the scenes; a program can enqueue a reference object itself by calling Reference’s enqueue() method. Your program can either poll the reference queue, by calling ReferenceQueue’s poll() method, or block, by calling ReferenceQueue’s no-argument remove() method, until the reference arrives on the queue. At that point, your program can stop polling, or it automatically unblocks. Because the poll() and remove() methods return a SoftReference object reference (through the Reference return types), you discover which soft reference cleared; you can then modify your cache as appropriate.
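To make “modify your cache as appropriate” concrete, here is a hedged sketch of one common arrangement: a SoftReference subclass that remembers its cache key, so that a reference returned by poll() identifies the entry to remove. QueuedSoftCacheSketch, KeyedRef, purge(), and simulateCollection() are hypothetical names. The last method calls clear() and enqueue() from program code purely to imitate what the garbage collector would do when memory runs low.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class QueuedSoftCacheSketch
{
   // A SoftReference that also records which cache entry it belongs to.
   static class KeyedRef extends SoftReference<Object>
   {
      final String key;

      KeyedRef (String key, Object referent, ReferenceQueue<Object> q)
      {
         super (referent, q);
         this.key = key;
      }
   }

   private final ReferenceQueue<Object> q = new ReferenceQueue<> ();
   private final Map<String, KeyedRef> map = new HashMap<> ();

   void put (String key, Object value)
   {
      map.put (key, new KeyedRef (key, value, q));
   }

   int size ()
   {
      return map.size ();
   }

   // Drain the queue with poll() and drop the entries whose soft
   // references have been cleared. Returns the number of entries removed.
   int purge ()
   {
      int removed = 0;
      Reference<?> r;
      while ((r = q.poll ()) != null)
      {
         KeyedRef kr = (KeyedRef) r;
         // Remove only if the map still holds this exact reference object.
         if (map.remove (kr.key, kr))
            removed++;
      }
      return removed;
   }

   // Imitates the garbage collector: clear the reference, then enqueue it.
   void simulateCollection (String key)
   {
      KeyedRef kr = map.get (key);
      if (kr != null)
      {
         kr.clear ();
         kr.enqueue ();
      }
   }
}
```

Calling purge() at the start of each cache operation is one simple policy; a dedicated thread blocked in remove() is another.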

To demonstrate reference queues and soft references, I have created a SoftReferenceDemo application, which chooses to call poll() instead of remove() to increase the likelihood of garbage collection. If the application were to call remove(), the garbage collector might not run, because the application doesn’t constantly create objects and nullify their references. Hence, the application would remain blocked. Examine SoftReferenceDemo’s source code in Listing 1:

Listing 1. SoftReferenceDemo.java

// SoftReferenceDemo.java
import java.lang.ref.*;
class Employee
{
   private String name;
   Employee (String name)
   {
      this.name = name;
   }
   public String toString ()
   {
      return name;
   }
}
class SoftReferenceDemo
{
   public static void main (String [] args)
   {
      // Create two Employee objects that are strongly reachable from e1
      // and e2.
      Employee e1 = new Employee ("John Doe");
      Employee e2 = new Employee ("Jane Doe");
      // Create a ReferenceQueue object that is strongly reachable from q.
      ReferenceQueue q = new ReferenceQueue ();
      // Create a SoftReference array with room for two references to
      // SoftReference objects. The array is strongly reachable from sr.
      SoftReference [] sr = new SoftReference [2];
      // Assign a SoftReference object to each array element. That object
      // is strongly reachable from that element. Each SoftReference object
      // encapsulates an Employee object that is referenced by e1 or e2 (so
      // the Employee object is softly reachable from the SoftReference
      // object), and associates the ReferenceQueue object, referenced by
      // q, with the SoftReference object.
      sr [0] = new SoftReference (e1, q);
      sr [1] = new SoftReference (e2, q);
      // Remove the only strong references to the Employee objects.
      e1 = null;
      e2 = null;
      // Poll reference queue until SoftReference object arrives.
      Reference r;
      while ((r = q.poll ()) == null)
      {
         System.out.println ("Polling reference queue");
         // Suggest that the garbage collector should run.
         System.gc ();
      }
      // Identify the SoftReference object whose soft reference was
      // cleared, and print an appropriate message.
      if (r == sr [0])
          System.out.println ("John Doe Employee object's soft reference " +
                              "cleared");
      else
          System.out.println ("Jane Doe Employee object's soft reference " +
                              "cleared");
      // Attempt to retrieve a reference to the Employee object.
 
      Employee e = (Employee) r.get ();
      // e will always be null because soft references are cleared before
      // references to their containing SoftReference objects are queued
      // onto a reference queue.
      if (e != null)
          System.out.println (e.toString ());
   }
}

When run, SoftReferenceDemo might poll the reference queue for a short time or a long time. The following output shows one SoftReferenceDemo invocation:

Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Polling reference queue
Jane Doe Employee object's soft reference cleared

SoftReferenceDemo calls System.gc (); (in a loop) to encourage the garbage collector to run. Eventually, at least on my platform, the garbage collector runs, and at least one soft reference clears. After that soft reference clears, a reference to the enclosing SoftReference object is appended to the reference queue. By comparing the returned reference with each sr array element reference, the program code can determine which Employee referent had its soft reference cleared. We can now recreate the Employee object, if so desired.
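Between busy polling with poll() and blocking indefinitely with the no-argument remove(), ReferenceQueue also offers remove(long timeout), which waits up to the given number of milliseconds and returns null if nothing arrives. The following sketch shows that middle ground. TimedRemoveSketch and its REFERENT field are hypothetical names, the 50-millisecond timeout is arbitrary, and the program enqueues the reference itself as a stand-in for the garbage collector.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

class TimedRemoveSketch
{
   // Held in a static field so the referent stays strongly reachable.
   private static final Object REFERENT = new Object ();

   static String run ()
   {
      ReferenceQueue<Object> q = new ReferenceQueue<> ();
      SoftReference<Object> sr = new SoftReference<> (REFERENT, q);
      try
      {
         // Nothing is queued, so remove() gives up after about 50 ms.
         Reference<?> r = q.remove (50);
         String before = (r == null) ? "timed out" : "arrived";
         // Stand-in for the garbage collector enqueuing the reference.
         sr.enqueue ();
         // This time the reference is already waiting on the queue.
         r = q.remove (50);
         String after = (r == sr) ? "arrived" : "timed out";
         return before + ", " + after;
      }
      catch (InterruptedException ex)
      {
         Thread.currentThread ().interrupt (); // preserve the interrupt
         return "interrupted";
      }
   }

   public static void main (String [] args)
   {
      System.out.println (run ()); // timed out, arrived
   }
}
```

A loop around remove(timeout) lets a program do other work, or suggest another collection, each time the wait expires.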

Weak references

The weakly reachable state manifests itself in Java through the WeakReference class. When you initialize a WeakReference object, you store a referent’s reference in that object. The object contains a weak reference to the referent, and the referent is weakly reachable if there are no other references, apart from weak references, to that referent. When the garbage collector runs, it locates weakly reachable objects and clears their weak references, whether or not heap memory is running low. (A program can clear a weak reference itself by calling WeakReference’s inherited clear() method.) Assuming no other references point to those referents, the referents enter either the resurrectable state or the unreachable state. Assuming the referents enter the resurrectable state, the garbage collector calls their finalize() methods. If those methods do not make the referents reachable, the referents become unreachable, and the garbage collector can reclaim their memory.

Note
The primary difference between soft references and weak references is that the garbage collector might clear a soft reference but always clears a weak reference.

To create a WeakReference object, pass a reference to a referent in one of two constructors. For example, the following code fragment creates a ReferenceQueue object and uses the WeakReference(Object referent, ReferenceQueue q) constructor to create a WeakReference object (that encapsulates a Vehicle referent) and associate WeakReference with the reference queue:

ReferenceQueue q = new ReferenceQueue ();
WeakReference wr = new WeakReference (new Vehicle (), q);

Use weak references to obtain notification when significant objects are no longer strongly reachable. For example, suppose you create an application that simulates a company. That application periodically creates Employee objects. For each object, the application also creates an employee-specific Benefits object. During the simulation, employees eventually retire and their Employee objects are made eligible for garbage collection. Because it would prove detrimental for a Benefits object to hang around after its associated Employee object is garbage collected, the program should notice when Employee is no longer strongly reachable.

To obtain that notification, your code first creates WeakReference and ReferenceQueue objects. The code then passes Employee and ReferenceQueue object references to the WeakReference constructor. Next, the program stores the WeakReference object and a newly created Benefits object in a hash table data structure. That data structure forms an association between the WeakReference object (known as the key, because it identifies the hash table entry) and the Benefits object (known as that entry’s value). The program can enter a polling loop where it keeps checking the reference queue to find out when the garbage collector clears the weak reference to the Employee object. Once that happens, the garbage collector places a reference to the WeakReference object on the reference queue. The program can then use that reference to remove the WeakReference/Benefits entry from the hash table. To see how the program accomplishes that task, check out Listing 2; for simplicity, it uses a String object in place of Employee and an Object object in place of Benefits:

Listing 2. WeakReferenceDemo.java

// WeakReferenceDemo.java
import java.lang.ref.*;
import java.util.*;
class WeakReferenceDemo
{
   public static void main (String [] args)
   {
      // Create a String object that is strongly reachable from key.
      String key = new String ("key");
      /*
         Note: For this program, you cannot say String key = "key";. You
               cannot do that because (by itself), "key" is strongly 
               referenced from an internal constant pool data structure
               (that I will discuss in a future article). There is no
               way for the program to nullify that strong reference. As
               a result, that object will never be garbage collected,
               and the polling loop will be infinite.
      */
      // Create a ReferenceQueue object that is strongly reachable from q.
      ReferenceQueue q = new ReferenceQueue ();
      // Create a WeakReference object that is strongly reachable from wr.
      // The WeakReference object encapsulates the String object that is
      // referenced by key (so the String object is weakly-reachable from
      // the WeakReference object), and associates the ReferenceQueue
      // object, referenced by q, with the WeakReference object.
      WeakReference wr = new WeakReference (key, q);
      // Create an Object object that is strongly reachable from value.
      Object value = new Object ();
      // Create a Hashtable object that is strongly reachable from ht.
      Hashtable ht = new Hashtable ();
      // Place the WeakReference and Object objects in the hash table.
      ht.put (wr, value);
      // Remove the only strong reference to the String object.
      key = null;
      // Poll reference queue until WeakReference object arrives.
      Reference r;
      while ((r = q.poll ()) == null)
      {
         System.out.println ("Polling reference queue");
         // Suggest that the garbage collector should run.
         System.gc ();
      }
      // Using strong reference to the Reference object, remove the entry
      // from the Hashtable where the WeakReference object serves as that
      // entry's key.
      value = ht.remove (r);
      // Remove the strong reference to the Object object, so that object
      // is eligible for garbage collection. Although not necessary in this
      // program, because we are about to exit, imagine a continuously-
      // running program and that this code is in some kind of long-lasting
      // loop.
      value = null;
   }
}

Unlike SoftReferenceDemo, WeakReferenceDemo doesn’t spend much time polling. After one call to System.gc ();, the garbage collector clears the weak reference to the String referent in the WeakReference object that wr references. Basically, WeakReferenceDemo works as follows:

  1. WeakReferenceDemo creates a String object that serves as a key. You must create that String object with a String constructor instead of simply assigning a String literal to key. The garbage collector will probably never collect String objects that you (apparently) create by assigning string literals to String reference variables. (A future article exploring strings will explain why.)
  2. Following the String’s creation, WeakReferenceDemo creates a ReferenceQueue object and a WeakReference object that encapsulates String, which becomes the referent. We now have strong (via key) and weak (via the weak reference in the WeakReference object that wr references) references to String.
  3. WeakReferenceDemo creates an Object value object that eventually associates with the String key object. A Hashtable object is created, and the table stores an entry consisting of the WeakReference and Object objects. (You will learn about the Hashtable data structure class in a future article.)
  4. WeakReferenceDemo nullifies the String object’s key reference, so the only String reference is the weak reference inside the WeakReference object.
  5. A polling loop allows us to wait for the garbage collector to clear the weak reference and place a strong reference to the WeakReference object on the reference queue. On my platform, only one System.gc (); call is necessary to request the garbage collector to run. The first time it runs, all weak references clear, and the next call to q.poll () returns a reference to WeakReference.
  6. After the loop, you can simply remove the WeakReference/Object entry from the hash table and nullify the Object’s reference in value.

Note
To save yourself the trouble of manually removing entries from a hash table, Java supplies the WeakHashMap class. Investigate that class in the SDK documentation and rewrite WeakReferenceDemo to use that class.
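As a starting point for that exercise, here is a hedged sketch of the Employee/Benefits scenario built on java.util.WeakHashMap rather than a full rewrite of Listing 2. WeakHashMapSketch, hire(), planOf(), and the plan field are hypothetical names. WeakHashMap holds its keys through weak references and discards an entry some time after its key is no longer strongly reachable; the sketch only demonstrates the deterministic part, the lookup while the key is still strongly referenced.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakHashMapSketch
{
   static final class Employee
   {
      final String name;

      Employee (String name)
      {
         this.name = name;
      }
   }

   static final class Benefits
   {
      final String plan;

      // Deliberately holds no reference to the Employee: a value that
      // strongly references its own key would keep the entry alive.
      Benefits (String plan)
      {
         this.plan = plan;
      }
   }

   // Keys (Employee objects) are held weakly; values are held strongly.
   private final Map<Employee, Benefits> table = new WeakHashMap<> ();

   void hire (Employee e, String plan)
   {
      table.put (e, new Benefits (plan));
   }

   // Returns the plan name, or null if there is no entry for e.
   String planOf (Employee e)
   {
      Benefits b = table.get (e);
      return (b == null) ? null : b.plan;
   }
}
```

Once the program drops its last strong reference to an Employee, the map is free to discard the matching Benefits entry after a later garbage collection, with no polling loop in the program’s own code.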

Phantom references

The phantomly reachable state manifests itself in Java through the PhantomReference class. When you initialize a PhantomReference object, you store a referent’s reference in that object. The object contains a phantom reference to the referent, and the referent is phantomly reachable if there are no other references, apart from phantom references, to that referent. A phantomly reachable referent’s finalize() method (assuming that method exists) has already run and the garbage collector is about to reclaim the referent’s memory. A program receives notification, by way of a reference queue, when the referent becomes phantomly reachable, and subsequently performs post-finalization cleanup tasks that relate to the referent but do not involve the referent — because there is no way for the program to access the referent.

Unlike SoftReference and WeakReference objects, you must create PhantomReference objects with a reference queue. When the garbage collector discovers that a PhantomReference object’s referent is phantomly reachable, it appends a PhantomReference reference to the associated reference queue. The following code fragment demonstrates the creation of a PhantomReference containing a String referent:

ReferenceQueue q = new ReferenceQueue ();
PhantomReference pr = new PhantomReference (new String ("Test"), q);

A second difference separates PhantomReference objects from SoftReference and WeakReference objects. If either a SoftReference or a WeakReference object has an associated reference queue, the garbage collector places a reference to that object on the reference queue sometime after it clears the referent’s soft or weak reference. In contrast, the garbage collector places a reference to a PhantomReference object onto the queue before the phantom reference clears. Also, the program, not the garbage collector, clears the phantom reference by calling PhantomReference’s clear() method, inherited from the Reference class. Until the phantom reference clears, the garbage collector does not reclaim the referent. Once the phantom reference clears, however, the referent moves from the phantomly reachable state to the unreachable state. Game over for the referent!

Note
By now, you know a call to get() on a SoftReference or WeakReference object, whose reference returns from a call to poll() or remove(), cannot return a referent’s reference. That’s not possible because the garbage collector clears the referent’s soft or weak reference before placing the SoftReference or WeakReference object’s reference on the queue. However, because the garbage collector does not clear the referent’s phantom reference, you would expect get() to return a reference to that referent. Instead, get() returns null to prevent the referent from being resurrected.
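The behavior this note describes is easy to confirm, because PhantomReference’s get() returns null even while the referent is still strongly reachable. The following minimal sketch contrasts it with WeakReference’s get(); PhantomGetSketch and its REFERENT field are hypothetical names.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class PhantomGetSketch
{
   // Held in a static field so the referent is strongly reachable throughout.
   private static final Object REFERENT = new Object ();

   static String run ()
   {
      ReferenceQueue<Object> q = new ReferenceQueue<> ();
      WeakReference<Object> wr = new WeakReference<> (REFERENT, q);
      PhantomReference<Object> pr = new PhantomReference<> (REFERENT, q);
      // A weak reference hands back its referent until it is cleared.
      boolean weakReturnsReferent = wr.get () == REFERENT;
      // A phantom reference never hands back its referent.
      boolean phantomReturnsNull = pr.get () == null;
      return "weak get() returns referent: " + weakReturnsReferent +
             "; phantom get() returns null: " + phantomReturnsNull;
   }

   public static void main (String [] args)
   {
      System.out.println (run ());
   }
}
```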

Listing 3 demonstrates phantom references:

Listing 3. PhantomReferenceDemo.java

// PhantomReferenceDemo.java
import java.lang.ref.*;
class Employee
{
   private String name;
   Employee (String name)
   {
      this.name = name;
   }
   public void finalize () throws Throwable
   {
      System.out.println ("finalizing " + name);
      super.finalize ();
   }
}
class PhantomReferenceDemo
{
   public static void main (String [] args)
   {
      // Create an Employee object that is strongly reachable from e.
      Employee e = new Employee ("John Doe");
      // Create a ReferenceQueue object that is strongly reachable from q.
      ReferenceQueue q = new ReferenceQueue ();
      // Create a PhantomReference object that is strongly reachable from
      // pr. The PhantomReference object encapsulates the Employee object
      // (so the Employee object is phantomly reachable from the
      // PhantomReference object), and associates the ReferenceQueue object,
      // referenced by q, with the PhantomReference object.
      PhantomReference pr = new PhantomReference (e, q);
      // Remove the only strong reference to the Employee object.
      e = null;
      // Poll reference queue until PhantomReference object arrives.
      Reference r;
      while ((r = q.poll ()) == null)
      {
         System.out.println ("Polling reference queue");
         // Suggest that the garbage collector should run.
         System.gc ();
      }
      System.out.println ("Employee referent in phantom-reachable state.");
      // Clear the PhantomReference object's phantom reference, so that
      // the Employee referent enters the unreachable state.
      pr.clear ();
      // Clear the strong reference to the PhantomReference object, so the
      // PhantomReference object is eligible for garbage collection. (The
      // same could be done for the ReferenceQueue and Reference objects --
      // referenced by q and r, respectively.) Although not necessary in
      // this trivial program, you might consider doing such clearing in a
      // long-running loop, so that objects not needed can be collected.
      pr = null;
   }
}

When run, PhantomReferenceDemo produces output similar to the following:

Polling reference queue
finalizing John Doe
Polling reference queue
Employee referent in phantom-reachable state.

The garbage collector has not yet run when the first Polling reference queue message appears. The first call to System.gc (); causes the JVM to try to run the garbage collector. It runs and executes Employee’s finalize() method, which prints finalizing John Doe. The second Polling reference queue message indicates that a second call is made to System.gc ();. That call causes the garbage collector to move the Employee referent from the resurrectable state to the phantomly reachable state.

A close look at Listing 3 shows a pr.clear (); method call. That method call clears the phantom reference to the Employee referent in the PhantomReference object. That referent now enters the unreachable state, and the garbage collector can reclaim its memory the next time it runs.
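Listing 3 shows the notification but not the cleanup work that would follow it. Because get() cannot return the referent, whatever the cleanup needs must be stored elsewhere, and one option is a PhantomReference subclass. The sketch below assumes that approach: PhantomCleanupSketch, ResourceRef, resourceName, track(), and drain() are hypothetical names, and adding the name to a list stands in for real cleanup such as deleting a temporary file. The call to clear() follows this article’s description; I believe JDK 9 and later clear a phantom reference automatically when enqueuing it, in which case the call is harmless.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PhantomCleanupSketch
{
   // Carries the data that cleanup needs, never the referent itself.
   static class ResourceRef extends PhantomReference<Object>
   {
      final String resourceName;

      ResourceRef (Object referent, ReferenceQueue<Object> q,
                   String resourceName)
      {
         super (referent, q);
         this.resourceName = resourceName;
      }
   }

   private final ReferenceQueue<Object> q = new ReferenceQueue<> ();
   // The reference objects must stay strongly reachable themselves;
   // otherwise they could be collected before they are ever enqueued.
   private final Set<ResourceRef> pending = new HashSet<> ();
   final List<String> released = new ArrayList<> ();

   ResourceRef track (Object owner, String resourceName)
   {
      ResourceRef ref = new ResourceRef (owner, q, resourceName);
      pending.add (ref);
      return ref;
   }

   // Performs post-finalization cleanup for every queued reference.
   // Returns the number of references processed.
   int drain ()
   {
      int n = 0;
      Reference<?> r;
      while ((r = q.poll ()) != null)
      {
         ResourceRef ref = (ResourceRef) r;
         released.add (ref.resourceName); // stand-in for the real cleanup
         ref.clear ();                    // let the referent become unreachable
         pending.remove (ref);            // let the reference object go, too
         n++;
      }
      return n;
   }
}
```

In a real program the garbage collector would enqueue each ResourceRef; to try the sketch without waiting for a collection, call enqueue() on the object that track() returns and then call drain().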

Review

The Reference Objects API gives your programs limited interaction with the garbage collector through the SoftReference, WeakReference, and PhantomReference classes. Objects created from SoftReference contain soft references to their referents. You can use soft references to manage image and other memory-sensitive caches. Objects that you create from WeakReference contain weak references to their referents. You use weak references to obtain notification when significant objects are no longer strongly reachable. Finally, PhantomReference objects contain phantom references to their referents. You can use phantom references to perform post-finalization cleanup on the referents.

I encourage you to email me with any questions you might have involving either this or any previous article’s material. (Please, keep such questions relevant to material discussed in this column’s articles.) Your questions and my answers will appear in the relevant study guides.

In next month’s article, you will learn about nested classes.

Jeff Friesen has been involved with computers for the past 20 years. He holds a degree in computer science and has worked with many computer languages. Jeff has also taught introductory Java programming at the college level. In addition to writing for JavaWorld, he has written his own Java book for beginners — Java 2 By Example, Second Edition (Que Publishing, 2001; ISBN: 0789725932) — and helped write Special Edition Using Java 2 Platform (Que Publishing, 2001; ISBN: 0789720183). Jeff goes by the nickname Java Jeff (or JavaJeff). To see what he’s working on, check out his Website at http://www.javajeff.com.