Apache JServ: Faster, Safer Servlets

Back in my November 1996 column "Server Scripts: Can They Keep Up?" I discussed FastCGI, a protocol that speeds up CGI scripts tremendously by running them continuously in a separate process. More recently, I experimented briefly with Java servlets, an API for extending the Web server's capabilities with small modules of Java code (see "Java Servlets: Back to the Future," April 1998). But what about combining the flexibility of both? Well, I recently had a look at Apache JServ, a servlet engine designed to work with the Apache Web server for both UNIX and Windows NT. It combines the best features of FastCGI with the Java servlet API, and the results are compelling.

With few exceptions, servlet engines take one of two approaches. Either they embed a Java virtual machine (JVM) in a Web server written in C or another compiled language, or they write the whole Web server in Java. The approaches are similar in that the servlet engine and the Web server are combined into a single process, and both have disadvantages. When the Web server itself is written in Java, performance on static pages suffers because interpreted Java is not as fast as compiled C. And when servlets written by several authors run in the same Java process, there's a risk that they'll interfere with one another by calling each other's public methods or changing each other's instance variables.

Apache JServ takes a third approach by splitting the job into two parts. The first part, the servlet engine, runs as an independent process. It listens for incoming TCP-based requests from the Apache server using its own communications protocol, loads and runs the requested servlet, and sends the results back to the Web server for relaying to the client. The second part of the system is a small set of routines linked to Apache as a module. These routines know how to translate a requested URL into the name and address of a servlet engine, forward the request to the engine, and read the results that are returned. In effect, Apache is acting as a proxy server for the servlet engine. Apache continues to do what it does well, serving static pages and running CGI scripts, while the servlet engine handles the tasks it's best qualified for, running servlets.

Because Apache JServ decouples the Web server from the servlet engine, it offers a Webmaster the ability to create interesting Web architectures. There's no reason that you can't have several servlet engines running simultaneously and map different parts of the document tree onto each one. This lets you avoid the risk of servlets written by different authors interfering with one another. By running different servlet engines for each author (or group of authors), you keep the servlets separated. In fact, you can run the engines under different user and/or group IDs, taking full advantage of the protection offered by the operating system's account privileges.

Another interesting option made possible by Apache JServ's design is the ability of the Web server to talk to remote servlet engines as easily as it can talk to local ones. Instances of the servlet engine can be run on other machines on the network and the Web server side of Apache JServ can map portions of the URL hierarchy onto these remote engines. This gives you the flexibility to spread the load across multiple machines, or to build a "virtual servlet tree" from multiple servers.
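To make the idea of mapping parts of the URL hierarchy onto different engines concrete, here is a minimal sketch of a longest-prefix lookup from URL path to engine address. The class name EngineRouter, the paths, the host names, and the port numbers are all hypothetical; this is not Apache JServ's configuration syntax or its module code, just an illustration of the mapping step.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps URL prefixes to servlet engine addresses.
// Names, paths, hosts, and ports are illustrative assumptions only.
class EngineRouter {
    private final Map<String, String> mounts = new LinkedHashMap<String, String>();

    // Associate a URL prefix with the "host:port" of a servlet engine.
    public void mount(String prefix, String engineAddress) {
        mounts.put(prefix, engineAddress);
    }

    // Return the engine for the longest matching prefix, or null if the
    // path is unmapped (the Web server would then handle it itself).
    public String resolve(String path) {
        String best = null;
        int bestLen = -1;
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            String prefix = e.getKey();
            if (matches(path, prefix) && prefix.length() > bestLen) {
                best = e.getValue();
                bestLen = prefix.length();
            }
        }
        return best;
    }

    // A prefix matches only on whole path segments, so "/servlets/alice"
    // does not capture "/servlets/alicex".
    private static boolean matches(String path, String prefix) {
        String dir = prefix.endsWith("/") ? prefix : prefix + "/";
        return path.equals(prefix) || path.startsWith(dir);
    }

    public static void main(String[] args) {
        EngineRouter router = new EngineRouter();
        router.mount("/servlets/alice", "localhost:8007");           // local engine
        router.mount("/servlets/bob", "localhost:8008");             // second local engine
        router.mount("/servlets/shared", "engine2.example.com:8007"); // remote engine
        System.out.println(router.resolve("/servlets/bob/Zoo"));
        System.out.println(router.resolve("/index.html"));
    }
}
```

Giving each author a separate prefix and engine is what keeps their servlets in separate processes; pointing a prefix at another host is what spreads the load.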

Apache JServ supports the current JavaSoft Servlet 2.0 API, as well as all the neat features of servlets, including the ability to dynamically reload class files when they've changed, to parse server-side include (.jhtml) files, and to chain a document from one servlet to another. Like Apache, it's an Open Source project run by a group of volunteers, so it's completely free; it's available from the Java Apache Project. The UNIX version is available in source-code form, and the Windows version is available as both source and binaries.

Giving It a Whirl

At the time this was written (March 1999), Apache JServ was in a "pre-beta 1.0" state. Hence some assembly was required in order to build it for my Linux system. The main things missing from the package were the GNU autoconfigure tools; however, a note on the download page quickly led me to the right FTP archive sites. Apache JServ also requires a recent version of Apache, JDK 1.1, and the current Java Servlet Development Kit (JSDK) from Sun. After getting all the required software in place, compiling and installing the server was just a matter of stepping through the instructions.

To test Apache JServ, I resurrected the code I had used to test various Java servlet engines in my previous servlet column (see "Speedy Server Scripts," September 1997). Surprisingly enough, the servlet compiled and ran without any modifications despite the fact that Java and the JSDK have both been through several changes since the code was first run! This is a testament to the stability of the servlet API, and is clearly good news for portability of servlets among different servers.

The great thing about servlets is their persistence. A servlet can set up and maintain an internal state in order to track click trails, maintain shopping carts, keep database handles open, or pretty much anything you want (within reason). This feature is enhanced by a recent addition to the servlet API, the HttpSession class. This class provides lightweight session objects for servlets. When a browser contacts a servlet for the first time, the servlet can choose to create an HttpSession and populate it with information it wants to store. The servlet engine generates a unique session ID and passes it back to the browser, usually with a cookie, although URL rewriting is provided for browsers that don't accept cookies. Subsequently, when the browser contacts the servlet again, the engine matches the session ID sent by the browser to its session and returns the correct session object to the servlet.
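The bookkeeping behind this can be pictured as a table keyed by session ID. The sketch below is a simplified, hypothetical model of that idea using only the standard library; SessionTable and its methods are invented for illustration and are not part of the servlet API or of Apache JServ.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of server-side session tracking. Only the ID travels
// to the browser (in a cookie or a rewritten URL); the data stays here.
class SessionTable {
    private final Map<String, Map<String, Object>> sessions =
        new HashMap<String, Map<String, Object>>();
    private final SecureRandom random = new SecureRandom();

    // Create an empty session and return its newly generated ID.
    public synchronized String create() {
        String id;
        do {
            id = Long.toHexString(random.nextLong());
        } while (sessions.containsKey(id));   // retry on the unlikely collision
        sessions.put(id, new HashMap<String, Object>());
        return id;
    }

    // Find the session for an ID presented by the browser, or null if the
    // ID is unknown (for example, after the engine has been restarted).
    public synchronized Map<String, Object> lookup(String id) {
        return sessions.get(id);
    }

    public static void main(String[] args) {
        SessionTable table = new SessionTable();
        String id = table.create();                  // first visit
        table.lookup(id).put("session.zoo", "emu");  // store some state
        System.out.println(table.lookup(id).get("session.zoo")); // later visit
    }
}
```

Because the table lives in the engine's memory in this sketch, restarting the engine loses every session.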

Step by Step

I decided to test servlet sessions by writing a small state-preserving servlet, using a Perl CGI script that I wrote ages ago as the starting point.

[Figure: the resulting browser window.]

The servlet acts something like a shopping cart, but instead of tracking purchases, it lets you select one or more animals from a scrolling list of names in the left column of a table. When you click on "Add," the selected animals are placed on the list of animals in the right column of the table -- the "Zoo." You can place several instances of the same animal in the list, and its name will be pluralized correctly (in most cases). You can also remove the selected animals from the zoo by clicking on the "Delete" button.

Because the session is persistent, you can leave the servlet page and browse elsewhere on the Web. When you return, the "Zoo" will be in the same state as you left it -- as long as you haven't actually exited the browser, and the servlet engine hasn't been restarted in the intervening time. Servlet session cookies are not ordinarily written out to disk and therefore won't persist between browser launches. Presumably, you can subclass the Apache JServ Session class to achieve this effect, but I haven't tried it.

The code for the zoo servlet is shown below:

  1  import java.io.*;
  2  import java.util.*;
  3  
  4  import javax.servlet.*;
  5  import javax.servlet.http.*;
  6  
  7  public class Zoo extends HttpServlet {
  8  
  9    static String[] animals = {"baboon","bear","chicken","dodo","emu",
 10                               "ferret","giraffe","gnu","goat","hedgehog",
 11                               "hippopotamus","kangaroo","lion",
 12                               "lounge-lizard","moa","mouse","ostrich",
 13                               "pig","porcupine","raccoon","rat","sheep",
 14                               "squirrel","tiger","weasel","yak","zebra"
 15    };
 16  
 17    public void doGet (HttpServletRequest req, HttpServletResponse res)
 18      throws ServletException, IOException
 19      {
 20        
 21        // get current session values
 22        HttpSession sess = req.getSession(true);
 23        Dictionary  zoo  = (Dictionary) sess.getValue("session.zoo");
 24        if (zoo == null) zoo = new Hashtable();
 25  
 26        // if method is POST then process values
 27        if (req.getMethod().equals("POST"))
 28          if (processChanges(req,zoo))
 29            sess.putValue("session.zoo",zoo);
 30        
 31  
 32        // print out current settings
 33        res.setContentType("text/html");
 34        ServletOutputStream out = res.getOutputStream();
 35        
 36        out.println("<html>");
 37        out.println("<head><title>Welcome to the Zoo</title></head>");
 38        out.println("<body>");
 39        out.println("<h1>Welcome to the Zoo</h1>");
 40        printZoo(out,zoo);
 41        out.println("<p><a href=\"" + HttpUtils.getRequestURL(req) + "\">RELOAD PAGE</a></p>");
 42        out.println("</body></html>");
 43      }
 44  
 45    protected boolean processChanges(HttpServletRequest req, Dictionary zoo) {
 46      String [] choices = req.getParameterValues("choice");
 47      boolean changed = false;
 48      if (choices == null) return false;   // nothing was selected in the list
 49      if (req.getParameter("add") != null) {
 50        for (int i = 0; i < choices.length; i++)
 51          if (zoo.get(choices[i]) == null)
 52            zoo.put(choices[i],new Integer(1));
 53          else {
 54            Integer newval = new Integer(((Integer)zoo.get(choices[i])).intValue() +1);
 55            zoo.put(choices[i],newval);
 56          }
 57        changed = true;
 58      }
 59  
 60      if (req.getParameter("delete") != null) {
 61        for (int i = 0; i < choices.length; i++)
 62          if (zoo.get(choices[i]) != null) {
 63            Integer newval = new Integer(((Integer)zoo.get(choices[i])).intValue() - 1);
 64            if (newval.intValue() <= 0 )
 65              zoo.remove(choices[i]);
 66            else
 67              zoo.put(choices[i],newval);
 68          }
 69        changed = true;
 70      }
 71  
 72      return changed;
 73    }
 74  
 75    protected void printZoo (ServletOutputStream out, Dictionary zoo)
 76      throws ServletException, IOException {
 77        out.println("<table border>");
 78  
 79        // print out the form
 80        out.println("<tr><th bgcolor=\"#f5deb3\">Animals to Add/Delete</th>" +
 81                    "<th bgcolor=\"#f5deb3\">Animals in the Zoo</th></tr>");
 82        out.println("<tr><td align=CENTER>"+
 83                    "<form method=POST><select name=\"choice\" size=10 MULTIPLE>");
 84        for (int i = 0; i < animals.length; i++)
 85          out.println("<option>" + animals[i] + "</option>");
 86        out.println("</select><br>");
 87        out.println("<input type=\"submit\" name=\"delete\" value=\"Delete\">");
 88        out.println("<input type=\"submit\" name=\"add\" value=\"Add\">");
 89        out.println("</form></td>");
 90  
 91        // print the current contents of the zoo
 92        out.println("<td valign=TOP align=LEFT><ul>");
 93        for (Enumeration e = zoo.keys(); e.hasMoreElements(); ) {
 94          String  animal = (String) e.nextElement();
 95          Integer count  = (Integer) zoo.get(animal);
 96          if (count.intValue() > 1 && !animal.endsWith("s"))
 97            animal = animal + "s";
 98          out.println("<li>" + count + " " + animal + "</li>");
 99        }
100        out.println("</ul></td></tr></table>");
101    }
102  
103    public void doPost (HttpServletRequest req, HttpServletResponse res)
104      throws ServletException, IOException {
105        doGet(req,res);
106    }
107  
108    public String getServletInfo() {
109      return "An example for WebTechniques";
110    }
111  
112  }

The first few lines bring in the various packages the servlet needs, including the javax.servlet and javax.servlet.http packages. Line 7 begins the definition of the Zoo class, which is a subclass of HttpServlet. It starts by declaring a static String array containing the list of animal names (lines 9 to 15). It then defines the doGet() method, which is called to process a GET request (lines 17 to 43). We'll see later that doGet() is also called by doPost() in order to process HTTP POST requests. doGet() is called with two arguments: an HttpServletRequest object, which contains information about the requested URL, and an HttpServletResponse object, which is used to send data back to the browser.

All the really interesting stuff happens in the first few lines of doGet(). In line 22, the method calls the HttpServletRequest object's getSession() method in order to return the HttpSession object. The true argument tells getSession() to create a new, empty session, if one isn't already present.

Line 23 attempts to fetch some data from the session. Session objects are organized as key/value pairs, where the key is a string and the value is any Java object at all. In this case, we're looking for a key called "session.zoo", which, if present, will hold a Dictionary object containing the animals currently in the zoo. If the key is there, we retrieve the object, cast it back to the Dictionary class, and store it in a local variable named zoo. Otherwise, we create a new zoo by constructing a Hashtable object (line 24). Hashtable is a concrete implementation of the abstract Dictionary class, and the only direct one defined in the standard Java library.

Lines 26 to 29 process the user's form submission. The servlet determines whether the page was called as the result of a POST request using the request object's getMethod() call. If so, the servlet calls an internal method named processChanges() in order to modify the zoo Dictionary in the appropriate way. If a change was made to the zoo, processChanges() returns true, and we write the changes back into the session by calling putValue() with the key/value pair (line 29).

The rest of the doGet() method is concerned with generating the HTML page. Line 33 calls setContentType() to set the MIME type of the outgoing document. The next line recovers an output stream from the response object (line 34). Everything printed to this output stream with the println() or print() methods will appear in the browser window. We then generate the HTML page with a series of calls to println(), followed by a call to an internal routine called printZoo(). This generates both the fill-out form and the list of animals in the zoo (line 40). The method ends by printing out a link that reloads the page (to illustrate the persistence of the session).

The processChanges() method can be found in lines 45 to 73. It retrieves the list of animals to add or delete by looking for a request parameter named "choice". The request object's getParameterValues() method returns an array of values corresponding to the named parameter (line 46). The method then looks for a parameter named "add" (generated by the Add button), and, if it's present, adds all the animals listed in the choices array to the zoo. Likewise, if a parameter named "delete" is present, the animals are deleted. Notice that the zoo is set up so that each key corresponds to an animal name and each value is the count of the number of animals of that type. This allows there to be multiple animals of each type in the zoo. If the count for a particular type of animal is ever decremented to zero, it's removed from the zoo Dictionary entirely (line 65). The routine returns a boolean indicating whether or not the zoo was changed.
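The counting scheme is independent of the servlet API, so it can be tried out on its own. The following is a rough sketch of the same add and delete rules written against java.util.Map with the Java 8 merge() and computeIfPresent() methods rather than the Dictionary class used in the listing; the ZooCounts class and its method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical standalone version of the zoo's counting rules.
class ZooCounts {
    // Add one of each chosen animal, starting the count at 1 if needed.
    public static void add(Map<String, Integer> zoo, String[] choices) {
        if (choices == null) return;   // nothing selected
        for (String animal : choices) {
            zoo.merge(animal, 1, Integer::sum);
        }
    }

    // Remove one of each chosen animal. Returning null from the remapping
    // function drops the key, so a count never lingers at zero.
    public static void delete(Map<String, Integer> zoo, String[] choices) {
        if (choices == null) return;   // nothing selected
        for (String animal : choices) {
            zoo.computeIfPresent(animal,
                (name, count) -> count > 1 ? Integer.valueOf(count - 1) : null);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> zoo = new TreeMap<String, Integer>();
        add(zoo, new String[] {"emu", "emu", "yak"});
        delete(zoo, new String[] {"yak", "zebra"});  // zebra is absent: no effect
        System.out.println(zoo);
    }
}
```

Deleting an animal that isn't in the zoo is silently ignored, which matches the behavior of the test on line 62 of the listing.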

The printZoo() method (lines 75 to 101) creates the table in the center of the page. It's largely a series of println() statements that generate the appropriate HTML for the table, the fill-out form, and the list of animals. It looks a little ugly because of the frequent need to escape quotation marks, but it should be straightforward to understand.
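The pluralization rule buried in lines 96 to 98 is simple enough to isolate, which also shows where it falls short. Here is a small sketch; the Pluralizer class and its label() method are hypothetical names, but the rule is the one used in the listing.

```java
// Hypothetical helper isolating the listing's pluralization rule:
// append "s" when the count exceeds one and the name doesn't end in "s".
class Pluralizer {
    public static String label(String animal, int count) {
        if (count > 1 && !animal.endsWith("s")) {
            animal = animal + "s";
        }
        return count + " " + animal;
    }

    public static void main(String[] args) {
        System.out.println(label("emu", 1));           // 1 emu
        System.out.println(label("emu", 3));           // 3 emus
        System.out.println(label("hippopotamus", 2));  // 2 hippopotamus
        System.out.println(label("mouse", 2));         // 2 mouses
        System.out.println(label("sheep", 2));         // 2 sheeps
    }
}
```

Irregular plurals such as "mouse" and "sheep" are the cases the rule gets wrong.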

The final thing to consider is the doPost() method, which is defined in lines 103 to 106. This method is provided by the servlet API as a way to intercept and handle POST requests separately from GET requests. It's often easier to handle both in a single routine, so I have doPost() call doGet() directly.

Polished Pre-Beta

Running under Apache JServ, the zoo servlet is responsive and rock solid. It's much faster than the original Perl script running under vanilla CGI, and runs neck and neck with the Perl version of the script running under mod_perl. The servlet's big win, of course, is that creating its persistent session was ridiculously easy. The original Perl script used a trick of storing the whole list of animals in the session cookie itself, and as a result was slightly shorter in overall code length than the servlet (about 60 lines). But Java's approach is smarter because it keeps the information on the server side and uses the cookie only to pass a session ID back and forth.

Even at this "pre-beta" stage, Apache JServ feels like a polished product. It's Open Source, and it runs on my favorite Web server. It definitely deserves a place among my standard development tools.

Online

  1. Apache JServ
  2. Java Servlet API
  3. Servlet Central

   


Entire contents copyright 1996-1999 Miller Freeman, Inc.