Fall Semester 2005
If you've ever written a complicated CGI script (let's say, at least no simpler than the second program in your second assignment), you know that the main inconvenience of the HTTP architecture is its stateless nature. Once an HTTP transaction is finished, the server forgets all about it. Even if the same remote user connects a few seconds later, from the server's point of view it's a completely new interaction and the script has to reconstruct the previous interaction's state. This makes even simple applications like shopping carts and multipage questionnaires a challenge to write.
CGI script developers have come up with a standard bag of tricks for overcoming this restriction. You can save state information inside the fields of fill-out forms, stuff it into the URI as additional path information, save it in a cookie, ferret it away in a server-side database, or rewrite the URI to include a session ID. In addition to these techniques, the Apache API allows you to maintain state by taking advantage of the persistence of the Apache process itself.
This chapter takes you on a tour of various techniques for maintaining state with the Apache API. In the process it also shows you how to hook your pages up to relational databases using the Perl DBI library. (We won't touch more than the very basics, yet the presentation of those basics will be complete. For variations on these programs, including everything about Apache modules, in Perl or C, you'll have to pick up Stein and MacEachern, which is an outstanding book.)
1. Choosing the Right Technique
The main issue in preserving state information is where to store it. Six frequently used places are shown in the following list. They can be broadly broken down into client-side techniques (items 1 through 3) and server-side techniques (items 4 through 6).
In client-side techniques the bulk of the state information is saved on the browser's side of the connection. Client-side techniques include those that store information in HTTP cookies and those that put state information in the hidden fields of a fill-out form. In contrast, server-side techniques keep all the state information on the web server host. Server-side techniques include any method for tracking a user session with a session ID.
Each technique for maintaining state has unique advantages and disadvantages. You need to choose the one that best fits your application. The main advantage of the client-side techniques is that they require very little overhead for the web server: no data structures to maintain in memory, no database lookups, and no complex computations. The disadvantage is that client-side techniques require the cooperation of remote users and their browser software. If you store state information in the hidden fields of an HTML form, users are free to peek at the information (using the browser's "View Source" command) or even to try to trick your application by sending a modified version of the form back to you. If you use HTTP cookies to store state information, you have to worry about older browsers that don't support the HTTP cookie protocol and the large number of users (estimated at up to 20 percent) who disable cookies out of privacy concerns. If the amount of state information you want to store is large, you may also run into bandwidth problems when transmitting the information back and forth.
Server-side techniques solve some of the problems of client-side methods but introduce their own issues. Typically you'll create a "session object" somewhere on the web server system. This object contains all the state information associated with the user session. For example, if the user has completed several pages of a multipage questionnaire, the session will hold the current page number and the responses to previous pages' questions. If the amount of state information is small, and you don't need to hold onto it for an extended period of time, you can keep it in the web server's process memory. Otherwise, you'll have to stash it in some long-term storage, such as a file or a database. Because the information is maintained on the server's side of the connection, you don't have to worry about the user peeking at it or modifying it inappropriately.
However, server-side techniques are more complex than client-side ones. First, because these techniques must manage the information from multiple sessions simultaneously, you must worry about such things as database and file locking. Otherwise, you face the possibility of leaving the session storage in an inconsistent state when two HTTP processes try to update it simultaneously. Second, you have to decide when to expire old sessions that are no longer needed. Finally, you need a way to associate a particular session object with a particular browser. Nothing about a browser is guaranteed to be unique: not its software version number, nor its IP address, nor its DNS name. The browser has to be coerced into identifying itself with a unique session ID, either with one of the client-side techniques or by requiring users to authenticate themselves with usernames and passwords.
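The bookkeeping described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of the hangman program: the helper names new_session_id(), save_session(), and load_session() and the one-file-per-session layout are hypothetical. It relies on the core Storable module, whose lock_nstore() and lock_retrieve() functions take an advisory lock on the file, and it uses a digest of a few changing values as the session ID, which is adequate for a demonstration but not unguessable enough for real use.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use File::Spec;
use File::Temp qw(tempdir);
use Storable qw(lock_nstore lock_retrieve);

# Hypothetical session directory; a real server would use a fixed location.
my $SESSION_DIR = tempdir(CLEANUP => 1);

# Build a session ID from values that change between calls.
# Assumption: good enough for illustration, not cryptographically strong.
sub new_session_id {
    return sha256_hex(join ':', time, $$, rand, rand);
}

# Write the session hash to its own file while holding an exclusive lock.
sub save_session {
    my ($id, $session) = @_;
    lock_nstore($session, File::Spec->catfile($SESSION_DIR, $id));
}

# Read the session back under a shared lock; undef if it does not exist.
sub load_session {
    my $id   = shift;
    my $file = File::Spec->catfile($SESSION_DIR, $id);
    return -e $file ? lock_retrieve($file) : undef;
}

my $id = new_session_id();
save_session($id, { PAGE => 3, ANSWERS => ['yes', 'no'] });
my $session = load_session($id);
print "page $session->{PAGE}\n";
```

A real application would also have to validate any session ID received from the browser before using it as a filename, and would need some policy for deleting session files that have expired.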
A last important consideration is the length of time you need to remember state. If you only need to save state across a single user session and don't mind losing the state information when the user quits the browser or leaves your site, then hidden fields and URI-based storage will work well. If you need state storage that will survive the remote user quitting the browser but don't mind if state is lost when you reboot the web server, then storing state in the web server's process memory is appropriate. However, for long-term storage, such as saving a user's preferences over a period of months, you'll need to use persistent cookies on the client side or store the state information in a file or database on the server side.
2. Maintaining State in Hidden Fields
We now introduce the main example used in this chapter, an online hangman game. When the user first accesses the program, it chooses a random word from a dictionary of words and displays an underscore for each of the word's letters. The game prompts the user to type in a single letter guess or, if (s)he thinks (s)he knows it, the whole word. Each time the user presses return (or the "Guess" button), the game adds the guess to the list of letters already guessed and updates the display. Each time the user makes a wrong guess, the program updates the image to show a little bit more of the stick figure, up to six wrong guesses total (graphics courtesy Andy Wardley).
When the game is over, the user is prompted to start a new game. A status area at the top of the screen keeps track of the number of words the user has tried, the number of games he's won, and the current and overall averages (number of letters guessed per session).
This hangman game is a classic case of a web application that needs to maintain state across an extended period of time. It has to keep track of several pieces of information, including the unknown word, the letters that the user has already guessed, the number of wins, and a running average of guesses. In this section, we implement the game using hidden fields to record the persistent information. In later sections, we'll reimplement it using other techniques to maintain state.
You can play the game here. The complete code is discussed below. Much of the code is devoted to the program logic of choosing a new word from a random list of words, processing the user's guesses, generating the HTML to display the status information, and creating the fill-out form that prompts the user for input. This is a long script, so we'll have to step through it in stages.
The script starts in the standard way:
    #!/usr/bin/perl
    use CGI;

    $q = new CGI;
    $WORDS = '/usr/share/lib/dict/words';
    $TRIES = 6;

    # start the page, just to make sure
    print $q->header,
          $q->start_html(-title   => 'Hangman w/ Hidden Fields',
                         -bgcolor => 'white');

In order to compartmentalize the persistent information, we keep all the state information in a hash reference, called $state. This hash contains six keys: WORD for the unknown word, GUESSED for the list of letters the user has already guessed, GUESSES_LEFT for the number of tries that the user has left in this game, GAMENO for the number of games the user has played (the current one included), WON for the number of games the user has won, and TOTAL for the total number of incorrect guesses the user has made since (s)he started playing.
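As a small illustration of this data structure, here is what such a hash reference might look like partway through a game. The values are made up for the example; only the six key names come from the script.

```perl
use strict;
use warnings;

# A sample state hash reference; the values are illustrative only.
my $state = {
    WORD         => 'tourists',  # the unknown word
    GUESSED      => 'eiotu',     # letters guessed so far
    GUESSES_LEFT => 5,           # tries remaining in this game
    GAMENO       => 2,           # games played, including this one
    WON          => 0,           # games won
    TOTAL        => 7,           # incorrect guesses over all games
};

# Fields are read and updated through the reference.
$state->{GUESSES_LEFT}--;
print "$_ = $state->{$_}\n" for sort keys %$state;
```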
We're now ready to start playing the game:
    # retrieve the state
    $state = &getState();

    # reinitialize if we need to
    if (!$state || $q->param('restart')) {
        $state = &initialize($state);
    }

    # process the current guess, if any
    ($message, $status) = &process_guess($q->param('guess') || '', $state);

    # draw the picture
    &picture($state);

    # draw the statistics
    &status($message, $state);

We first attempt to retrieve the state information by calling the subroutine getState(). If this subroutine returns an undefined value or if the user presses the "restart" button, which appears when the game is over, we call the initialize() subroutine to pick a new unknown word and set the state variables to their defaults. Next we handle the user's guess, if any, by calling the subroutine process_guess(). This implements the game logic, updates the state information, and returns a two-item list consisting of a message to display to the user (something along the lines of "Good guess!") and a status code consisting of one of the words "won", "lost", "continue", or "error". The main task is now to create the rest of the HTML page.
    # draw the picture
    &picture($state);

    # draw the statistics
    &status($message, $state);

    # prompt the user to restart or for his next guess
    if ($status =~ /^(won|lost)$/) {
        # to restart
        &show_restart_form($state);
    } else {
        # for his/her next guess
        &show_guess_form($state);
    }

    print $q->end_html;

Using CGI.pm functions, we generate the HTTP header (at the top of the script, and that's already done by now) and the beginning of the HTML code. We then generate an <IMG> tag using the state information to select which "hanged man" picture to show and display the status bar. If the status code returned by process_guess() indicates that the user has completed the game, we display the fill-out form that prompts the user to start a new game. Otherwise, we generate the form that prompts the user for a new guess. Finally we end the HTML page and exit.
Let's look at the relevant subroutines now, starting with initialize().

    # called to initialize a whole new state object or to create a new game
    sub initialize {
        my $state = shift;
        $state = {} unless $state;
        $state->{WORD}         = &pick_random_word();
        $state->{GUESSES_LEFT} = $TRIES;
        $state->{GUESSED}      = '';
        $state->{GAMENO}      += 1;
        $state->{WON}         += 0;
        $state->{TOTAL}       += 0;
        return $state;
    }

All the state maintenance is performed in the subroutines initialize(), getState(), and save_state(). initialize() creates a new empty state variable if one doesn't already exist, or resets just the per-game fields if one does. The per-game fields that always get reset are WORD, GUESSES_LEFT, and GUESSED. The first field is set to a new randomly chosen word, the second to the total number of tries that the user is allowed, and the third to an empty string. GAMENO, WON, and TOTAL need to persist across user games. GAMENO is bumped up by one each time initialize() is called. WON and TOTAL are set to zero only if they are not already defined. The (re)initialized state variable is now returned to the caller.
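To see the difference between the per-game fields and the persistent ones, the following sketch calls initialize() twice outside of any CGI context. The body of initialize() is the one from the script; the fixed-word pick_random_word() is a stand-in introduced here only so that the demonstration needs no word file.

```perl
use strict;
use warnings;

my $TRIES = 6;

# Stand-in for the real word picker (assumption: a fixed word keeps the
# demonstration deterministic and independent of any word file).
sub pick_random_word { return 'tourists' }

sub initialize {
    my $state = shift;
    $state = {} unless $state;
    $state->{WORD}         = pick_random_word();
    $state->{GUESSES_LEFT} = $TRIES;
    $state->{GUESSED}      = '';
    $state->{GAMENO}      += 1;
    $state->{WON}         += 0;
    $state->{TOTAL}       += 0;
    return $state;
}

my $state = initialize();          # first game: everything starts fresh
$state->{WON}          = 1;        # pretend the user won ...
$state->{TOTAL}        = 2;        # ... after two wrong guesses
$state->{GUESSES_LEFT} = 4;

$state = initialize($state);       # second game
print "game $state->{GAMENO}, won $state->{WON}, total $state->{TOTAL}, ",
      "left $state->{GUESSES_LEFT}\n";
```

After the second call the game number has advanced and the tries are back at six, while the win count and the running total are untouched.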
    # save the current state
    sub save_state {
        my $state = shift;
        foreach my $key ("WORD", "GAMENO", "GUESSES_LEFT", "WON", "TOTAL", "GUESSED") {
            print $q->hidden(-name     => $key,
                             -value    => $state->{$key},
                             -override => 1);
        }
    }

The save_state() routine is where we store the state information. Because it stashes the information in hidden fields, this subroutine must be called within a <FORM> section. Using CGI.pm's hidden() HTML shortcut, we produce a series of hidden tags whose names correspond to each of the fields in the state hash. For the variables WORD, GAMENO, GUESSES_LEFT, and so on, we just call hidden() with the name and current value of the variable.
The output of this subroutine looks something like the following HTML:
    <INPUT TYPE="hidden" NAME="WORD" VALUE="tourists">
    <INPUT TYPE="hidden" NAME="GAMENO" VALUE="2">
    <INPUT TYPE="hidden" NAME="GUESSES_LEFT" VALUE="5">
    <INPUT TYPE="hidden" NAME="WON" VALUE="0">
    <INPUT TYPE="hidden" NAME="TOTAL" VALUE="7">
    <INPUT TYPE="hidden" NAME="GUESSED" VALUE="eiotu">
getState() reverses this process, reconstructing the hash of state information from the hidden form fields:

    # called to retrieve an existing state
    sub getState {
        return undef unless $q->param();
        my $state = {};
        foreach my $key ("WORD", "GAMENO", "GUESSES_LEFT", "WON", "TOTAL", "GUESSED") {
            $state->{$key} = $q->param($key);
        }
        return $state;
    }

This subroutine loops through each of the scalar variables, calls param() to retrieve its value from the submitted form data, and assigns the value to the appropriate field of the state variable.

The rest of the script is equally straightforward. The process_guess() subroutine (too long to be reproduced here, see full program code below) first maps the unknown word and the previously guessed letters into hashes for easier comparison later. Then it does a check to see if the user has already won the game but has not moved on to a new game (which can happen if the user reloads the page).
The subroutine now begins to process the guess. It does some error checking on the user's guess to make sure that it is a valid series of lowercase letters and that the user hasn't already guessed it. The routine then checks to see whether the user has guessed a whole word or a single letter. In the former case, the program fails the user immediately if the guess isn't an identical match to the unknown word. Otherwise, the program adds the letter to the list of guesses and checks to see whether the word has been entirely filled in. If so, the user wins. If the user has guessed incorrectly, we decrement the number of turns left. If the user is out of turns, (s)he loses. Otherwise, we continue.
The picture() routine generates an <IMG> tag pointing to an appropriate picture. There are seven static pictures named h0.gif through h6.gif (one for each number of wrong guesses from zero to six), and this routine generates the right filename by subtracting the number of turns (s)he has left from the total number of tries the user is allowed.
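The filename arithmetic can be checked in isolation. In this sketch picture_name() is a hypothetical helper that contains only the sprintf() call from picture(); the script itself goes on to wrap the result in an <IMG> tag.

```perl
use strict;
use warnings;

my $TRIES = 6;

# Map the number of tries left to the name of the picture to show.
sub picture_name {
    my $tries_left = shift;
    return sprintf("/h%d.gif", $TRIES - $tries_left);
}

# Walk from a fresh game (6 tries left) down to a lost one (0 tries left).
print picture_name($_), "\n" for reverse 0 .. $TRIES;
```

With six tries allowed, a fresh game maps to /h0.gif and a lost game to /h6.gif.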
The status() subroutine is responsible for printing out the game statistics and the word itself. The most interesting part of the routine is toward the end, where it uses map() to replace the not-yet-guessed letters of the unknown word with underscores.
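The map() idiom is easy to try on its own. The sketch below assumes a hypothetical helper, masked_word(), that returns the masked word as a string instead of printing it inside an <H2> element the way status() does.

```perl
use strict;
use warnings;

# Return the word with every letter not yet guessed replaced by "_".
sub masked_word {
    my ($word, $guessed) = @_;
    # Turn the string of guessed letters into a lookup hash.
    my %guessed = map { $_ => 1 } split //, $guessed;
    # Keep guessed letters, mask the others, and separate them by spaces.
    return join ' ', map { $guessed{$_} ? $_ : '_' } split //, $word;
}

print masked_word('tourists', 'eiotu'), "\n";   # prints "t o u _ i _ t _"
```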
pick_random_word() is the routine that chooses a random word from a file of words. Many Unix systems happen to have a convenient list of about 38,000 words located in a file somewhere (our system has it in /usr/share/lib/dict/words). Each word appears on a separate line. We choose the new word in a simple-minded way, by reading the whole file in as a list and then randomly selecting a word, as in helloFive (we could and should use an even better algorithm, but it has the drawback that it needs more explanation, so we will stick with the simple-minded one for now).
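One candidate for such a better algorithm, offered here as an assumption rather than as the method the notes have in mind, is to pick the line in a single pass without holding the whole file in memory: the n-th line read replaces the current choice with probability 1/n, which leaves every line equally likely at the end. The helper name pick_random_line() is hypothetical, and the demonstration reads from an in-memory string instead of the word file.

```perl
use strict;
use warnings;

# Pick one line uniformly at random from an open filehandle in a single
# pass, keeping only the current candidate in memory.
sub pick_random_line {
    my $fh = shift;
    my ($count, $choice) = (0, undef);
    while (my $line = <$fh>) {
        $count++;
        # Replace the candidate with probability 1/$count.
        $choice = $line if rand($count) < 1;
    }
    return undef unless defined $choice;
    chomp $choice;
    return lc $choice;
}

# Demo on an in-memory "file"; the real script would open $WORDS instead.
my $text = "apple\nBanana\ncherry\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my $word = pick_random_line($fh);
close($fh);
print "$word\n";
```

The first line is always accepted (rand(1) is always below 1), so a non-empty file always yields a word. The trade-off is that the whole file is still read once per call.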
Because the state information is saved in the document body, the save_state() function has to be called from the part of the code that generates the fill-out forms. The two places where this happens are the routines show_guess_form() and show_restart_form().
    # print the fill-out form for requesting input
    sub show_guess_form {
        my $state = shift;
        print $q->start_form(),
              "Your guess: ",
              $q->textfield(-name=>'guess', -value=>'', -override=>1),
              $q->submit(value=>'Guess');
        &save_state($state);
        print $q->end_form;
    }

show_guess_form() produces the fill-out form that prompts the user for his guess. It calls save_state() after opening a <FORM> section and before closing it.

    # ask the user if (s)he wants to start over
    sub show_restart_form {
        my $state = shift;
        print $q->start_form(),
              "Do you want to play again?",
              $q->submit(-name=>'restart', -value=>'Another game');
        delete $state->{"WORD"};
        &save_state($state);
        print $q->end_form;
    }
show_restart_form() is called after the user has either won or lost a game. It creates a single button that prompts the user to restart. Because the game statistics have to be saved across games, we call save_state() here too. The only difference from show_guess_form() is that we explicitly delete the WORD field from the state variable. This signals the script to generate a new unknown word on its next invocation.

Here, now, is the complete source code of this version of the program.
    #!/usr/bin/perl
    # http://burrowww.cs.indiana.edu:14569/cgi-bin/stein/hidden
    use CGI;

    $q = new CGI;
    $WORDS = '/usr/share/lib/dict/words';
    $TRIES = 6;

    # start the page, just to make sure
    print $q->header,
          $q->start_html(-title   => 'Hangman Hidden Fields',
                         -bgcolor => 'white');

    # retrieve the state
    $state = &getState();

    # reinitialize if we need to
    if (!$state || $q->param('restart')) {
        $state = &initialize($state);
    }

    # process the current guess, if any
    ($message, $status) = &process_guess($q->param('guess') || '', $state);

    # draw the picture
    &picture($state);

    # draw the statistics
    &status($message, $state);

    # prompt the user to restart or for his next guess
    if ($status =~ /^(won|lost)$/) {
        # to restart
        &show_restart_form($state);
    } else {
        # for his/her next guess
        &show_guess_form($state);
    }

    print $q->end_html;

    #------------(subroutines)--------------

    # called to retrieve an existing state
    sub getState {
        return undef unless $q->param();
        my $state = {};
        foreach my $key ("WORD", "GAMENO", "GUESSES_LEFT", "WON", "TOTAL", "GUESSED") {
            $state->{$key} = $q->param($key);
        }
        return $state;
    }

    # called to initialize a whole new state object or to create a new game
    sub initialize {
        my $state = shift;
        $state = {} unless $state;
        $state->{WORD}         = &pick_random_word();
        $state->{GUESSES_LEFT} = $TRIES;
        $state->{GUESSED}      = '';
        $state->{GAMENO}      += 1;
        $state->{WON}         += 0;
        $state->{TOTAL}       += 0;
        return $state;
    }

    # called to process the user's guess
    sub process_guess {
        my ($guess, $state) = @_;

        # lose immediately if user has no more guesses left
        return ('', 'lost') unless $state->{"GUESSES_LEFT"} > 0;

        # create hash containing the letters guessed thus far
        my %guessed = map { $_ => 1 } $state->{"GUESSED"} =~ /(.)/g;

        # create hash containing the letters in the original word
        my %letters = map { $_ => 1 } $state->{"WORD"} =~ /(.)/g;

        # return immediately if user has already guessed the word
        return ('', 'won') unless grep (!$guessed{$_}, keys %letters);

        # do nothing more (stop here) if no guess is provided
        return ('', 'continue') unless $guess;

        # this section processes individual letter guesses
        $guess = lc $guess;
        return ("Not a valid letter or word!", 'error')
            unless $guess =~ /^[a-z]+$/;
        return ("You already guessed that letter!", 'error')
            if ($guessed{$guess});

        # this section is called when the user guesses the whole word
        if (length($guess) > 1 && $guess ne $state->{WORD}) {
            $state->{TOTAL} += $state->{GUESSES_LEFT};
            return (qq{You lose. The word was "$state->{WORD}."}, 'lost');
        }

        # update the list of guesses
        foreach ($guess =~ /(.)/g) { $guessed{$_}++; }
        $state->{GUESSED} = join('', sort keys %guessed);

        # correct guess -- word completely filled in
        unless (grep(!$guessed{$_}, keys %letters)) {
            $state->{WON}++;
            return (qq{You got it! The word was "$state->{WORD}."}, 'won');
        }

        # incorrect guess
        if (! $letters{$guess}) {
            $state->{TOTAL}++;
            $state->{GUESSES_LEFT}--;
            # user runs out of turns
            return (qq{The jig is up. The word was "$state->{WORD}".}, 'lost')
                if $state->{GUESSES_LEFT} <= 0;
            return ('Wrong guess!', 'continue');
        }

        # correct guess but word still incomplete
        return ('Good guess!', 'continue');
    }

    # create the cute hangman picture
    sub picture {
        my $state = shift;
        my $tries_left = $state->{GUESSES_LEFT};
        my $picture = sprintf("/h%d.gif", $TRIES - $tries_left);
        print $q->img({ -src   => $picture,
                        -align => 'LEFT',
                        -alt   => "[$tries_left tries left]" });
    }

    # print the status
    sub status {
        my ($message, $state) = @_;
        print qq{
            <table width=100%>
              <tr>
                <td> <b> Word #: </b> $state->{GAMENO} ($state->{WORD}) </td>
                <td> <b> Guessed: </b> $state->{GUESSED} </td>
              </tr>
              <tr>
                <td> <b> Won: </b> $state->{WON} </td>
                <td> <b> Current average: </b> },
              sprintf("%2.3f", $state->{TOTAL} / $state->{GAMENO}),
              qq{ </td>
                <td> <b> Overall average: </b> },
              $state->{GAMENO} > 1
                  ? sprintf("%2.3f",
                            ($state->{TOTAL} - ($TRIES - $state->{GUESSES_LEFT}))
                                / ($state->{GAMENO} - 1))
                  : '0.000',
              qq{ </td>
              </tr>
            </table>
        };

        # build a lookup of the letters guessed so far
        # (instead of: my %guessed = map { $_ => 1 } $state->{GUESSED} =~ /(.)/g;)
        my %guessed = ();
        my @guessed = $state->{GUESSED} =~ /(.)/g;
        foreach my $letter (@guessed) { $guessed{$letter} = 1; }

        print $q->h2("Word:",
                     map { $guessed{$_} ? $_ : '_' } $state->{"WORD"} =~ /(.)/g);
        print $q->h2($q->font({-color=>'red'}, $message)) if $message;
    }

    # ask the user if (s)he wants to start over
    sub show_restart_form {
        my $state = shift;
        print $q->start_form(),
              "Do you want to play again?",
              $q->submit(-name=>'restart', -value=>'Another game');
        delete $state->{"WORD"};
        &save_state($state);
        print $q->end_form;
    }

    # print the fill-out form for requesting input
    sub show_guess_form {
        my $state = shift;
        print $q->start_form(),
              "Your guess: ",
              $q->textfield(-name=>'guess', -value=>'', -override=>1),
              $q->submit(value=>'Guess');
        &save_state($state);
        print $q->end_form;
    }

    # pick a word, any word
    sub pick_random_word {
        open(AB, $WORDS) or die "Cannot open $WORDS: $!";
        my @words = <AB>;
        close(AB);
        my $chosenWord = $words[int(rand($#words + 1))];
        chomp($chosenWord);   # strip the trailing newline
        return lc $chosenWord;
    }

    # save the current state
    sub save_state {
        my $state = shift;
        foreach my $key ("WORD", "GAMENO", "GUESSES_LEFT", "WON", "TOTAL", "GUESSED") {
            print $q->hidden(-name     => $key,
                             -value    => $state->{$key},
                             -override => 1);
        }
    }

Although this method of maintaining the hangman game's state works great, it has certain obvious limitations. The most severe of these is that it's easy for the user to cheat. All (s)he has to do is to choose the "View Source" command from the browser's menu bar and there's the secret word in full view, along with all other state information. The user can then use this knowledge of the word to immediately win the game, or (s)he can save the form to disk, change the values of the fields that keep track of the wins and losses, and resubmit the doctored form in order to artificially inflate the statistics.
These considerations are not too important for the hangman game, but they become real issues in applications where money is at stake. Even with the hangman game we might worry about the user tampering with the state information if we were contemplating turning the game into an Internet tournament. Techniques for preventing user tampering are discussed later in this chapter.
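One common way to detect a doctored form, sketched here under stated assumptions and not necessarily the technique covered later, is to send a keyed digest (a message authentication code) of the state along with it in one more hidden field and to recompute it on the next request. The names $SECRET, state_mac(), and state_is_valid() are hypothetical; hmac_sha256_hex() comes from the core Digest::SHA module.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical server-side secret; it must never be sent to the browser.
my $SECRET = 'change this to a long random string';
my @FIELDS = qw(WORD GAMENO GUESSES_LEFT WON TOTAL GUESSED);

# Compute a keyed digest over the state fields in a fixed order.
sub state_mac {
    my $state = shift;
    my $data  = join '|', map { defined $state->{$_} ? $state->{$_} : '' } @FIELDS;
    return hmac_sha256_hex($data, $SECRET);
}

# True if the MAC sent back by the browser matches the state it came with.
sub state_is_valid {
    my ($state, $mac) = @_;
    return defined $mac && $mac eq state_mac($state);
}

my $state = { WORD => 'tourists', GAMENO => 2, GUESSES_LEFT => 5,
              WON  => 0, TOTAL => 7, GUESSED => 'eiotu' };
my $mac = state_mac($state);       # would travel in one more hidden field

$state->{WON} = 10;                # the user doctors the form ...
print state_is_valid($state, $mac) ? "state ok\n" : "state was tampered with\n";
```

Note that a digest only protects against modification. The secret word is still visible in the page source, so preventing peeking requires keeping the word on the server or encrypting it.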