Textbook, Chapter 9

Writing Server Scripts

Requirements for a server script:

Use your favorite programming language:

We have chosen to use Perl.

Stages in the life of a server script:

Specific aspects:

Perl Notes

Perl tutorial in lecture 3

Additional Perl notes in lab 4

Syntax is similar to C and various shell scripting languages

Has only three basic data structures: scalars, arrays, and associative arrays (hashes).

Other than scalars, arrays, and associative arrays, Perl doesn't recognize data types. Examples:
$result = "1" + 2 + 3; # $result is now 6
$result = "unbelievable"; # $result is now a string 
Initialization and indexing:
@a = ('tuna', 'cod', 'mackerel', 'herring'); 
%a = ( 'tuna'      => '9.45/lb', 
       'cod'       => '3.00/lb', 
       'mackerel'  => '8.40/lb', 
       'herring'   => '2.30/lb'
     ); # note: => is a Perl5 shortcut for a comma

# @a and %a are from different namespaces 
# the same goes for a scalar variable called $a 

foreach $a (@a) {
  print $a, " --> ", $a{$a}, "\n"; 
} 
<> reads a line of input from the standard input (or from files named on the command line). Used alone in a while condition, as in while (<>), it stores the line in the magic variable $_.

Perl has lots of other magic variables, @_, $$ etc.
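
A minimal sketch of three of these magic variables at work. The subroutine name shout and the sample words are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# @_ holds the arguments passed to a subroutine.
sub shout {
    my ($word) = @_;
    return uc($word) . "!";
}

# A foreach loop without a loop variable puts each element in $_.
my @out;
foreach ('tuna', 'cod') {
    push @out, shout($_);
}
print join(' ', @out), "\n";   # prints: TUNA! COD!

# $$ is the process ID of the running script.
print "running as process $$\n";
```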

The statement

$foo =~ /pattern_to_match/ 
is a pattern matching operation.
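
As a small illustration of matching, the sketch below splits a name=value pair with a pattern. The subroutine split_pair and the sample string are assumptions, not part of the course code.

```perl
use strict;
use warnings;

# Return (name, value) if the string looks like name=value, else an empty list.
# Parentheses in the pattern capture into $1, $2, ...
sub split_pair {
    my ($s) = @_;
    return $s =~ /^(\w+)=(\w+)$/ ? ($1, $2) : ();
}

my $foo = 'REQUEST_METHOD=GET';
my ($name, $value) = split_pair($foo);
print "name is $name, value is $value\n";   # prints: name is REQUEST_METHOD, value is GET

# !~ is the negated form of =~.
print "no digits\n" if $foo !~ /\d/;
```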

The prefix & is used to invoke subroutines, as in

&process_query
A definition of a subroutine can occur anywhere in the file but it's better to group any such definitions at the end or the beginning of the file.

A call to system invokes a subshell and executes the specified command.

A call to eval evaluates the argument as a Perl expression and returns the result.
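
A short sketch of both calls, assuming a Unix system where an echo command is available. The expressions being evaluated are arbitrary examples.

```perl
use strict;
use warnings;

# system runs an external command and returns its exit status (0 means success).
my $status = system('echo', 'hello from a command');
print "exit status: $status\n";

# eval with a string compiles and runs it as Perl code and returns the result.
my $sum = eval '2 + 3 * 4';
print "sum is $sum\n";           # prints: sum is 14

# If the code cannot be compiled or dies, eval returns undef and sets $@.
my $bad = eval '1 +';
print "eval failed: $@" unless defined $bad;
```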

Perl variables placed inside double-quoted strings are interpolated; inside single-quoted strings they are not. Example:

$a = 'fred'; 
$b = "My name is $a"; 
$c = 'My Name is $a'; 
Backticks (`) as opposed to single quotes (') cause the indicated program to be run, and the output of the program, if any, is returned.

The syntax

print <<EOF;
Some
   text

EOF
specifies a here-document. The text is interpolated as in a double-quoted string. The same can be achieved with:
print qq{
Some 
   text 
}; 
Perl5 introduces references. Examples:
@a = ('a', 'b', 'c'); 
$ref = \@a; 
print $ref->[1]; # prints b ($a[1])

%a = ('a', 1, 'b', 2, 'c', 3); 
$ref = \%a; 
print $ref->{'a'}; # prints 1 

$aux = \@a; 
print $ref->{$aux->[0]}; # prints 1 
Perl5 has object-oriented syntax in which subroutine calls that are attached to variables become "methods". Here's an example in which we create a Dog object, then call some of its methods.
$pet = new Dog('beagle'); 
$pet->play_pet_trick;
$pet->eat('tuna'); 
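
The notes do not show the Dog class itself. The following is a hypothetical minimal definition, just enough for the three calls above to run; the method bodies are invented. Dog->new('beagle') is the arrow form of new Dog('beagle').

```perl
use strict;
use warnings;

package Dog;

# Constructor: an object is a reference blessed into a package.
sub new {
    my ($class, $breed) = @_;
    my $self = { breed => $breed };
    return bless $self, $class;
}

# Methods receive the object as their first argument.
sub play_pet_trick {
    my ($self) = @_;
    return "The $self->{breed} rolls over.";
}

sub eat {
    my ($self, $food) = @_;
    return "The $self->{breed} eats $food.";
}

package main;

my $pet = Dog->new('beagle');
print $pet->play_pet_trick, "\n";   # prints: The beagle rolls over.
print $pet->eat('tuna'), "\n";      # prints: The beagle eats tuna.
```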
Type man perltoot for a tutorial on object-oriented Perl:
tucotuco.cs.indiana.edu% man perltoot
Reformatting page.  Wait... done
 
Perl Programmers Reference Guide                      PERLTOOT(1)
 
NAME
     perltoot - Tom's object-oriented tutorial for perl
 
DESCRIPTION
     Object-oriented programming is a big seller these days.
     Some managers would rather have objects than sliced bread.
     Why is that?  What's so special about an object?  Just what
     is an object anyway?
 
     An object is nothing but a way of tucking away complex
     behaviours into a neat little easy-to-use bundle.  (This is
     what professors call abstraction.) Smart people who have
     nothing to do but sit around for weeks on end figuring out
     really hard problems make these nifty objects that even
     regular people can use. (This is what professors call
     software reuse.)  Users (well, programmers) can play with
     this little bundle all they want, but they aren't to open it
     up and mess with the insides.  Just like an expensive piece
     of hardware, the contract says that you void the warranty if
--More--(1%)
CGI.pm is object-oriented. We only make use of it.

Reference: type man perlfunc for:

tucotuco.cs.indiana.edu% man perlfunc
Reformatting page.  Wait... done

Perl Programmers Reference Guide                      PERLFUNC(1)

NAME
     perlfunc - Perl builtin functions

DESCRIPTION
     The functions in this section can serve as terms in an
     expression.  They fall into two major categories: list
     operators and named unary operators.  These differ in their
     precedence relationship with a following comma.  (See the
     precedence table in the perlop manpage.)  List operators
     take more than one argument, while unary operators can never
     take more than one argument.  Thus, a comma terminates the
     argument of a unary operator, but merely separates the
     arguments of a list operator.  A unary operator generally
     provides a scalar context to its argument, while a list
     operator may provide either scalar and list contexts for its
     arguments.  If it does both, the scalar arguments will be
     first, and the list argument will follow.  (Note that there
     can ever be only one list argument.)  For instance, splice()
     has three scalar arguments followed by a list.                
Basic Scripts

Hello, World! from a file.

#!/usr/bin/perl
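# read the whole file into @file, one line per element
open(AB, "/u/dgerman/httpd/htdocs/index.html") or die "Cannot open file: $!"; 
@file = <AB>; 
close(AB); 
$file = join('', @file); 
# header, blank line, then the contents of the file
print "Content-type: text/html\n\n$file"; 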
A CGI calendar. Date, time, random quotations.
$CAL     = '/usr/bin/cal'; 
$DATE    = '/usr/bin/date'; 
$year    = `$DATE +%Y`;
chomp($year);            # remove the trailing newline from date's output
$cal_txt = `$CAL $year`; 

...

print "<pre>$cal_txt</pre>"; 

A redirection script.
All it needs to print is one header line, followed by a blank line:
Location: http://www.cs.indiana.edu
Note: the URL could be relative.
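
A complete redirection script can be sketched as below. The subroutine name redirect_response is an assumption made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the whole response for a redirect: one Location header
# followed by the blank line that ends the header section.
sub redirect_response {
    my ($url) = @_;
    return "Location: $url\n\n";
}

print redirect_response('http://www.cs.indiana.edu');
```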

The example in your book randomly picks a file. Other response header fields can be used too:

Retrieving server and browser information from within scripts.

                     Server software and communication protocols
SERVER_SOFTWARE      name and version number
GATEWAY_INTERFACE    CGI version number
SERVER_PROTOCOL      HTTP version number
                     Server configuration
SERVER_NAME          tucotuco.cs.indiana.edu
SERVER_PORT          19800
                     Information about user authentication
AUTH_TYPE            e.g., Basic Authentication
REMOTE_USER          username that goes with that
                     Information about the remote host
REMOTE_HOST          DNS name of remote host
REMOTE_ADDR          IP address of remote host
REMOTE_IDENT         name of remote user when using identd auth
                     Information about the current request
REQUEST_METHOD       GET, HEAD, or POST
SCRIPT_NAME          virtual path to the script (URL)
PATH_INFO            extra URL path info after the script name, if any
PATH_TRANSLATED      extra path info converted to a physical path
QUERY_STRING         the part that follows ? if present
CONTENT_TYPE         for POST requests only: MIME type of the attached information
CONTENT_LENGTH       for POST requests only: length of the attached information
                     Information generated by the browser
HTTP_ACCEPT          list of MIME types that browser accepts
HTTP_USER_AGENT      name and version number of browser
HTTP_REFERER         page user was viewing before
HTTP_COOKIE          magic cookie: string that can identify a browser session

HTTP_XXXXXXX         other headers browser decides to send

Printing the environment variables:
#!/usr/bin/perl
print "Content-type: text/plain\n\n";
foreach $var (keys %ENV) {
  print "$var --> $ENV{$var} \n"; 
} 
The <ISINDEX> tag
Creating and processing fill-out forms.
You need to get at the query string.

Not to be confused with QUERY_STRING in %ENV.

Query string included in the URL (GET)

$query = $ENV{'QUERY_STRING'};
Query string generated by an <ISINDEX> tag
@ARGV
Query string generated by a fill-out form (with POST)
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
A user-adjustable calendar.
Allow the user to specify the year(s).

Use all three methods.
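
One way the three methods might be combined in a single subroutine. This is a sketch, not the course solution: the name fetch_query and the choice to pass the input filehandle as an argument are assumptions.

```perl
use strict;
use warnings;

# Return the raw query string, whichever way the request delivered it.
# $in is the filehandle carrying a POST body (STDIN in a real CGI script).
sub fetch_query {
    my ($in) = @_;
    my $method = $ENV{'REQUEST_METHOD'} || 'GET';

    if ($method eq 'POST') {
        # fill-out form sent with POST: read exactly CONTENT_LENGTH bytes
        my $buffer = '';
        read($in, $buffer, $ENV{'CONTENT_LENGTH'} || 0);
        return $buffer;
    }
    # <ISINDEX> keywords arrive as command-line arguments
    return join('+', @ARGV) if @ARGV;
    # otherwise the query string was part of the URL
    return defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
}

print "query: ", fetch_query(\*STDIN), "\n";
```

For the calendar, the returned string would still have to be checked for a year before it is handed to cal.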

CGI.pm: a Perl library for writing CGI scripts.
Installation

You don't need to do this on burrow, but it is useful to know how.

Usage

Retrieving and setting script parameters.

use CGI; 
$query = new CGI; 
foreach $name ($query->param) {
  $value = $query->param($name); 
  print "$name => $value\n"; 
} 
Calling CGI.pm methods with named arguments.
@veggies = $query->param('vegetables'); 
@veggies = $query->param(-name=>'vegetables'); 
Creating the HTTP header.
  print $query->header; 
# returns Content-type: text/html\n\n (default) 

# print $query->header(-type=>'image/gif');  
Creating HTML forms.
print $query->start_html; 
print $query->start_form(-action=>'/cgi-bin/printenv', 
                         -method=>'POST'); 
print $query->textfield (-name=>'username',
                         -size=> 12,
                         -value=> 'Fill this in!',
                         -override=>1); # ignore the sticky value CGI.pm would otherwise reuse
print $query->submit(-label=>'Push me!'); 
print $query->end_form; 
Same thing with one print statement:
print $query->start_html, # <--- notice the comma
      $query->start_form(-action=>'/cgi-bin/printenv', 
                         -method=>'POST'), # <--- notice the comma
      $query->textfield (-name=>'username',
                         -size=> 12,
                         -value=> 'Fill this in!',
                         -override=>1), # <--- notice the comma
      $query->submit(-label=>'Push me!'), # <--- notice the comma
      $query->end_form; # <--- semicolon, end of list
Access to environment variables.
if ($query->request_method eq 'GET') { ... }
else { ... }
Debugging scripts with CGI.pm library.
tucotuco.cs.indiana.edu% ./formelms
(offline mode: enter name=value pairs on standard input)
a=b
^D
At least you get an idea whether the script compiles. You can also specify keyword lists or parameter lists on the command line.
Documentation and examples.
  1. CGI Docs (Stein)
  2. Form Elements (Lecture 7)
  3. State Machines (Lecture 7)
  4. Feedback Form (Lecture 7)
  5. Clickable Images (Lecture 7)
  6. File Upload (Lecture 7)
Other query processing libraries

A generic script template
The most common kind of CGI script.
The template for this is:
  1. print the header
  2. print the start of the HTML document
  3. attempt to fetch the query string
    if there is no query string:
      this is the user's first access to this page, so generate and return the input document (form or isindex tag)
    else (there is a query string):
      do the work and synthesize a document giving the result of the request (or an acknowledgement that the request was processed)
  4. print the end of the HTML document, including a signature
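
The four steps of the template can be sketched in plain Perl as follows. The form field year, the page text, and the subroutine name respond are all made up for illustration; a CGI.pm version would use its header, start_html, and param methods instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the whole response as one string so the four steps are easy to follow.
sub respond {
    my ($query) = @_;
    my $page = "Content-type: text/html\n\n";                    # 1. header
    $page .= "<html><head><title>Year</title></head><body>\n";   # 2. start of document
    if (!defined $query || $query eq '') {                       # 3. no query: first visit
        $page .= qq{<form method="GET">Year: <input name="year">}
               . qq{<input type="submit"></form>\n};
    } else {                                                     #    query: do the work
        my ($year) = $query =~ /^year=(\d{1,4})$/;
        $page .= defined $year ? "<p>You asked for $year.</p>\n"
                               : "<p>Please enter a year as digits.</p>\n";
    }
    $page .= "<hr><address>webmaster</address></body></html>\n"; # 4. end and signature
    return $page;
}

print respond($ENV{'QUERY_STRING'});
```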
Writing safe scripts
  1. don't trust your users
    don't trust your users' input

  2. check for an expected pattern
    if pattern not exactly as expected
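
A minimal sketch of checking input against an expected pattern, assuming the script expects a year of one to four digits. The subroutine name safe_year is hypothetical.

```perl
use strict;
use warnings;

# Accept a year only if it is exactly one to four digits; return undef otherwise.
sub safe_year {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /^(\d{1,4})\z/ ? $1 : undef;
}

my $year = safe_year('1998; rm -rf /');
print defined $year ? "ok: $year\n" : "rejected\n";   # prints: rejected
```

Only a value that passes a check like this should ever reach backticks or system.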
A picture database search script
Pictures are indexed by keywords.

Search looks at keywords and then produces a document that has links to the images found.

The percentage of keywords found is also reported with each link.
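
A rough sketch of such a search. The index contents, the file names, and the subroutine name search are invented for this example; the real script would load its index from a file.

```perl
use strict;
use warnings;

# Hypothetical index: image file => list of keywords describing it.
my %index = (
    'beach.gif'  => ['sand', 'sea', 'sun'],
    'forest.gif' => ['trees', 'sun'],
);

# Return image => percentage of the requested keywords that it matches.
sub search {
    my (@wanted) = @_;
    my %score;
    return %score unless @wanted;
    foreach my $image (keys %index) {
        my %has  = map { $_ => 1 } @{ $index{$image} };
        my $hits = grep { $has{$_} } @wanted;
        $score{$image} = int(100 * $hits / @wanted) if $hits;
    }
    return %score;
}

# Print links to the images found, best match first.
my %found = search('sun', 'sea');
foreach my $image (sort { $found{$b} <=> $found{$a} } keys %found) {
    print qq{<a href="$image">$image</a> ($found{$image}%)\n};
}
```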

Preserving state information between invocations of a script
Maintaining state with the URL
We've done this in the menu program (lecture 4).
Maintaining state with hidden fields
The simple calculator (lecture 4).

The state machines (lecture 7).
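
In the spirit of the calculator, here is a small sketch of carrying a running total in a hidden field. The field names total and add and both subroutine names are assumptions.

```perl
use strict;
use warnings;

# Return a form that carries the running total forward in a hidden field.
sub calculator_form {
    my ($total) = @_;
    return qq{<form method="POST">\n}
         . qq{<input type="hidden" name="total" value="$total">\n}
         . qq{Add: <input name="add"> <input type="submit">\n}
         . qq{</form>\n};
}

# Previous total (from the hidden field) plus the newly entered number.
# Anything that is not an integer counts as 0.
sub next_total {
    my ($total, $add) = @_;
    $total = 0 unless defined $total && $total =~ /^-?\d+$/;
    $add   = 0 unless defined $add   && $add   =~ /^-?\d+$/;
    return $total + $add;
}

print calculator_form(next_total(5, 3));   # the hidden field now holds 8
```

Hidden fields come back from the browser like any other input, so they need the same checking.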

Saving information with a session ID

Using basic authentication

Cookies (yours or Netscape's)
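
A sketch of keeping a session ID in a cookie without any library. The cookie name session and the way the ID is made up (process ID plus time) are for illustration only; a real session ID should be hard to guess.

```perl
use strict;
use warnings;

# Parse the HTTP_COOKIE variable ("name=value; name2=value2") into a hash.
sub parse_cookies {
    my ($raw) = @_;
    my %cookie;
    foreach my $pair (split /;\s*/, defined $raw ? $raw : '') {
        my ($name, $value) = split /=/, $pair, 2;
        $cookie{$name} = $value if defined $value;
    }
    return %cookie;
}

# Reuse the session ID the browser sent, or make a new one.
my %cookie  = parse_cookies($ENV{'HTTP_COOKIE'});
my $session = defined $cookie{'session'} ? $cookie{'session'} : $$ . '-' . time;

# Set-Cookie goes out with the other headers, before the blank line.
print "Set-Cookie: session=$session\n";
print "Content-type: text/plain\n\n";
print "Your session ID is $session\n";
```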

Returning nontext documents from scripts
Making thumbnail images

Images from scratch (using GD.pm)

Covered in lecture 10 thoroughly.
Advanced techniques

Background jobs
The script forks a process.

The tricky part is to return the result to the user.

Use e-mail to send a notification or the results (collect the address first), or provide an interface where the user can check whether the results have become available.
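
A minimal sketch of the forking step, assuming a Unix server. The subroutine long_job is a stand-in for the real work, and the mailing step is left as a comment.

```perl
use strict;
use warnings;

# Hypothetical long-running job; replace with the real work.
sub long_job { sleep 1; return "done" }

$| = 1;   # flush output at once so the reply is not printed again by the child
print "Content-type: text/plain\n\n";
print "Your request was accepted; the results will be mailed to you.\n";

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: close the standard handles so the server does not keep
    # the connection open while the job runs.
    close(STDOUT);
    close(STDIN);
    my $result = long_job();
    # ... mail $result to the user, or save it where a status page can find it
    exit 0;
}
# Parent: finishes right away; the child keeps running.
```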

Content negotiation
Protect the user from confusion.

Check what her/his browser accepts.
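
Checking what the browser accepts might look like the sketch below, which inspects HTTP_ACCEPT. The subroutine name accepts and the choice between PNG and GIF are assumptions for this example, and quality values (q=) are ignored.

```perl
use strict;
use warnings;

# True if the Accept header lists the given MIME type
# or a wildcard that covers it.
sub accepts {
    my ($type, $accept) = @_;
    $accept = '' unless defined $accept;
    my ($major) = split m{/}, $type;
    foreach my $entry (split /\s*,\s*/, $accept) {
        $entry =~ s/\s*;.*//;              # drop parameters such as ;q=0.8
        return 1 if $entry eq $type || $entry eq "$major/*" || $entry eq '*/*';
    }
    return 0;
}

my $format = accepts('image/png', $ENV{'HTTP_ACCEPT'}) ? 'png' : 'gif';
print "Would send the $format version\n";
```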

File uploads
Netscape extension first.

Format.

CGI.pm makes it easy (lecture 7).

So does Steve Brenner's cgi-lib.pl (Perl 4).

Frames
Scripts can directly specify which frame to load their output into by including a
Window-target:
in the HTTP header.
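
For example, a script might emit the header like this. The frame name results is hypothetical, and whether the browser honors the header depends on the browser.

```perl
use strict;
use warnings;

# Build response headers that ask the browser to load the output
# into the named frame.
sub framed_header {
    my ($frame) = @_;
    return "Window-target: $frame\n"
         . "Content-type: text/html\n\n";
}

print framed_header('results');
print "<p>This output is meant for the results frame.</p>\n";
```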
Server Push/Client Pull
Updating the page upon server/client request.

Neat feature before Java.
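
Client pull can be sketched with the Refresh response header, which asks the browser to fetch a URL again after a delay. The URL /cgi-bin/status and the subroutine name client_pull_header are made up for this example.

```perl
use strict;
use warnings;

# Build headers that tell the browser to load $url after $seconds seconds.
sub client_pull_header {
    my ($seconds, $url) = @_;
    return "Refresh: $seconds; URL=$url\n"
         . "Content-type: text/html\n\n";
}

print client_pull_header(10, '/cgi-bin/status');
print "<p>Still working. This page will be fetched again in 10 seconds.</p>\n";
```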

FastCGI
Extensions to the standard to make the connection more permanent.