Spring Semester 2004


Lecture Notes One: HTML, Apache, HTTP and Unix

Do you Yahoo?

Behind every web address stands a web server, likely an Apache web server, perhaps running under Unix. For every browser request, the reply comes encoded in HTML: the hypertext markup language that allows seamless interconnection of information over the network.

Not all HTML is typed by humans; some HTML pages are the output of programs written by humans. But the encoding is always the same. This class introduces you to Unix, HTML, and the Apache web server. At the end of this week you should be able to:

We start with HTML. To learn HTML you don't need a network connection. You only need a browser. (So one other key question becomes: do you know your browser?) We start with a review of basic HTML, which we summarize below.

1. Basic Document Structure

<html>
  <head><title>This is a title</title></head>
  <body>
    This is my first HTML document. 
  </body>
</html>
2. Attributes and values

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>
    This is my first HTML document. 
  </body>
</html>
3. Font Formatting

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>
    This is my first HTML document. <p>

    This <font color=red>word</font> will appear in red. 

    This one will be 
    <font size=+6>bigger</font> and somewhat 
    <font color="#0066ff">blue</font>. 
  </body>
</html>
4. Headings

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>
    Here are some headings. 

    <h1> Heading One   </h1>
    <h2> Heading Two   </h2>
    <h3> Heading Three </h3>
    <h4> Heading Four  </h4>
    <h5> Heading Five  </h5>
    <h6> Heading Six   </h6>
    
    This is my first HTML document. 
  </body>
</html>
5. Lists

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>

  Here are some of the Pacers of the recent past: 

  <ul>
    <li> Mark Jackson  
    <li> Chris Mullin
    <li> Dale Davis 
    <li> Antonio Davis 
  </ul> 

  The order last year was: 

  <ol>
    <li> Los Angeles Lakers 
    <li> Indiana Pacers 
    <li> New York Knicks 
    <li> Utah Jazz 
  </ol> 

  </body>
</html>
6. Paragraphs, breaks, and preformatted text

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>

    <p> New paragraph. 
    <p> New paragraph. 
    <p> New paragraph. <br> A line break. 
    <br> Another line break. 
    
This text  
will be rendered
normally, on one line. <p> 

<pre>This text  
appears in between 
preformatting tags
and therefore 
will stay
as 
typed.</pre>

    You get the idea. 

  </body>
</html>
7. Images (size, borders and alignment)

<html>
  <head><title>This is a title</title></head>
  <body>
    <img src="http://www.cc.columbia.edu/low3.gif">
  </body>
</html>
You can try to align the picture (perhaps adding some text on the page also):

<img src="http://www.cc.columbia.edu/low3.gif" align=right>

You can try changing the size of the picture:

<img src="http://www.cc.columbia.edu/low3.gif" width=34 height=24>
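
The section heading also mentions borders, which the examples above do not show. A minimal sketch, reusing the same image; the border width of 2 pixels and the alt text are arbitrary choices made for illustration:

```html
<!-- border draws a frame around the image; the value is a width in pixels. -->
<!-- alt supplies text for browsers that cannot display the picture. -->
<img src="http://www.cc.columbia.edu/low3.gif" border=2 alt="A picture">
```
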
7.1 For the fun of it: applets

  1. Here's one (very nice, but old) example.

  2. Here's a more involved one.

8. Links

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>

    <img src="http://www.cs.indiana.edu/classes/a113-dger/left.gif"> Do 
    you <a href="http://www.yahoo.com">Yahoo</a>? <p> 

  </body>
</html>
9. Tables

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>
    <table border cellpadding=6>
      <tr> <td> (1, 1) </td> <td> (1, 2) </td> <td> (1, 3) </td> </tr> 
      <tr> <td> (2, 1) </td> <td> (2, 2) </td> <td> (2, 3) </td> </tr> 
    </table>
  </body>
</html>
Here's an example of cells spanning more than one column and more than one row.

<html>
  <head><title>This is a title</title></head>
  <body bgcolor=white>

    <table border cellpadding=6>
      <tr> <td rowspan=2 align=center> One </td> 
           <td colspan=2 align=center> Two </td> 
      </tr> 
      <tr> <td align=center bgcolor=lightgrey> Three </td> 
           <td align=center> Four </td> 
      </tr> 
    </table> 

  </body>
</html>
10. Frames

A browser normally displays one web page at a time, unless you use frames. Frames let you display more than one web page at once, although together they may appear to be a single page. Frames divide the browser window into sections, each section being a separate HTML document. We will look at frames a bit later, in a specific context.
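
Until then, here is a minimal sketch of a page that splits the window into two side-by-side frames. The file names menu.html and content.html are hypothetical; any two HTML documents in the same directory would do.

```html
<html>
  <head><title>This is a title</title></head>
  <!-- A frameset takes the place of the body: here, two columns. -->
  <frameset cols="25%,75%">
    <!-- Each frame loads its own HTML document (hypothetical file names). -->
    <frame src="menu.html" name="menu">
    <frame src="content.html" name="content">
    <!-- Shown only by browsers that do not display frames. -->
    <noframes>
      <body>Your browser does not display frames.</body>
    </noframes>
  </frameset>
</html>
```
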
11. Forms

I include a fairly comprehensive form, with all the elements discussed in class.

<html>
  <head><title>This is a title</title></head>
  <body>
    <form>
    Username: <input type="text"> <p> 
    Password: <input type="password"> <p> 
    What is the capital of Italy? 
    <blockquote>
    <input type="radio" name="question"> Milan
    <input type="radio" name="question"> Turin
    <input type="radio" name="question"> Rome
    </blockquote>
    Presidents of the United States (check all that apply): 
    <blockquote>
    <input type="checkbox" name="q2"> Ross Perot 
    <input type="checkbox" name="q2"> Bill Clinton 
    <input type="checkbox" name="q2"> George Bush 
    <input type="checkbox" name="q2"> Al Gore 
    </blockquote>
    <select name="capital"> 
      <option> What 
      <option> Milan 
      <option> Rome
      <option> Turin 
    </select> is the capital of Italy. 

</form>
  </body>
</html>
Emphasis is on the GUI aspects of the elements.

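Note that the form is missing some of the attributes needed for CGI: the form has no action or method, the text fields have no name, and the radio buttons and checkboxes have no value.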
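
As a rough sketch of what a CGI-ready version of part of this form might look like: the script path /cgi-bin/quiz, the field names, and the values below are all hypothetical, chosen only for illustration.

```html
<html>
  <head><title>This is a title</title></head>
  <body>
    <!-- action names the program that receives the data (hypothetical path); -->
    <!-- method says how the data is sent. -->
    <form action="/cgi-bin/quiz" method="post">
      <!-- Each field needs a name so the program can tell the values apart. -->
      Username: <input type="text" name="username"> <p>
      Password: <input type="password" name="password"> <p>
      What is the capital of Italy?
      <blockquote>
        <!-- Radio buttons share one name; value is what is sent for the chosen one. -->
        <input type="radio" name="question" value="milan"> Milan
        <input type="radio" name="question" value="turin"> Turin
        <input type="radio" name="question" value="rome"> Rome
      </blockquote>
      <!-- A submit button sends the form data to the action URL. -->
      <input type="submit" value="Send">
    </form>
  </body>
</html>
```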

12. Adding sound and video.

Just one example.
An example of each, as a matter of fact.
13. Cascading Style Sheets

A relatively new way to control the presentation of HTML documents is through the use of cascading style sheets, or simply CSS for short. CSS was first proposed in 1994 and became a W3C recommendation in 1996, but it is only now finding widespread use and browser support. (Sometimes the Internet doesn't move as fast as you'd like.)

Cascading style sheets let you specify precisely how a variety of page elements will be displayed, removing many of the formatting limitations of plain HTML. This applies to font sizes, page positioning, and other page formatting options. CSS also makes possible formatting effects that HTML alone cannot produce. We'll take a look at CSS later in the module.
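
In the meantime, here is a minimal sketch of a style sheet embedded in the head of a page. It reproduces, with CSS rules, some of the effects achieved earlier with bgcolor and font. The class name warning and the particular sizes and colors are arbitrary choices for illustration.

```html
<html>
  <head>
    <title>This is a title</title>
    <style type="text/css">
      /* Each rule applies to every matching element on the page. */
      body { background-color: white; }
      h1   { color: #0066ff; font-size: 24pt; }
      /* .warning is a hypothetical class name. */
      .warning { color: red; }
    </style>
  </head>
  <body>
    <h1>Heading One</h1>
    This <span class="warning">word</span> will appear in red.
  </body>
</html>
```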

14. Creating image maps

We'll also work through a few image map examples a bit later.
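
For now, a minimal sketch of a client-side image map, in which the left and right halves of one picture link to different pages. The image campus.gif, its size, the map name, and the two target pages are all hypothetical.

```html
<html>
  <head><title>This is a title</title></head>
  <body>
    <!-- usemap ties the image to the map with the matching name. -->
    <img src="campus.gif" width="200" height="100" usemap="#campus" alt="Campus">
    <map name="campus">
      <!-- coords for a rect are left, top, right, bottom, in pixels. -->
      <area shape="rect" coords="0,0,99,99" href="left.html" alt="Left half">
      <area shape="rect" coords="100,0,199,99" href="right.html" alt="Right half">
    </map>
  </body>
</html>
```
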
15. Comments

<html>
  <head><title>This is a title</title></head>
  <body>
    <!-- This is a comment. -->
  </body>
</html>
16. Validators and other resources

Here are some typical resources (I will add more later):
So how does all of this work?

There are thousands of web servers throughout the world (wide web) but they are all accessible from any browser because they have all agreed to use a common protocol: the Hypertext Transfer Protocol (HTTP). HTTP is based on an exchange of requests and responses.

Each request can be thought of as a command, or action, which is sent by the browser to the server to be carried out. The server performs the requested service and returns its answer in the form of a response.

The components of a simple WWW interaction are the user, the client, and the server. The client acts as an intermediary between the user and the server.

Steps 1-7 detail the basic information flow in a simple HTTP transaction. Essentially the client requests a file and the server delivers it. The entire HTTP process takes place as a result of simple transactions of requests and responses.

  1. The user sees an interesting URL
    http://burrowww.cs.indiana.edu:17600/hello.html
    and clicks the hyperlink or types the URL into the browser.

  2. The browser interprets this command: it is different from printing, creating a bookmark, saving a file, changing any preferences, etc. This command (the equivalent of Open Page in Netscape) says that the computer
    burrowww.cs.indiana.edu
    needs to be contacted on port 17600 and that the /hello.html file is needed.

    For this the browser uses the HTTP GET command (not shown here; we'll look at how this works when we simulate this request process using telnet). The path to the requested file is relative to the server's document root.

  3. The browser sends the GET request to the server, indicating what file it needs. This request travels over the Internet, going from computer to computer until it reaches the web server's host, which is
    burrowww.cs.indiana.edu
    part of the CSCI burrow cluster.

    There's a network security aspect here that we will need to address later.

  4. The server receives and parses the request. It uses the file extension (.html) to determine the type of information in the file. The .html extension means that the server will send the file back to the browser, but it will first say:
    this file's Content-type: text/html

    You do not have to write this in the file; it is inferred by the server from the file's extension. But the server does send this information to the browser as part of the header, followed by the data (the actual file) as explained below.

  5. An HTTP response goes from the server to the client. The headers that are part of the message indicate that the request was OK and that the data returned is of
    Content-type: text/html
    The headers are then followed by (a blank line and then by) the HTML data itself.

  6. The Content-type: part of the header tells the browser that the data is text formatted in HTML, so the browser renders the text appropriately, highlighting hyperlinks, etc.

  7. The user views the HTML output and has the opportunity to select another hyperlink, starting the cycle over again.
So how do we install a web server?

There are a number of things we need to do before installation:

  1. Create a user account on burrowww.cs.indiana.edu

  2. Familiarize ourselves with Unix

  3. Check the documentation on installation of Apache

The lectures and the lab this week will clarify all these issues.


Last updated: Jan 15, 2004 by Adrian German for A348/A548