Spring Semester 2002


Lecture Notes One: What computers can do and how. Programming languages. Java.
Computers can do lots of things. They can add millions of numbers in the twinkling of an eye. They can outwit chess grandmasters. They can guide weapons to their targets. They can book you onto a plane between a guitar-strumming nun and a non-smoking physics professor. Some can even play the bongoes. That's quite a variety!

What does the inside of a computer look like? Crudely, it will be built out of a set of simple, basic elements. Once you get down to the guts of computers you find that, like people, they tend to be more or less alike. They can differ in their functions, and in the nature of their inputs and outputs -- one can produce music, another a picture, while one can be set running from a keyboard, another by the torque from wheels of an automobile -- but at heart they are very similar.

Viewed this way, the variety in computers is a bit like the variety in houses: a Beverly Hills condo might seem entirely different from a garage in Yonkers, but both are built from the same things -- bricks, mortar, wood, sweat -- only the condo has more of them, and arranged differently according to the needs of the owner. At heart they are very similar. For today's computers to perform a complex task, we need a precise and complete description of how to do that task in terms of a sequence of simple basic procedures. This instructing has to be exact and unambiguous. In life, of course, we never tell each other exactly what we want to say; we never need to, as context, body language, familiarity with the speaker, and so on, enable us to "fill the gaps" and resolve any ambiguities in what is said.

Computers, however, can't yet "catch on" to what is being said, the way a person does. They need to be told in excruciating detail exactly what to do. I can see the emphasis.

Perhaps one day we will have machines that can cope with approximate task descriptions, but in the meantime we have to be very prissy about how we tell computers to do things. A computer program tells a computer, in minute detail, the sequence of steps that are needed to fulfill a task. The act of designing and implementing these programs is called computer programming. In this course, you will learn how to program a computer -- that is, how to direct a computer to execute tasks.

Today's computer programs are so sophisticated that it is hard to believe that they are all composed of extremely primitive operations. Only because a program contains a huge number of such operations, and because the computer can execute them at great speed, does the computer user have the illusion of smooth interaction.
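Here is a small sketch of that idea. It is not part of the original notes, and the names AddMillion and sumUpTo are made up for illustration. The Java notation itself is explained later; for now, just notice that the program does nothing cleverer than add one number at a time, a million times over.

```java
public class AddMillion {
    // Adds the whole numbers 1, 2, ..., n, one primitive addition per step.
    public static long sumUpTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total = total + i;
        }
        return total;
    }

    public static void main(String[] args) {
        // One million additions; prints 500000500000
        System.out.println(sumUpTo(1000000));
    }
}
```

On a typical machine the answer appears with no noticeable delay, which is the "illusion of smooth interaction" at work.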

"He does it very stupidly, but he also does it very quickly and that's the point of all this: the inside of a computer is as dumb as hell but it goes like mad!" It can perform very many millions of simple operations a second and is just like a very fast (but) dumb file clerk.

To use a computer you do not need to do any programming. You can drive a car without being a mechanic and toast bread without being an electrician. Many people who use computers every day in their careers never need to do any programming.

Of course, a professional computer scientist or software engineer does a great deal of programming. Since you are taking this first course in computer science, it may well be your career goal to become such a professional. To write a computer game with motion and sound effects or a word processor that supports fancy fonts and pictures is a complex task that requires a team of many highly skilled programmers. Your first programming efforts will be more mundane.

The concepts and skills you learn in this course form an important foundation, and you should not be disappointed if your first programs do not rival the sophisticated software that is familiar to you. Actually, you will find that there is an immense thrill even in simple programming tasks. It is an amazing experience to see the computer precisely and quickly carry out a task that would take you hours of drudgery, to make small changes in a program that lead to immediate improvements, and to see the computer become an extension of your mental powers.

To understand the programming process, you need to have a rudimentary understanding of the building blocks that make up a computer. At the heart of the computer lies the central processing unit (CPU). All data must travel through the CPU whenever it is moved from one location to another. (There are a few technical exceptions to this rule; some devices can interact directly with memory.)

The computer stores data and programs in memory. There are two kinds of memory. Primary storage is fast but expensive; it is made from memory chips: so-called random-access memory (RAM) and read-only memory (ROM). Secondary storage, usually a hard disk, provides less expensive storage that persists without electricity.

You will often use another kind of magnetic storage device: a so-called floppy disk or diskette. Originally floppy disks had a fairly low capacity, but recently high-capacity floppies such as the Zip disk and the Superdisk have become popular. Because a floppy disk is not an integral part of the computer system, it is called an external storage device.

Floppies are OK, but audio and video information takes up much more space than a floppy disk can provide. That kind of information is typically distributed on a CD-ROM or DVD (digital versatile disk).

To store large amounts of user data, data tapes are commonly used. Data tapes are inexpensive and can hold lots of information, but they are slow.

Some computers are self-contained units, whereas others are interconnected through networks. Through the network cabling, the computer can read programs from central storage locations or send data to other computers. For the user of a networked computer it may not even be obvious which data reside on the computer itself and which are transmitted through the network.

To interact with a human user, a computer requires other peripheral devices. The computer transmits information to the user through a display screen, loudspeakers, and printers. The user can enter information and directions to the computer by using a keyboard or a pointing device such as a mouse.

The central processing unit, RAM, and the electronics controlling the hard disk and other devices are interconnected through a set of electrical lines called a bus. When programming in Java we won't have anything to do with the bus directly, but when a program is started, it is brought into main memory, from which the CPU can read it. The CPU reads the program an instruction at a time. As directed by these instructions, the CPU reads data, modifies them, and writes them back to RAM or to the hard disk.

Some program instructions will cause the CPU to place dots on the display screen or printer or to vibrate the speaker. As these actions happen many times over and at a great speed, the human user will perceive images and sound. Some program instructions read user input from the keyboard or mouse. The program analyzes the nature of these inputs and then executes the next appropriate instructions.

On the most basic level, computer instructions are extremely primitive. CPUs from different vendors, such as the Intel Pentium or the Sun SPARC, have different sets of machine instructions. To enable Java programs to run on multiple CPUs without modification, most Java compilers generate a set of machine instructions for a so-called "Java virtual machine", an idealized CPU that is then simulated by a program run on the actual CPU.

The difference between actual and virtual machine instructions is not important to us -- all you need to know is that machine instructions are very simple and can be executed very quickly. Java is a high-level programming language. In Java the programmer expresses the idea behind the task that needs to be performed in a language that resembles both natural language (somewhat) and (to a greater extent) mathematics. Then a special computer program, called a compiler, translates the higher-level description into machine instructions (called bytecode) for the Java virtual machine.

Compilers are sophisticated programs. Thanks to them, programming languages are independent of a specific computer architecture. Still, programming languages are human creations, and as such they follow certain conventions. To ease the translation process, those conventions are much stricter than they are for human languages.

When you talk to another person, and you scramble or omit a word or two, your conversation partner will usually still understand what you have to say. Compilers are less forgiving.

Just as there are many human languages, there are many programming languages. This provides a useful source of analogy. Let me ask you this: which is the best language for describing something? Say: a four-wheeled gas-driven vehicle.

Of course, most languages, at least in the West, have a simple word for this: we have "automobile", the English say "car", the French "voiture", and so on. However, there will be some languages which have not evolved a word for "automobile", and speakers of such tongues would have to invent some, possibly long and complex, description for what they see, in terms of their basic linguistic elements.

Yet none of these descriptions is inherently "better" than any of the others: they all do their job, and will only differ in efficiency. We needn't introduce democracy just at the level of words. We can go down to the level of alphabets.

What, for example, is the best alphabet for English? That is, why stick with our usual 26 letters? Everything we can do with these, we can do with three symbols -- the Morse code, dot, dash, and space;

... or two -- a Baconian cipher, with A through Z represented by five-digit binary numbers. So we see that we can choose our basic set of elements with a lot of freedom, and all this choice really affects is the efficiency of our language, and hence the sizes of our books; there is no "best" language or alphabet -- each is logically universal, and each can model any other.
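The two-symbol alphabet can be tried out in a few lines of Java. This is only a sketch under a simplifying assumption: the letters A through Z are numbered 0 through 25 and each number is written with five binary digits. The table of the historical Baconian cipher may differ in detail, and the names FiveBitAlphabet, encodeLetter and encodeWord are made up for illustration.

```java
public class FiveBitAlphabet {
    // Encodes one letter A..Z as five binary digits: A -> 00000, B -> 00001, ...
    // Assumes the argument really is a letter from A to Z (either case).
    public static String encodeLetter(char letter) {
        int index = Character.toUpperCase(letter) - 'A'; // 0 for A, 25 for Z
        String bits = Integer.toBinaryString(index);
        while (bits.length() < 5) {
            bits = "0" + bits; // pad on the left to exactly five digits
        }
        return bits;
    }

    // Encodes a whole word, separating the letters with single spaces.
    public static String encodeWord(String word) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(encodeLetter(word.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints 01001 00000 10101 00000
        System.out.println(encodeWord("JAVA"));
    }
}
```

Four letters became twenty binary digits: nothing was lost, the text just got longer. That is the efficiency trade-off described above.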

Same with programming languages, and Java is no exception. Like C (another popular programming language), the Java language arose from the ashes of a failing project. In the case of Java, the situation was an anticipated market that failed to materialize.

Imagine that you've worked on a highly ambitious state-of-the-art electronics project for two years. Against all odds, you have built a working prototype of the hardware driven by a custom-developed programming language. It's a hand-held device that can control consumer electronics like TV-top boxes for interactive cable. A TV-top box is the name given to the extra electronic gizmo you'd need to decode the signal when they bring 600 channels of cable to your house.

So what you have is an extremely intelligent remote control, possibly providing two-way communication from your house to the cable company. The only problem is that at that time (1992) people were beginning to realize that there was no market for interactive cable service. That was the situation in which the Sun R&D team found themselves. Despite wooing potential customers like Mitsubishi, France Telecom, and Time-Warner, the orders either went elsewhere, or did not materialize at all. By mid-1993 interactive TV was a big expensive bust and everyone knew it.

Mid-1993 was also when the first Mosaic browser came out, although few people had yet paid any attention to it. Funding for the box project was about to be cut, and the team (led by James Gosling and Patrick Naughton) disbanded to other projects, when Bill Joy and Eric Schmidt conceived the idea of dropping the hardware, and adapting the software to work smoothly with the Internet.

Sun executive Phil Samper was persuaded to fund further development for one year to the tune of $5 million. Everything was to revolve around the WWW. Whatever the software project evolved into, it had to feature the Web as its focal point. By now it was 1994. Gosling and a small number of colleagues set to work on this new challenge. They worked at a furious pace and in great secrecy throughout the year. By Christmas they had a working translator, the key libraries, and a web browser as a proof of concept. In January 1995 the language was named "Java".

The HotJava browser, which was shown to an enthusiastic crowd at the SunWorld exhibition in 1995, had one unique property: It could download programs, called applets, from the web and run them. Applets let web developers provide a variety of animation and interaction and can greatly extend the capabilities of the web page. In 1996 both Netscape and Microsoft supported Java in their browsers. Since then Java has grown at a phenomenal rate.

Programmers have embraced the language because it is simpler than its closest rival, C++. In addition to the programming language itself, Java has a rich library that makes it possible to write portable programs that can bypass proprietary operating systems. At this time Java has already established itself as one of the most important languages for general-purpose programming as well as for computer science instruction. Was Java designed for beginners?

No. Java is an industrial language. And because Java was not specifically designed for students, no thought was given to making it really simple to write basic programs. A certain amount of technical machinery is necessary in Java to write even the simplest programs. To understand what this technical machinery does, you need to know something about programming.

This is not a problem for a professional programmer with prior experience in another programming language, but not having a linear learning path is a drawback for the student. As you learn how to program in Java, there will be times when you will be asked to be satisfied with a preliminary explanation and wait for complete details in a later chapter.

Furthermore, you cannot hope to learn all of Java in one semester. The Java language itself is relatively simple, but Java contains a vast set of library packages that are necessary to write useful programs. There are packages for graphics, user interface design, cryptography, networking, sound, database storage, and many other purposes. Even expert Java programmers do not know the contents of all the packages -- they just use those that are needed for particular projects.
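To give a taste of what "using a package" looks like, here is a minimal sketch that borrows a list and a sorting routine from the standard java.util package instead of writing them from scratch. The class name PackageDemo and the method sortedCopy are made up for illustration, and the notation is explained later in the course.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PackageDemo {
    // Returns a new list with the same words in alphabetical order.
    // The sorting itself is done by the library, not by our code.
    public static List<String> sortedCopy(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        words.add("pear");
        words.add("apple");
        words.add("fig");
        // prints [apple, fig, pear]
        System.out.println(sortedCopy(words));
    }
}
```

The three import lines at the top name the library pieces the program relies on; everything else is ordinary Java.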

Taking this class, you should expect to learn a good deal about the Java language and about the most important packages. Keep in mind though that the purpose of this course is not to make you memorize Java minutiae, but to teach you how to think about programming.

All right, let's see a program written in Java.
How about this one?
public class Hello 
{ public static void main(String[] args) 
  { System.out.println("Hello, and welcome to A201!"); 
  } 
}

What can it do? It displays a simple greeting.

I'd like to see that. You need to create a program file, compile it and then run it.

Here's the session in Unix:
frilled.cs.indiana.edu%pico Hello.java
frilled.cs.indiana.edu%javac Hello.java
frilled.cs.indiana.edu%java Hello
Hello, and welcome to A201!
frilled.cs.indiana.edu%
What's pico?

It's a Unix editor. That's how it all gets started. You enter the program statements into a text editor. The editor stores the text and gives it a name such as Hello.java ... which you then compile.

Yes, with javac. When you compile your program, the compiler translates the Java source code (that is the text, or statements that you wrote) into so-called bytecode, ... which consists of virtual machine instructions and some other pieces of information on how to load the program into memory prior to execution.

The bytecode for a program is stored in a separate file with the extension .class. For example, the bytecode for the program we wrote will be stored in Hello.class, and you should look for this file on your system after compilation. What's frilled?

Just the prompt on the Unix machine we were on at the time. On your computer it might be C:\ or some such thing. What's next? The Java bytecode file contains the translation of your program in Java virtual machine terms.

A Java interpreter loads the bytecode of the program you wrote, starts your program, and loads the necessary library bytecode files as they are required. That's java.

Yes. Your programming activity centers around these steps. You start in the editor, writing the source file. You compile the program and look at the error messages. You go back to the editor and fix the syntax errors. When the compiler succeeds, you run the executable file. If you find an error, you try to debug your program to find the cause of the error. Once you find the cause of the error, you go back to the editor and try to fix it. You compile and run again to see whether the error has gone away. If not, you go back to the editor.

This is called the edit - compile - debug loop, and you will spend a substantial amount of time in this loop in the months and years to come. Let me draw a picture of that.
That's called a flowchart. I know...
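
Here is one turn around that loop, as a sketch. The example is made up for illustration (the names AverageDemo, averageBuggy and averageFixed are not from the course). The first method compiles without complaint, so the compiler is no help; only running the program reveals that the answer is wrong. A trip back to the editor produces the second method.

```java
public class AverageDemo {
    // First attempt: compiles fine, but dividing two whole numbers
    // throws away the fraction before the result becomes a double.
    public static double averageBuggy(int a, int b) {
        return (a + b) / 2;
    }

    // After debugging: dividing by 2.0 keeps the fraction.
    public static double averageFixed(int a, int b) {
        return (a + b) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(averageBuggy(3, 4)); // prints 3.0, not what we wanted
        System.out.println(averageFixed(3, 4)); // prints 3.5
    }
}
```

Syntax errors are caught by javac; errors like this one are caught only by running the program and checking its output.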


Last updated: Jan 9, 2002 by Adrian German for A201