|
Spring Semester 2007
|
Lecture Notes Eight: Principles of Database Design. More Examples.
Usually somebody has (or anticipates having) a significant amount of data.
They then want to be able to access it on a daily (or perhaps more often) basis,
efficiently: to update it, or to extract information from the existing data.
|
The first
step (always) is to imagine yourself keeping track of this information using just paper
and pencil and assuming you have infinite precision and endurance. You then design a
database.
|
Designing a database is a fairly well established process by now. We are going to illustrate
an information design methodology through a couple of examples, very much like we did last time.
|
The information-level design methodology we promote involves representing individual
user views, refining them to eliminate any problems, and then merging them into a cumulative design.
|
After you have represented and merged all user views you can complete the cumulative design for the entire database.
|
Let's see some examples.
|
Ray Henry, the owner of a bookstore chain named Henry Books, gathers and organizes
information about branches, publishers, authors and books. Each branch has a number that uniquely identifies the
branch. In addition, Ray tracks the branch's name, location and number of employees. Each publisher has a code that uniquely
identifies the publisher. In addition, Ray tracks the publisher's name and city. The only user for the Book database is Ray, but you
don't want to treat the entire project as a single user view. Ray has provided you with all the reports the system must produce, and you
will treat each report as a user view. Ray has given you the following requirements:
- User View 1 Requirements: For each publisher, list the publisher code, publisher name, and the city in which the publisher is located
- User View 2 Requirements: For each branch, list the number, name, location and number of employees
- User View 3 Requirements: For each book, list its code, title, publisher code and name, price and whether it is a paperback
- User View 4 Requirements: For each book, list its code, title, price and type. In addition, list the book's authors and their names. If a
book has more than one author, all names must appear in the order in which they are listed on the book's cover. The author order is not always
alphabetical.
- User View 5 Requirements: For each branch, list its number and name. In addition, list the code and title of each book currently in the
branch
as well as the number of copies the branch has available
- User View 6 Requirements: For each book, list its code and title. In addition, for each branch that currently has the book in stock, list
the number
and name of the branch along with the number of copies available.
To transform each user view into DBDL (database design language), examine the requirements and create the necessary entities, keys, and relationships.
Let's take them one by one.
|
User View 1: For each publisher, list the publisher code, publisher name, and the city in which the publisher is located. The only
entity in this user view Publisher, as follows:
|
Publisher (PublisherCode, PublisherName, City)
This table is in third normal form; the primary key is PublisherCode. There are no alternate or foreign keys.
|
Assume that Ray wants to be able to access a publisher rapidly on the basis of its name. You'll need to specify the PublisherName column as a secondary key.
|
Because this is the first user view, there is no previous cumulative design.
|
Thus, at this point, the new cumulative design will consist only of the design for this user view, as shown below. There is no need for an E-R diagram at this point.
|
Publisher (PublisherCode, PublisherName, City)
SK PublisherName
User View 2: For each branch, list the number, name, location, and number
of employees. The only entity in this user view is Branch, and it is written
as follows:
|
The table is in third normal form. The primary key is BranchNum, and there are no alternate or foreign keys.
|
Branch (BranchNum, BranchName, BranchLocation, NumEmployees)
Ray wants to be able to access a branch rapidly on the basis of its name, so you;ll make the BranchName column a secondary key.
|
Because there is no table in the cumulative design with the BranchNum column as its primary key, you can add the Branch table to the cumulative design during
the merge step, as shown below. Again, there is no need for an E-R diagram with this simple design.
|
Publisher (PublisherCode, PublisherName, City)
SK PublisherName
Branch (BranchNum, BranchName, BranchLocation, NumEmployees)
SK BranchName
User 3 View: For each book list its code, title, publisher code and name, price and whether it is paperback.
|
To satisfy this user requirement, you'll need to create entities for publishers and books and establish a one-to-many relationship between them.
This leads to the following:
|
Publisher (PublisherCode, PublisherName)
Book (BookCode, Title, PublisherCode, Price, Paperback)
The PublisherCode column in the Book table is a foreign key identifying the publisher. Merging these tables
with the ones you already created does not add any new columns to the Publisher table, but it does add columns
to the Book table.
|
The result of merging the Book table with the cumulative design is shown below. Assuming that Ray will need to access
books based on their titles, you'll designate the Title column as a secondary key.
|

User View 4: For each book, list its code, title, price, and type. In addition, list the book's authors
and their names. If a book has more than one author, all names must appear in the order in which they are
listed on the book's cover.
|
There are two entities in the user view for books and authors. The relationship between them is many-to-many (one author can write many books and one book
can have many authors). Creating tables for each entity and the relationship between them gives:
|
Author (AuthorNum, AuthorLast, AuthorFirst)
Book (BookCode, Title, Price)
Wrote (BookCode, AuthorNum, Sequence)
The Author and Wrote tables are new; merging the Book table adds nothing new.
|
Because it may be important to find an author based on the author's last name, the AuthorLast column is a secondary key. The result of the merge step
is shown below.
|

User View 5: For each branch, list its number and name. In addition, list the code and title of each book currently in the branch as well as the number of
copies the branch has available.
|
Suppose you decide that the only entity mentioned in this requirement was for information about branches. You would create the following table:
|
Branch (
You would then add the BranchNum column as the primary key, producing the following
|
The other columns include the branch name as well as the book code, book title, and number of units on hand.
|
Branch (BranchNum,
Because a branch will have several books, the last three columns will form a repeating group.
|
Thus, you have the following:
|
Branch (BranchNum, BranchName, (BookCode, Title, OnHand))
You convert this table to first normal form by removing the repating group and expanding the primary key.
This gives:
|
|
Branch (BranchNum, BranchName, BookCode, Title, OnHand)
|
In this table, you have the following functional dependencies:
BranchNum ---> BranchName
BookCode ---> Title
BranchNum, BookCode ---> OnHand
|
The table is not in second normal form because some columns depend on just a portion of the primary key.
Converting to second normal form gives:
|
You can name the new table Inventory because it represents each branch's inventory.
In the Inventory table, the BranchNum column is a foreign key that identifies the Branch table,
and the BookCode column is a foreign key that identifies the Book table.
|
Branch (BranchNum, BranchName)
Book (BookCode, Title)
Inventory (BranchNum, BookCode, OnHand)
In other words, in order for a row to exist in the Inventory table, both the branch number and the book
code must already be in the database.
|
You can merge this Branch table with the existing Branch table without adding any new columns or relationships to the database.
After adding the Inventory table to the existing cumulative design, you have the design shown below.
|

How would the design for this user view have turned out if you began with two entities, Branch and Book, instead of just the single entity Branch?
|
In the first step, you would create the following tables:
Branch (
Book (
|
Adding the primary keys would give:
Branch (BranchNum,
Book (BookCode,
|
Adding the other columns would give:
Branch (BranchNum, BranchName)
Book (BookCode, Title)
|
Finally, you have to implement the relationship between
the Branch and Book tables. Because a branch can have many
books and a book can be in stock at many branches, the relationship is many to many.
|
To implement a many to many relationship, you add a new table whose primary key is the combination
of the primary keys of the other tables. Doing this, you produce the following:
Branch (BranchNum, BranchName)
Book (BookCode, Title)
Inventory (BranchNum, BookCode)
|
Finally, you add any column that depends on both the BranchNum and PublisherCode columns to the Inventory table, giving:
|
Thus, you end up with exactly the same collection of tables, which illustrates a point made earlier: There's more than one way
of arriving at the correct result.
|
Branch (BranchNum, BranchName)
Book (BookCode, Title)
Inventory (BranchNum, BookCode, OnHand)
User View 6: For each book, list its code and title. In addition, for each branch currently having
the book in stock, list the number and name of each branch along with the number of copies available.
|
This user view leads to precisely the same set of tables that were created for User View 5.
|
You have satisfied all the requirements, and the design shown in the preceding picture represents the complete information-level design.
|
Here's the database, with data, and an exercise for you: can you phrase the following queries in English?
|
SELECT AuthorNum, AuthorLast
FROM Author;
SELECT *
FROM Branch;
SELECT PublisherName
FROM Publisher
WHERE City='Boston';
SELECT PublisherName
FROM Publisher
WHERE NOT City='Boston';
SELECT BranchName
FROM Branch
WHERE NumEmployees>=9;
SELECT BookCode, Title
FROM Book
WHERE Type='SFI';
SELECT BookCode, Title
FROM Book
WHERE Type='SFI'
AND Paperback;
or
SELECT BookCode, Title
FROM Book
WHERE Type='SFI'
AND Paperback=Yes;
or
SELECT BookCode, Title
FROM Book
WHERE Type='SFI'
AND Paperback=True;
SELECT BookCode, Title
FROM Book
WHERE Type='SFI'
OR PublisherCode='PE';
SELECT BookCode, Title, Price
FROM Book
WHERE Price BETWEEN 5 and 10;
SELECT BookCode, Title
FROM Book
WHERE Type='FIC'
AND Price<10;
SELECT BookCode, Title, .85 * Price AS DiscountedPrice
FROM Book;
SELECT BookCode, Title
FROM Book
WHERE Type IN ('SFI', 'HOR', 'ART');
SELECT BookCode, Title, PublisherCode
FROM Book
ORDER BY PublisherCode, Title;
SELECT COUNT(*)
FROM BOOK
WHERE Type='SFI';
SELECT AVG(Price)
FROM Book;
SELECT BookCode, Title, Book.PublisherCode, PublisherName
FROM Book, Publisher
WHERE Book.PublisherCode=Publisher.PublisherCode;
SELECT Title, Price
FROM Book, Publisher
WHERE Book.PublisherCode=Publisher.PublisherCode
AND PublisherName='Taunton Press';
SELECT Title, BookCode
FROM Book, Publisher
WHERE Book.PublisherCode=Publisher.PublisherCode
AND PublisherName='Putnam Publishing Group'
AND Price>15;
SELECT BookCode, Title, PublisherCode, Price
INTO Fiction
FROM Book
WHERE Type='FIC';
UPDATE Fiction
SET Price=14.50
WHERE Price=14.00;
DELETE
FROM Fiction
WHERE PublisherCode='VB';
Updated by Adrian German for A348/A548