Spring Semester 2004


Consider the following problem:

You are interested in organizing a chess tournament. This is a big money opportunity so if you do it right you can retire a wealthy person. You're also in for the art of it, so you decide to make it a memorable event by inviting big names, and scheduling the events across the nation, in its finest venues, biggest halls, most exquisite establishments.

Here's the kind of data you will have to store:

Round one:

Jordan – Barkley 1 - 0 (Soldier Field)
Miller – Bird 0 - 1 (Madison Square Garden)
Kukoc – Duncan ½ - ½ (Wrigley Field)

Statistics (and general information) related to round one:

Round two:
Duncan – Barkley ½ - ½ (Woodburn Hall)
Kukoc – Bird 0-1 (Staples Center)
Miller – Jordan ½ - ½ (RCA Dome)

Round three:
Bird – Jordan 1- 0 (Rawles Arena)
Kukoc – Barkley 1-0 (Soldier Field)
Duncan – Miller ½ - ½ (Madison Square Garden)

Round four:

Miller – Barkley ½ - ½ (Wrigley Field)
Duncan – Bird 0-1 (Woodburn Hall)
Jordan – Kukoc 1-0 (Staples Center)

You need to design a database to store this data in it.

Here's a hint on how you might do it:

You need to store addresses for the players, venues, and customers, as well. Example:

username first name last name street address city state rate per match
cbarkley Charles Barkley Sir Charles Avenue Phoenix AZ $450.00
lbird Larry Bird Freedom Lick Rd. Naples FL $350.00
mjordan Michael Jordan Jordan Air Way Chicago IL $500.00
rmiller Reggie Miller Miller Time Road Indianapolis IN $150.00
tduncan Tim Duncan Duncan at the Tim Houston TX $210.00
tkukoc Toni Kukoc Petar Zrinski Street New Zagreb PA $50.00
Here's another example:
customer id logo (picture) address city state
Illinois IL
Iowa IO
IU Bloomington IN
Michigan MI
Purdue University IN
Stanford University CA
U of Minnesota MN
And the last example:
location address city state leasing rate capacity
Madison Square Garden Garden Center at the City New York NY $1,200.00 4,000
Rawles Arena Math Magic Alley Las Vegas NV $740.00 3,400
RCA Dome Hoosier Center Avenue Indianapolis IN $230.00 1,800
Soldier Field One Soldier Field Avenue Chicago IL $800.00 2,800
Staples Center One Big Office Supplies Ave. Los Angeles LA $500.00 3,200
Woodburn Hall Woodburn Avenue Atlanta GA $600.00 2,000
Wrigley Field Wrigley Shore Drive Chicago IL $800.00 1,200
You then need to write the following queries in SQL:

(But since most are provided you just need to

  1. check them and then
  2. reformulate them,
  3. explaining how the calculation goes,
  4. in English).

1. List all the players and their hometowns and rates.

select username, 
       [first name], 
       [last name], 
       city, 
       state, 
       [player rate per match]
from players;

2. List all the locations available for all events with the leasing rate.

List locations even if no games are scheduled at that location yet.

select location, 
       [leasing rate]
from venues

3. List all the players that charge more than $250 per match.

select username, [player rate per match] 
from players
where [player rate per match] > 250

4. List all the matches whose ticket prices are above $7.00

select [match id]
from matches
where [ticket price] > 7

5. List all the matches with number of tickets sold (per match).

select [match id], sum([how many tickets])
from enrollment
group by [match id]

6. List the most expensive player(s)?

select username
from players 
where [player rate per match] = (select max([player rate per match]) from players)
(Assume Bird and Barkley cost as much as Jordan).

7. What's the venue with the smallest capacity?

select location 
from venues 
where capacity = (select min(capacity) from venues);
(Same as above, assume Wrigley, Woodburn and RCA Dome are the same size).

8. List all the matches, number of tickets sold, and the capacity of the venue for each game.

select enrollment.[match id], sum([how many tickets]) as sold, capacity
from venues, matches, enrollment
where enrollment.[match id] = matches.[match id] and matches.[venue] = venues.[location]
group by enrollment.[match id], capacity
order by sum([how many tickets]) desc;

9. List all the matches and the percentage of seats they sold. What's the match that had the least percent of seats sold?

SELECT matches.[match id], 
       SUM([how many tickets]) / capacity AS percentage
FROM enrollment, matches, venues
WHERE enrollment.[match id] = matches.[match id] 
      AND venues.location = matches.venue
GROUP BY matches.[match id], capacity
HAVING SUM([how many tickets]) / capacity  = ( SELECT MAX (percentage)
                                               FROM ( SELECT matches.[match id], 
                                                             SUM([how many tickets]) / capacity AS percentage
                                                      FROM enrollment, matches, venues 
                                                      WHERE enrollment.[match id] = matches.[match id] 
                                                            AND venues.location = matches.venue
                                                      GROUP BY matches.[match id], capacity
                                                    )
                                             );
If you look closely you will notice a pattern.

10. What is the total cost of the players (per round)?

SELECT sum( [player rate per match] ), [date/round]
FROM players, participation, matches
WHERE players.username = participation.username and participation.[match id] = matches.[match id]
GROUP BY [date/round];

11. What's the average cost of player (per match)? (Does it matter that the roster could be bigger than the invited players?)

SELECT avg([player rate per match])
FROM players, participation
WHERE players.username = participation.username;

12. Which players cost above average? (Take the average from the prices listed in players).

SELECT username
FROM players
WHERE [player rate per match] > (select avg([player rate per match]) from players);

13. List the number of players that cost above average.

SELECT COUNT(*)
FROM ( SELECT username
       FROM players
       WHERE [player rate per match] > ( SELECT AVG([player rate per match]) 
                                         FROM players
                                       )
     )

14. List all universities and the number of games they bought tickets to.

select university, count([match id])
from enrollment 
group by university
(How would you identify the universities that bought tickets to ALL/NONE of the games?)

That's definitely an interesting question (with two possible answers).

15. List the games attended by Indiana University students.

SELECT [match id]
FROM enrollment
WHERE university = "IU Bloomington";

16. List the average attendance of Indiana Univ. students per game.

SELECT university, avg([how many tickets])
FROM enrollment
GROUP BY university
HAVING university="IU Bloomington";

17. List the total amount of money IU students paid for tickets.

SELECT sum([how many tickets] * [ticket price]), university
FROM enrollment, matches
WHERE enrollment.[match id] = matches.[match id]
GROUP BY university;
(Watch out, the query I gave you might just be slightly wrong.) Not here, in the other database.

18. Calculate the entire amount spent on players up to (and including) round four (not five).

SELECT sum([player rate per match]), [date/round]
FROM players, participation, matches
WHERE players.username=participation.username AND participation.[match id]=matches.[match id]
GROUP BY [date/round];

19. Calculate the entire amount collected on tickets sold to matches up to (and including) round four.

SELECT sum([ticket price]*[how many tickets]), [date/round]
FROM matches, enrollment
WHERE enrollment.[match id]=matches.[match id]
GROUP BY [date/round];
(What if the question were to ask up to an earlier round, or selected rounds?) Exactly.

20. Calculate the total amount spent on leasing the rooms.

SELECT sum([leasing rate])
FROM matches, venues
WHERE matches.[venue] = venues.[location];

21. Calculate the net profit up to (and including) round four.

(Difference between income from tickets and the sum of expenses with players and rent).

It's better to calculate (or, take a look at) the next one first.

22-23. Calculate the profit per match and round (two queries).

SELECT sum([per match]) AS total_revenue

FROM (
 
      SELECT result.[match id], 
             [tickets]-sum([player rate per match])-[leasing rate] as [per match]
      FROM ( 
             
             SELECT participation.[match id], 
                    players.[player rate per match], 
                    sum([ticket price]*[how many tickets]) as tickets,
                    venues.location,
                    venues.[leasing rate], 
                    matches.[date/round]
             FROM venues 
                  INNER JOIN (spectators 
                              INNER JOIN (players 
                                          INNER JOIN ( (matches 
                                                        INNER JOIN enrollment 
                                                        ON matches.[match id] = enrollment.[match id])
                                                        
                                                        INNER JOIN participation 
                                                        ON matches.[match id] = participation.[match id]) 
                                                                                              
                                          ON players.username = participation.username) 
                                                                 
                              ON spectators.[customer id] = enrollment.university) 
                          
                   ON venues.location = matches.venue
                                
             GROUP BY participation.[match id], 
                      players.[player rate per match], 
                      venues.location, 
                      venues.[leasing rate], 
                      matches.[date/round]
                  
           )  AS result
             
      GROUP BY [result].[match id], 
               result.tickets, 
               result.[leasing rate]

). AS another_result; 

24-25. Calculate the average profit per match and round (two queries).

The key is this query:
SELECT 
             participation.username,
             participation.[match id] as game, 
             players.[player rate per match], 
             venues.location, 
             venues.[leasing rate],
             matches.[date/round],
             sum([ticket price]*[how many tickets]) as [ticket income]
      
FROM    venues, spectators, players, matches, enrollment, participation 

WHERE  matches.[match id]=enrollment.[match id] 
     AND  matches.[match id]=participation.[match id] 
     AND  players.username=participation.[username] 
     AND  spectators.[customer id]=enrollment.university 
     AND  venues.location=matches.venue

GROUP BY  
                   participation.[match id], 
                   participation.username, 
                   players.[player rate per match], 
                   venues.location, 
                   venues.[leasing rate], 
                   matches.[date/round];
It relies on the following facts:
  1. Players charge the same regardless of game.
  2. Leasing rate is also not depending on the games.
  3. One game is held in one location only (no broadcasting, teleconferencing!)
Run this query. It gives us very useful information.

Using that we can calculate anything (if we treat this query as a nested named query).

Average profit per match is this:

select helper.game, sum(helper.[player rate per match]) as salaries, helper.[leasing rate], 
helper.[ticket income]

from (

        SELECT 
               participation.username,
               participation.[match id] as game, 
               players.[player rate per match], 
               venues.location, 
               venues.[leasing rate],
               matches.[date/round],
               sum([ticket price]*[how many tickets]) as [ticket income]
      
        FROM   venues, spectators, players, matches, enrollment, participation 

        WHERE  matches.[match id]=enrollment.[match id] 
          AND  matches.[match id]=participation.[match id] 
          AND  players.username=participation.[username] 
          AND  spectators.[customer id]=enrollment.university 
          AND  venues.location=matches.venue

        GROUP BY  
                  
                participation.[match id], 
                participation.username, 
                players.[player rate per match], 
                venues.location, 
                venues.[leasing rate], 
                matches.[date/round]

) as helper 

group by helper.game, helper.[leasing rate], helper.[ticket income]
Truly, the rest should be absolutely easy.

To prove it consider this development:

select 
   helper.game, 
   helper.[date/round], 
   - sum(helper.[player rate per match]) - helper.[leasing rate] + helper.[ticket income] as [net income] 
from (
  SELECT 
    participation.username,
    participation.[match id] as game, 
    players.[player rate per match], 
    venues.location, 
    venues.[leasing rate],
    matches.[date/round],
    sum([ticket price]*[how many tickets]) as [ticket income]
      
  FROM
    venues, spectators, players, matches, enrollment, participation 

  WHERE  matches.[match id]=enrollment.[match id] 
    AND  matches.[match id]=participation.[match id] 
    AND  players.username=participation.[username] 
    AND  spectators.[customer id]=enrollment.university 
    AND  venues.location=matches.venue

  GROUP BY  
    participation.[match id], 
    participation.username, 
    players.[player rate per match], 
    venues.location, 
    venues.[leasing rate], 
    matches.[date/round]
) as helper 
group by helper.game, 
      helper.[leasing rate], 
      helper.[ticket income], 
      helper.[date/round]
If you call this query something you reduce the problems to the practical exam.

26-27. List the matches and rounds with an above average profit (two queries).

28. How many students saw Bird playing?

(How many tickets were sold for games he was in?)

Start with this:
select username, [how many tickets], participation.[match id], university
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] =
enrollment.[match id]
It gives us the number of tickets (and customer that purchased them) per match and player.

Project away the [match id] and the university columns.

select username, [how many tickets]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = enrollment.[match id]
Group by username in the resulting view (and sum the number of tickets).

select username, sum([how many tickets]) as [total audience]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = 
enrollment.[match id]
group by username
Then only show the group that has (corresponds to) username lbird.
select username, sum([how many tickets]) as [total audience]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = 
enrollment.[match id]
group by username
having username='lbird'

29. Who was the most watched player?

That's immediate from the query above.

SELECT username, sum( [how many tickets] ) as [total audience]
FROM participation, matches, enrollment
WHERE participation.[match id]=matches.[match id] And matches.[match id]=enrollment.[match id]
GROUP BY username 
ORDER BY sum( [how many tickets]) DESC; 

30. How much money did each player bring?

Sort the result of this query descending by amount and ascending by player's name (alphabetically, that is).

The idea here is to
  1. calculate the revenue per player, customer and match, since ticket prices vary.
  2. Then once we have that we can sum the revenue across customers, grouped by [match id] and player.
  3. This would give us a total amount twice bigger than the actual amount of money received on tickets.
  4. (That's because a customer purchasing a ticket can see both players, so the revenue is divided by two.)

So we start with the most basic fact:
select username, matches.[match id], [ticket price], [how many tickets]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = enrollment.[match id]
One change produces the actual amount of money:
select username, matches.[match id], [ticket price] *  [how many tickets] as revenue 
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = enrollment.[match id]
Now we can eliminate the [match id] and sum over all records, grouped by username.
select username, matches.[match id], sum( [ticket price] *  [how many tickets]) as [total revenue]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = enrollment.[match id]
group by username, matches.[match id]
So now we eliminate the [match id]
select username, sum( [ticket price] *  [how many tickets]) as [total revenue]
from participation, matches, enrollment
where participation.[match id] = matches.[match id] and matches.[match id] = enrollment.[match id]
group by username
order by sum( [ticket price] * [how many tickets] ) desc 
(But we don't forget that the amounts we obtain need to be divided by 2).

31. List the standings after the fourth round.

select username, sum(points)
from participation
group by username 
order by sum(points) desc
But also consider this:
select username, sum(points) 
from (
  select username, points, matches.[match id], [date/round]
  from matches, participation  
  where matches.[match id] = participation.[match id]
) 
where [date/round] in ('round one', 'round two', 'round three', 'round four') 
group by username 
order by sum(points) desc

32. Which matches only had IN students in the audience? (Purdue is in Indiana as is IU).

select distinct [match id]
from enrollment e 
where 'in' = ALL (select distinct state from spectators, enrollment 
                   where spectators.[customer id] = enrollment.university 
                     and enrollment.[match id] = e.[match id])

33. Which University did not purchase tickets to any game?

select [customer id]
from spectators s
where not exists (select university from enrollment e where e.university = s.[customer id])

34. Which University did not purchase tickets to the Bird-Jordan game?

select [customer id]
from spectators s
where not exists (select university 
                  from enrollment e 
                  where e.[match id] = 'bird-jordan' 
                    and e.university = s.[customer id])

35. List the Universities sorted by the number of games they purchased tickets to.

select count([match id]), university
from enrollment
group by university