CS 33 (Spring 2011): Homework 1

CS 33 (Spring 2011): Homework 1: Relational Algebra and SQL

Deadline: 10:00pm sharp on April 4, 2010

Please read through this page entirely and carefully, before beginning your work. All chapter and problem numbers refer to the Sixth Edition of the textbook.

Part 1: Using relational algebra

Consider the following simple database of information about certain employees and companies for which they work.

employee(ename, street, city)	— records residential addresses
company(cname, city)	— records business location
works(ename, cname, salary)	— records employment and salary information

Write relational algebra expressions that generate the desired output in each of the following cases. For parts (a) and (b), use only the fundamental relational algebra operations. Elsewhere, you may use any of the relational algebra operations described in Chapters 2 and 6.

Output the names of all employees who work for the company "Oracle".

Output the names and cities of residence of all employees who work for "Oracle".

Output the names, street addresses and cities of residence of all employees who work for "Oracle" and earn more than $50,000.

Output the names of all employees who live in the same city as the company for which they work.

[3 + 3 + 3 + 3 points]

Solve problems 6.4 (a) and (b) from the textbook.
[3 + 5 points]

Solve problem 6.12 (a) and (b) from the textbook.
[3 + 5 points]

Part 2: Writing basic SQL queries

Solve Problem 3.11 (a), (b) and (d) from the textbook. For part (b), you may assume that there are only two semesters each year — "Spring" and "Fall" — and that Spring is before Fall.
[3 + 3 + 4 points]

Solve all parts of Problem 3.15 from the textbook.
[4 + 4 + 4 points]

Referring to the same schema as in the above problem, write an SQL query to list all customers who have at most one account at the "Perryridge" branch.
[5 points]

Part 3: Writing SQL scripts

The file ~cs33/data/baseball_salaries_2003.txt contains salary information for certain professional baseball players from the year 2003. It should be self-explanatory. Write an SQL script that processes this file to determine, for each position, the average salary of the players in that position. Note that the seven positions represented in the input file are "Catcher", "First Baseman", "Outfielder", "Pitcher", "Second Baseman", "Shortstop" and "Third Baseman".

Your script should create an appropriate table in your database and populate it using the data in the input file. It should then execute a single SQL query whose output has the schema (position, avg_sal). The output should appear sorted in descending order of average salary.

[10 points]

The file ~cs33/data/baseball_salaries_2005.txt is very similar to the above and contains data for 2005, but for a different set of teams. However, there are a number of players who appear in both data sets. Write an SQL script that determines the set of players that appear in both files, along with their 2003 and 2005 salaries, and the resulting percentage change in their salary.

Your script should create two appropriate tables in your database, populate them from the input files, and then execute a single SQL query whose output has the schema (player_name, old_salary, new_salary, pct_delta). The output should compute the percentage change correctly and show the results sorted in ascending order of this change.

[10 points]

What to turn in

Prepare a single pdf file with your answers to all the problems in Part 1. One way to prepare the pdf file is to use LaTeX. If you are unfamiliar with LaTeX, you may copy the file ~cs33/data/hw1_example.tex and use it as a starting point: just edit it suitably. That file contains instructions to compile the ".tex" input file to produce a ".pdf" file: see the comments near the top. Though it takes some work to learn LaTeX, it is well worth it, and some past students have considered it one of the best investments they made as a CS major.

Prepare a single SQL file (extension .sql) that contains solutions to all the problems in Part 2. Include comments to tell us which command(s) solve which problems.

Prepare two more SQL script files, one for each of Problems 7 and 8. These scripts should be reading the input files from the ~cs33/data directory. We will change the data in the input files a few times as we test your submissions, so make sure you handle all cases. Here is how we shall run your scripts. If you prepare a script called hw1p7.sql for Problem 7, then we will run the following command, on the machine "sunapee":
```
      mysql -t -u uname -p dbname < hw1p7.sql 
```
We urge you to test your script using a similar command, many times, before you submit it.

You now have one .pdf file and three .sql files. Place all of these inside one zipped archive called, e.g., hw1.tgz or hw1.tar.gz or hw1.zip. Then, submit only this single zip file, using the homework submission form on the course website.