CS 33 (Spring 2011): Homework 2

CS 33 (Spring 2011): Homework 2: Advanced SQL and ODBC

Deadline: 10:00pm sharp on April 11, 2011

Please read through this page entirely and carefully, before beginning your work.

Part 1: Further SQL Problems

Refer to the bank database schema given in Figure 3.19 (page 107) in the textbook. Write a sequence of SQL data modification statements that achieve effect of closing the "North Town" branch of this bank and merging all its assets and liabilities into the "Perryridge" branch. You will need to update multiple tables. Your solution (a) should not assume anything about what cities "North Town" and "Perryridge" are located in, and (b) should not use the MySQL feature of user-defined variables.
[10 points]

Solve Problem 3.16 (d) and (e) from the textbook. For part (e), note that the size of the payroll of a company is defined to be the total salary it pays out.
[6 + 6 points]

Solve Problem 3.24 from the textbook.
[8 points]

Part 2: Connecting to a Database: ODBC and Equivalents

Consider a MySQL-based database that stores information about employees in an organisation and their reporting chain. Write code (in a general-purpose language) that connects to this database, gathers this information into a suitable tree-like internal data structure, and then writes some portion of (or all of) this information onto standard output, using indentation and layout to represent the reporting chain.

Specifically, your program should take three or four command line arguments, operating as follows. Say your prorgam is named "foo" and is invoked as follows (square brackets denote an optional argument):

$ foo uname pwd dbname [empname]

It should then connect, using username "uname" and password "pwd" to a database named "dbname" on the machine "sunapee.cs.dartmouth.edu", and gather information from the relation employee (name, role, manager). Then, it should print several lines to standard output, where each line gives the name of an employee, their role, and the total number of subordinates (direct or indirect) under them, in the format "%s: %s: %d subordinates" (in printf notation). If this employee has no subordinates, then the third field should be omitted. The subordinates of an employee should appear beneath that employee's name, indented using two leading spaces. This rule should be applied recursively: thus, second-level subordinates should appear below their respective immediate managers and should be indented using four leading spaces, etc. Further, all employees that are immediately under a particular manager should be listed in ascending alphabetical order of their names. Finally, the output should begin with the employee whose name is "empname". This fourth argument is optional; if it is missing, then the output should begin with the top-level employee, indicated by a NULL value in the manager field. In the latter case, if there is more than one top-level employee, then you should print all of them (together with their subordinates, as described above) in ascending alphabetical order of their names.
There is a detailed example in the file ~cs33/data/hw3_empdata.txt, which I urge you to read and understand, very carefully. You must follow the above explicit output formatting instructions as well as the implicit ones in that example.

[30 points]

Examine the following three files, which contain data about a set of imaginary actors and a set of imaginary films (movies) along with information about which actors appear in which films:
~cs33/data/actor.txt, ~cs33/data/film.txt, ~cs33/data/actor_film.txt

The comment lines on top of each file explain the contents adequately. Assume that the data from these have been loaded into the following tables in a MySQL database on the machine "sunapee.cs.dartmouth.edu":

actor (actor_ID, first_name, last_name)
film (film_ID, film_name, description, year)
actor_film (actor_ID, film_ID)

We would like to be able to merge two of the actors in this database into one: imagine that we have discovered, e.g., that "Matthew Johansson" is really a pseudonym of "Greg Chaplin". However, we should not carry out this merge operation if it turns out that both these actors have appeared in a common film: perhaps we trust our database enough that if this happens, we're sure they're not the same person.

Write code (in a general-purpose language) to perform the necessary updates on the database. Specifically, your program should read seven command line arguments as follows:

$ foo uname pwd dbname first1 last1 first2 last2

Here "foo" is the name of your program, "uname" and "pwd" are a username/password combination used to connect to the database "dbname", and "first1", "last1", "first2" and "last2" are, respectively, the first and last names of actor1 and the first and last names of actor2. Your program should then connect to "dbname", check if the aforementioned common film condition applies and then behave as follows. If the condition applies, it should print (on standard output) a complete list of film names in which both of the given actors appear, in increasing order of film ID. Else (the condition does not apply), it should print nothing and effectively rename actor1 to actor2 in the database, updating the tables actor and actor_film as necessary. If any errors occur (e.g., inability to connect to the database, or update it, or a non-existent actor name), then you should print an appropriate informative error message to standard output and not make any changes to the database.

[20 points]

What to turn in

Prepare one SQL script file containing solutions to Problems 1 through 3, in order, with comments interspersed indicating which problem you are solving. Use good indentation/layout so that your queries are easily readable. Follow the examples in the textbook.

Prepare one file with the source code for Problem 4, and another with the source code for Problem 5. Put comments at the top of the source code explaining how we should compile and execute it on "sunapee.cs.dartmouth.edu". If your code is in an interpreted language (such as Perl, Python, or Ruby), then compilation is not needed, but you should still explain how to execute it. If your code fails to compile or execute, you will not receive any credit, so please check that it does before submitting!

Combine the above three files into a single zip (or tar/gzip) archive, and submit it using the homework submission form.

Important Notes

Each of your queries must work correctly under the current version of MySQL on the machine "sunapee". Thus, you will be unable to use certain SQL features discussed in the textbook.

For grading purposes, we will change the data and run your code on various other examples. Do not rely on the exact data currently in "bankdb" or in the files in ~cs33/data/. You should test your queries thoroughly against suitably modified databases yourself, before submitting.