CS 33 (Spring 2011): Homework 1: Relational Algebra and SQL

Deadline: 10:00pm sharp on April 4, 2010

Please read through this page entirely and carefully, before beginning your work. All chapter and problem numbers refer to the Sixth Edition of the textbook.

Part 1: Using relational algebra

  1. Consider the following simple database of information about certain employees and companies for which they work.

    employee(ename, street, city) — records residential addresses
    company(cname, city) — records business location
    works(ename, cname, salary) — records employment and salary information

    Write relational algebra expressions that generate the desired output in each of the following cases. For parts (a) and (b), use only the fundamental relational algebra operations. Elsewhere, you may use any of the relational algebra operations described in Chapters 2 and 6.

    1. Output the names of all employees who work for the company "Oracle".

    2. Output the names and cities of residence of all employees who work for "Oracle".

    3. Output the names, street addresses and cities of residence of all employees who work for "Oracle" and earn more than $50,000.

    4. Output the names of all employees who live in the same city as the company for which they work.

    [3 + 3 + 3 + 3 points]

  2. Solve problems 6.4 (a) and (b) from the textbook.
    [3 + 5 points]

  3. Solve problem 6.12 (a) and (b) from the textbook.
    [3 + 5 points]

Part 2: Writing basic SQL queries

  1. Solve Problem 3.11 (a), (b) and (d) from the textbook. For part (b), you may assume that there are only two semesters each year — "Spring" and "Fall" — and that Spring is before Fall.
    [3 + 3 + 4 points]

  2. Solve all parts of Problem 3.15 from the textbook.
    [4 + 4 + 4 points]

  3. Referring to the same schema as in the above problem, write an SQL query to list all customers who have at most one account at the "Perryridge" branch.
    [5 points]

Part 3: Writing SQL scripts

  1. The file ~cs33/data/baseball_salaries_2003.txt contains salary information for certain professional baseball players from the year 2003. It should be self-explanatory. Write an SQL script that processes this file to determine, for each position, the average salary of the players in that position. Note that the seven positions represented in the input file are "Catcher", "First Baseman", "Outfielder", "Pitcher", "Second Baseman", "Shortstop" and "Third Baseman".

    Your script should create an appropriate table in your database and populate it using the data in the input file. It should then execute a single SQL query whose output has the schema (position, avg_sal). The output should appear sorted in descending order of average salary.

    [10 points]

  2. The file ~cs33/data/baseball_salaries_2005.txt is very similar to the above and contains data for 2005, but for a different set of teams. However, there are a number of players who appear in both data sets. Write an SQL script that determines the set of players that appear in both files, along with their 2003 and 2005 salaries, and the resulting percentage change in their salary.

    Your script should create two appropriate tables in your database, populate them from the input files, and then execute a single SQL query whose output has the schema (player_name, old_salary, new_salary, pct_delta). The output should compute the percentage change correctly and show the results sorted in ascending order of this change.

    [10 points]

What to turn in