CS 33 (Spring 2010): Course Project

Deadline for Stage 1 (High-Level Design): 10:00pm sharp on May 3, 2010

Deadline for Stage 2 (Schema Design): 10:00pm sharp on May 10, 2010

Deadline for Stage 3 (SQL-Mania): 10:00pm sharp on May 17, 2010

Deadline for Final Report: 10:00pm sharp on Jun 2, 2010

[This page will be updated from time to time]

If you have made other arrangements for your course project with the instructor by April 26, amend the text below as appropriate.

Your course project will consist of conceiving, designing and implementing a moderate-sized database application for an enterprise of your choice If you choose a bank or a university, you will have to go significantly beyond (or depart significantly from) the design outlined in the textbook's examples. You will not need to implement an actual DBMS: you should use MySQL instead. You will need to implement an elegant and intuitive user-interface, which will almost surely be Web-based. At this point, you have already used a connector to work with a MySQL database while coding in a general-purpose language. The textbook has some basic coverage of the other relevant technologies that you might need: CGI, PHP and XML. We shall also discuss these briefly in class, at the appropriate time.

The stages of this project will be as follows:

High-Level Design: Identify the entities, attributes and relationships that your database will represent. Prepare a detailed E-R diagram showing this information. Also, at this stage, identify the potential users of the system, and what kinds of authorizations they should have to access or modify the data in your database.

Schema Design: Translate your E-R diagram into a schema for your relational database. Pay heed to normalization and be prepared to reason cogently about (1) the level of normalization of your database, and (2) any purposeful deviations from normal forms and your justification for such deviations. Also, at this stage, make a list of views that you might wish to create, based on the varying needs or different types of users.

SQL-Mania: Prepare a large collection of SQL statements for a number of natural queries and database updates that each of your user types might want to perform. Populate your database with a large amount of data and test your SQL statements. At this stage, you should spend some time designing intelligent ways to generate test data. Random is usually good, but purely random data may hide problematic performance-busting corner cases, so you will have to be cleverer than that.

The Application Layer: Provide a suitable intuitive interface (ideally, web-based) for each category of user that will be using your database. When appropriate, you may combine interfaces for multiple user types (as in, e.g., Dartmouth's Banner system). You should also provide an interface for appropriately authorized users to modify data (depending on your application, a web interface might not be the best for this). At this stage, you need not bother about users lying to your system about their identity.

Authentication: Finally, implement a web-based authentication system in preparation for a mock deployment of your entire database application. In other words, ensure that your users are not lying about their identities.

The choice of what enterprise to build an application for is up to you. Here are some suggestions to get you started. There is no obligation to pick from one of these, and in fact, you should not pick the hospital system, because one group had worked on this topic in Spring 2009.

Academics at a university. Your system will manage information about students, professors and staff at the university, plus courses, grades, prerequisites, majors, etc. This is only a starter list. Flesh this out as appropriate.
An airline. Manage information about a fleet of aircraft, flights, fuel purchases, customers, passengers, frequent flier miles, airports, etc. Again, this is only a starter list and should be fleshed out.
A hospital. You might want to interface with a toy health-insurance system to make this interesting and complex. There are interesting authorization issues that arise here, and if you decide to take up this topic, and you want to provide some realism to your project, you ought to talk to people at DHMC to understand these issues.

What to turn in (Stage 1)

By 5:00pm on Thu Apr 29, notify the professor and the TA about your chosen project topic. A simple email will suffice. No two teams can pick the exact same topic, so if you know early what you want to do, then notify us early and "reserve" the topic!

Also, in this email, designate one person your team as the "submission manager". All submissions related to the project will be handled by this person on behalf of their team.

By the Stage 1 deadline, submit an electronic version of (1) your E-R diagram and (2) a brief write-up giving information about potential users of your system.

During the next class, each team will present their high-level design to the entire class, using suitable presentation software. The length of the presentation should be 5-6 minutes.

What to turn in (Stage 2)

Describe the schema for your database in a suitably formatted document. I recommend using LaTeX to create a PDF (as in some of the homeworks).

Write an SQL script to create the tables in your database, declaring appropriate types for each attribute, and including the appropriate basic integrity constraints, such as primary key, foreign key, not null, etc. Your script should also create views that particular users will have access to. (It's likely that you will have to add some more views at a later stage of the project, but you should think through the design deeply so you already have close to a complete set of views.)

Run your SQL script within your team's database (and optionally populate your tables with some sample data). For instance, if you are the Red Team, your database is called red_db. Each team member has all privileges on their team's database, including a grant option.

Submit the two documents (the schema description, and the SQL script) as "Homework 5", using the homework submission form. Only the submission manager of each team needs to submit.

What to turn in (Stage 3)

Please put your queries in a ".sql" file annotated with plenty of comments explaining what each query is doing is what type of user would be executing these queries.

Populate the database (on sunapee) assigned to your team. You don't have to submit the data in any other form; we can look at your tables directly. But if you made changes to your schema from Stage 2, please submit an updated schema. Also, you may want to add comments after your schema explaining how you generated/obtained your data.

At least a couple of your tables should be very large; for most teams, this would mean something like 10,000 rows each. The data in your tables could be artificial, randomly generated, or scraped off the Internet.

Submit two documents (the SQL file with the queries, and an updated annotated schema) as "Homework 6", using the homework submission form. As before, only the submission manager of each team should submit.

What to turn in (Final Report)

Prepare a final project report, consolidating the information from Stages 1, 2, and 3 (you can freely reuse text/drawings from those three reports), and adding a summary of the work you did in Stages 4 and 5.

In your report, explain at a high level how your system is to be used, by the various intended users.

In your report, include a write-up of the most interesting/challenging problems you needed to solve in the course of your project.

Collect your report, your final presentation, and all of your code (and any relevant data) into a single file (use tar/gzip, or zip). Use an intuitive directory/folder tree, so that it's easy to navigate! Please include a suitable README file at the top level. Submit this file (exactly one file) as "Homework 8", using the homework submission form.