admin

Graduate students, I suggest you get started on your papers.

SQL Code: demo.sql soccer.sql soccer2.sql

SQL History

SQL Overview

SQL Command types

SQL DDL

Instructors setup your database

CREATE DATABASE someones_db;
 GRANT ALL ON someones_db.* TO someones IDENTIFIED BY 'qwerty123';
 GRANT ALL ON someones_db.* TO ccpalmer;
 GRANT ALL ON someones_db.* TO kizito;

That way only you and the instructors can USE your database. For example:

SHOW GRANTS FOR Peter;

Once you have a database, you connect to it.

$ ssh babylon7.thayer.dartmouth.edu
~ $ mysql --user=d29265d --password -h cosc061-ad12-db.c.dartmouth.edu cosc_61
mysql> SHOW TABLES;
+-----------------+
| Tables_in class |
+-----------------+
| college         |
| player          |
| tryout          |
| members         |
| . . .           |
+-----------------+
4 rows in set (0.00 sec)
mysql> ^D
$ 

That wouldn’t work for two OSU’s “Oklahoma State Univ” and “Ohio State Univ”.



SHOW CREATE TABLE college2;

Special Operators

 SELECT pName, HS
  WHERE HS
BETWEEN 299 AND 999;
/* */
 SELECT DISTINCT pName
   FROM player
  WHERE pName BETWEEN 'AAA' AND 'MMM';
/* */
 SELECT DISTINCT pName
  FROM player
 WHERE pName BETWEEN 'MMN' AND 'ZZZ';
/* */
SELECT cName
  FROM college
 WHERE state in ('TX','LA','OK','AR');
SELECT *
  FROM player
 WHERE HS < 1000;
/* */
SELECT *
  FROM player
 WHERE HS < 1000
   AND yCard = "yes";
	 /* */
SELECT *
  FROM player
 WHERE HS < 1000
   AND pName NOT LIKE ('G%');
  /* */
SELECT DISTINCT pName
  FROM player
 WHERE pName NOT BETWEEN 'AAA' and 'MMM';
SELECT DISTINCT pName AS city_player
  FROM player
 WHERE HS > 1000;
 SELECT COUNT(yCard) AS NumYellowCards
   FROM player
  WHERE HS > 999 AND yCard="yes";
 /* */
 SELECT AVG(HS) AS AverageHSSize,
        MAX(HS) AS MaxHS,
        MIN(HS) AS MinHS
   FROM player ;
-- How many goalie tryouts did each college have
 SELECT cName, COUNT(cName)
   FROM tryout
  WHERE pPos = "goalie" GROUP BY cName;
  
-- How many goalie tryouts did each college have 
 SELECT cName, COUNT(cName)
   FROM tryout
  WHERE pPos = "goalie"
  GROUP BY cName
 HAVING COUNT(cName) < 2;

Window Functions

Aggregate (e.g., SUM, AVG) and analytical (e.g., RANK) functions can also be limited to just some of the rows of a table. A Window function in SQL allows us to perform these *calculations* across a specific set of rows related to the current row. They’re called “window” functions because they can limit calculations to be made within the set of data defined by the window function.

Suppose you have a table of real estate sales in a county: county_sales

+---------+-------+------------+
| town    | month | sale_price |
+---------+-------+------------+
| Kenner  |     3 |     150000 |
| LaPlace |     4 |     198000 |
| Kenner  |     4 |     208000 |
| Gretna  |     3 |     195000 |
| Kenner  |     3 |     185000 |
| Spring  |     5 |     145000 |
| LaPlace |     5 |     130000 |
| LaPlace |     4 |     188000 |
| Kenner  |     4 |     188000 |
| Gretna  |     3 |     218500 |
| Kenner  |     3 |     180000 |
+---------+-------+------------+

You can use a window function to calculate the cumulative sum of the monthly home sales and places it in a new column Total Sales by Town :

  SELECT town, month, sale_price, 
         AVG(sale_price) OVER( PARTITION BY town ) 
            AS average_town_sales 
    FROM county_sales
ORDER BY TOWN;

Resulting in:

+---------+-------+------------+--------------------+
| town    | month | sale_price | average_town_sales |
+---------+-------+------------+--------------------+
| Gretna  |     3 |     195000 |        206750.0000 |
| Gretna  |     3 |     218500 |        206750.0000 |
| Kenner  |     3 |     150000 |        182200.0000 |
| Kenner  |     4 |     208000 |        182200.0000 |
| Kenner  |     3 |     185000 |        182200.0000 |
| Kenner  |     4 |     188000 |        182200.0000 |
| Kenner  |     3 |     180000 |        182200.0000 |
| LaPlace |     4 |     198000 |        172000.0000 |
| LaPlace |     5 |     130000 |        172000.0000 |
| LaPlace |     4 |     188000 |        172000.0000 |
| Spring  |     5 |     145000 |        145000.0000 |
+---------+-------+------------+--------------------+

Those zeros after the decimal place are annoying, so let’s fix them:

  SELECT town, month, sale_price, 
         ROUND((AVG(sale_price) OVER( PARTITION BY town )),0) AS total_town_sales 
    FROM county_sales
ORDER BY TOWN;

resulting in

+---------+-------+------------+------------------+
| town    | month | sale_price | total_town_sales |
+---------+-------+------------+------------------+
| Gretna  |     3 |     195000 |           206750 |
| Gretna  |     3 |     218500 |           206750 |
| Kenner  |     3 |     150000 |           182200 |
| Kenner  |     4 |     208000 |           182200 |
| Kenner  |     3 |     185000 |           182200 |
| Kenner  |     4 |     188000 |           182200 |
| Kenner  |     3 |     180000 |           182200 |
| LaPlace |     4 |     198000 |           172000 |
| LaPlace |     5 |     130000 |           172000 |
| LaPlace |     4 |     188000 |           172000 |
| Spring  |     5 |     145000 |           145000 |
+---------+-------+------------+------------------+

Here’s another example of Window fuctions from Julia Evans of WizardZines.com. The LAG( ) function is a window function that allows you to access data from a previous row in a result set from the current row without using a self-join.

mysql> select * from baby_log;

produces

+---------+------+
| event   | hour |
+---------+------+
| feeding |    1 |
| cough   |    1 |
| diaper  |    3 |
| feeding |    4 |
| diaper  |    5 |
| diaper  |    5 |
| feeding |    7 |
| cough   |    7 |
+---------+------+
8 rows in set (0.05 sec)

Then

SELECT event, hour,
       hour - LAG(hour) 
       OVER(
         PARTITION BY event
         ORDER BY hour ASC)
         AS time_since_last
  FROM baby_log
 WHERE event IN ('feeding', 'diaper')
 ORDER BY hour ASC;

produces

+---------+------+-----------------+
| event   | hour | time_since_last |
+---------+------+-----------------+
| feeding |    1 |            NULL |
| diaper  |    3 |            NULL |
| feeding |    4 |               3 |
| diaper  |    5 |               2 |
| diaper  |    5 |               0 |
| feeding |    7 |               3 |
+---------+------+-----------------+
6 rows in set (0.08 sec)

EXISTS operator

This operator is used to test for the existence of one or more records found by a subquery.

If we use the Northwind example database, the following query returns TRUE and lists the suppliers with a product price equal to 22:

mysql> SELECT CompanyName 
         FROM Supplier 
        WHERE EXISTS (SELECT ProductName 
                        FROM Product 
                       WHERE Product.SupplierID = Supplier.supplierID 
                         AND UnitPrice = 22);

+----------------+
| CompanyName    |
+----------------+
| Supplier VHQZD |
+----------------+
1 row in set (0.04 sec)

mysql> SELECT CompanyName, supplierID  FROM Supplier WHERE CompanyName = "Supplier VHQZD";
+----------------+------------+
| CompanyName    | supplierID |
+----------------+------------+
| Supplier VHQZD |          2 |
+----------------+------------+
1 row in set (0.04 sec)

mysql> SELECT ProductName, supplierId, unitPrice  FROM Product WHERE UnitPrice = 22;
+---------------+------------+-----------+
| ProductName   | supplierId | unitPrice |
+---------------+------------+-----------+
| Product KSBRM |          2 |     22.00 |
+---------------+------------+-----------+
1 row in set (0.04 sec)

mysql> EXPLAIN SELECT CompanyName FROM Supplier WHERE EXISTS (SELECT ProductName FROM Product WHERE Product.SupplierID = Supplier.supplierID AND UnitPrice = 22);
+----+--------------+-------------+------------+--------+---------------+---------+---------+------------------------+------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys | key     | key_len | ref                    | rows | filtered | Extra       |
+----+--------------+-------------+------------+--------+---------------+---------+---------+------------------------+------+----------+-------------+
|  1 | SIMPLE       | <subquery2> | NULL       | ALL    | NULL          | NULL    | NULL    | NULL                   | NULL |   100.00 | Using where |
|  1 | SIMPLE       | Supplier    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | <subquery2>.SupplierID |    1 |   100.00 | NULL        |
|  2 | MATERIALIZED | Product     | NULL       | ALL    | supplierId    | NULL    | NULL    | NULL                   |   77 |    10.00 | Using where |
+----+--------------+-------------+------------+--------+---------------+---------+---------+------------------------+------+----------+-------------+
3 rows in set, 2 warnings (0.04 sec)

mysql>

SQL Order of operations

Understanding the order in which the parts of a SQL query are executed by a database is key to writing correct, efficient SQL code. SQL executes the parts of a SQL query in a sequence defined by the SQL standard. There may be some minor variations due to the SQL query optimization that takes place in most SQL databases, but those optimizations are guaranteed to produce the same result as the unoptimized version of the SQL query.

The order of operations for SQL is :

  1. FROM or JOIN
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. DISTINCT
  7. ORDER BY
  8. LIMIT or OFFSET

Here’s an example based on code at the geeksforgeeks website with the order of operations indicated as comments::

SELECT customer_ID, SUM(total_amount) AS "Total"        -- 5th
FROM orders																							-- 1st
WHERE order_date BETWEEN '2022-01-01' AND '2022-03-31'  -- 2nd
AND customer_city = 'New York'                          -- 2nd
GROUP BY customer_id                                    -- 3rd
ORDER BY Total DESC                                     -- 6th
LIMIT 100;                                              -- 7th

Commit

SLIDE 1 Revisit ACID first, then explain.

Most of the time you can assume that MySQL will automatically COMMIT changes due to INSERT, DELETE, UPDATE, CREATE TABLE, etc. commands. Generally don’t worry about it

Later when we’re talking about stored procedures in MySQL, we will revisit COMMIT and its counterpart ROLLBACK.

Changing a table’s data with UPDATE

We need a bigger table to illustrate UPDATE. Suppose we have this relation:

instructorRelation

 /* update one tuple */
 UPDATE instructor
    SET salary = 78000
  WHERE ID = 45565;
 /* update 2 attributes in one tuple  */
 UPDATE instructor
    SET salary = 90000,
    	  dept_name = 'Finance'
  WHERE ID = 45565;
 /* update multiple tuples */
 UPDATE instructor
    SET salary = salary * 1.05
  WHERE dept_name = 'Comp. Sci.';

Changing a table’s DESIGN

ALTER TABLE tablename
{ ADD | MODIFY } ( columnname datatype [ { ADD | MODIFY } columnname datatype] ) ;
ALTER TABLE tablename
  ADD constraint [ ADD constraint ] ;

Primary & Foreign Keys

We know keys have to be non-NULL and UNIQUE. Just add PRIMARY KEY (attr) or FOREIGN KEY (attr) REFERENCES relation(attr) at end of CREATE.

CREATE TABLE Tryout2
  ( pID INT REFERENCES Player(pid) ON DELETE SET NULL, 
	  cName VARCHAR(20) REFERENCES College(cName) ON UPDATE CASCADE, 
	  pPos VARCHAR(8),
    decision VARCHAR(3)
  );