Transactions and Concurrency

Finish up the TRANSACTION LOG examples.

First, another assignment operator

From the MySQL Documentation:

:= Causes the user variable on the left hand side of the operator to take on the value to its right. The value on the right hand side may be a literal value, another variable storing a value, or any legal expression that yields a scalar value, including the result of a query (provided that this value is a scalar value). … Unlike =, the := operator is never interpreted as a comparison operator. This means you can use := in any valid SQL statement (not just in SET statements) to assign a value to a variable.

mysql> SELECT @var1 := 1, @var2;
        -> 1, NULL
mysql> SELECT @var1, @var2;
        -> 1, NULL
mysql> SELECT @var1, @var2 := @var1;
        -> 1, 1
mysql> SELECT @var1, @var2;
        -> 1, 1

SLIDE 11-1

ACID+S

The A, C, and D principles apply to all databases, while the I and the S apply only to multi-user databases (OK, or really busy single-user DBs).

What is serializability?

  • Basic Assumption – Each transaction preserves database consistency.
  • Thus, serial execution of a set of transactions preserves database consistency.
  • A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule.
  • Different forms of schedule equivalence give rise to the notions of:

    1. conflict serializability
    2. view serializability
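
As a quick illustration (a sketch, not from the slides): suppose T1 transfers 50 from account A to account B while T2 just reads both accounts and prints their sum. One interleaved schedule is:

T1                    T2
read(A)
A = A - 50
write(A)
                      read(A)
read(B)
B = B + 50
write(B)
                      read(B)
                      print(A+B)

T2 sees A after the debit and B after the credit, so the result is the same as running T1 to completion and then T2; the schedule is serializable. If T2's read(B) happened before T1's write(B), T2 would print a total that is 50 short, and no serial ordering of T1 and T2 could produce that.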

SLIDE 11-2

Take a look at the Transaction log and see how it really works.

Concurrency control is all about making each user/client feel like their transaction(s) are running alone on the system.

Transactions can get in each other’s way, often without either knowing it!

Review the ways this can happen.

SLIDE 11-3..11-9

It’s quite a challenge to come up with a Scheduler that can determine the best way to interleave transactions without allowing these interferences.

The Scheduler determines the order in which the operations of the transactions are executed, aiming to maintain serializability while supporting concurrency.

The scheduler also tries to optimize the DB’s use of the host system services.

Example (based on MySQLTutorial’s examples)

START TRANSACTION;
-- get next order number
SELECT @ordernum:=MAX(orderNumber)+1 FROM orders;

-- insert a new order
INSERT INTO orders (orderNumber, orderDate, customerNumber)
  VALUES(@ordernum, '2022-11-01', 191);

-- insert the line items for the order
INSERT INTO orderdetails(orderNumber, productCode, quantityOrdered, priceEach, orderLineNumber)
  VALUES(@ordernum, 'S19-2021', 10, '13', 1),
        (@ordernum, 'Q42-77Z', 8, '9', 2);

COMMIT;

-- Display the latest order
SELECT * FROM orders a INNER JOIN orderdetails b USING (orderNumber) WHERE a.orderNumber=@ordernum;

Let’s see how a ROLLBACK would work

START TRANSACTION;

-- count the number of orders
SELECT COUNT(*) FROM orders;

-- DELETE all the orders
DELETE FROM orders;

Then, from another session:

mysql> USE test;
mysql> SELECT COUNT(*) FROM orders;
+----------+
| count(*) |
+----------+
|        7 |
+----------+
1 row in set (0.05 sec)
mysql>

You will see the number of entries in the orders table has not changed because the first session did not do a COMMIT. Returning to the first session:

ROLLBACK;
-- and now we see all the orders are back
SELECT COUNT(*) FROM orders;

and now all the orders are back.
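
This works because START TRANSACTION suspends MySQL's default autocommit mode until the COMMIT or ROLLBACK. A quick way to see and control that directly (a sketch; @@autocommit and SET autocommit are standard MySQL, the statements in between are up to you):

-- autocommit is ON (1) by default, so each statement commits on its own
SELECT @@autocommit;
-- turning it off for the session has the same effect as an open transaction
SET autocommit = 0;
-- ... statements issued here stay pending ...
ROLLBACK;          -- or COMMIT
SET autocommit = 1;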

Concurrency using Locks

A lock guarantees exclusive use of the things it locks up. Things like:

  • a section of memory, a network adapter, the CPU cache, etc.

The size of the thing a lock can protect determines the granularity of that lock.

Locking can take place at the database, table, page, row, or even field level, depending upon the DB being used.

SLIDES 10-12..10-15

Field-level locks are the lowest you can go … concurrent transactions can access the same row IFF they don’t access the same attribute (column). If you think of the DB table as a spreadsheet, this is individual cell protection.

Smaller lock granularity improves the concurrency of the database by reducing contention for database objects. However, smaller lock granularity also means that more locks must be maintained and managed by the DBMS (rather than, say, page locks handled by the operating system), requiring more processing overhead and system resources for lock management.

Concurrency demands and system resource usage must be balanced to ensure the best overall transaction performance. In some circumstances, page-level locks, which require fewer system resources, may produce better overall performance than field-level locks, which require more system resources.

Most DB implementations stop at row-level access.
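
MySQL's InnoDB engine, for instance, stops at the row level: a locking read takes locks only on the rows it touches. A minimal sketch (assuming InnoDB and the sample orders table; the order number is made up):

START TRANSACTION;
-- exclusively lock just this one row; other sessions can still read
-- and update the rest of the orders table
SELECT * FROM orders WHERE orderNumber = 10100 FOR UPDATE;
-- ... change that order ...
COMMIT;   -- the row lock is released here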

MySQL Locks

Explicit locking with LOCK TABLES goes only to the table level.

  • A READ lock on a TABLE may be held by multiple sessions at once, and other sessions can read the tables without a lock. READ locks do not allow WRITEs. (no surprise).
  • A WRITE lock on a TABLE may be held only by one session at a time, and only that session may READ from or WRITE to that TABLE.

You may have LOCKs of either type on multiple tables at once. For example, you could have a READ lock on some data table and a WRITE lock on some results table.

The UNLOCK TABLES instruction will release the TABLE lock. NOTE: It will release ALL the LOCKS the session has. It cannot release the LOCK on one specific TABLE.
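
A sketch of both points (the results table here is hypothetical):

-- one statement can lock several tables, each in its own mode
LOCK TABLES orders READ, results WRITE;
-- ... read from orders, write to results ...
UNLOCK TABLES;   -- releases BOTH locks; you cannot unlock just one of them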

If the program crashes or the connection to the server is severed, all LOCKS will be automatically UNLOCKed.

MySQL Locks demo

Continuing with the orders and orderdetails

-- -- -- Session A -- -- --
-- what's our connection number
SELECT CONNECTION_ID();
-- get next order number
SELECT @ordernum:=MAX(orderNumber)+1 FROM orders;
-- add another order
INSERT INTO orders (orderNumber, orderDate, customerNumber)
  VALUES(@ordernum, '2022-11-03', 292);
SELECT COUNT(*) FROM orders;
-- now add a READ lock
LOCK TABLE orders READ;
-- and try to write
-- get next order number
SELECT @ordernum:=MAX(orderNumber)+1 FROM orders;
-- add another order
INSERT INTO orders (orderNumber, orderDate, customerNumber)
  VALUES(@ordernum, '2022-11-04', 298);
-- ERROR

Now we switch to another session

-- -- -- Session B -- -- --
-- what's our connection number
SELECT CONNECTION_ID();
-- get next order number
SELECT @ordernum:=MAX(orderNumber)+1 FROM orders;
-- try to create another order
INSERT INTO orders (orderNumber, orderDate, customerNumber)
  VALUES(@ordernum, '2022-11-04', 400);
-- and we are suspended, waiting for the READ lock to be released

Let’s return to the first session and release the lock.

-- -- -- SESSION A -- -- --
-- Let's see what's happening
SHOW PROCESSLIST;
-- and we see the other session is waiting
-- Now count the orders and then release the lock
SELECT COUNT(*) FROM orders;
UNLOCK TABLES;
-- Let's see what's happening NOW
SHOW PROCESSLIST;
-- all's quiet, the other session should proceed
SELECT COUNT(*) FROM orders;

And that’s READ locks.

WRITE locks work as you might expect, except they prevent READs or WRITEs on the table until the LOCK is released.
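
A sketch of the WRITE-lock version (same tables; the date and customer number are made up):

-- -- -- Session A -- -- --
LOCK TABLES orders WRITE;
-- the lock holder can still read and write the table
SELECT @ordernum:=MAX(orderNumber)+1 FROM orders;
INSERT INTO orders (orderNumber, orderDate, customerNumber)
  VALUES(@ordernum, '2022-11-05', 121);

-- -- -- Session B -- -- --
-- even a plain SELECT now blocks until Session A releases the lock
SELECT COUNT(*) FROM orders;

-- -- -- Session A -- -- --
UNLOCK TABLES;
-- Session B's SELECT completes as soon as the lock is released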


Kinds of locks

You need to be able to TEST the lock, SET the lock, and OPEN the lock.

SLIDE 10-16

Binary Lock Example

  • Exclusive locks are just that - exclusive.

    These exist when access to a data item is specifically reserved for the transaction that locked the object. The exclusive lock must be used when a potential for conflict exists, e.g., when one or more transactions must update (WRITE) a data item. Therefore, an exclusive lock is issued only when a transaction must WRITE (update) a data item and no locks are currently held on that data item by any other transaction.

  • Shared locks are used when concurrent transactions are each granted READ access and no exclusive lock exists on the resource.

Locks can operate in two modes:

  • exclusive (X) mode: the protected item can be both read and written; requested with lock_x.
  • shared (S) mode: the protected item can only be read; requested with lock_s.

Either way, locks are requested from and managed by the DBMS.

When requesting a lock, the DBMS either grants the request or not based on what other locks already exist on the item. If it cannot grant the request, the requesting client is blocked on the lock. This is an important point.

Silberschatz summarizes lock compatibility (can a requested lock be granted, given a lock already held on the item?) in a matrix:

                   requested lock_s    requested lock_x
  held lock_s      OK                  NOT OK
  held lock_x      NOT OK              NOT OK

There can be zero or more locks on a given item, held by 1 or more clients.

A lock request may be granted if the request is compatible with other locks already on the item.

A simple example (also from Silberschatz):

T1
  lock_s(A)
  read(A)
  unlock(A)
  lock_s(B)
  read(B)
  unlock(B)
  print(A+B)

This seems good, but does it enforce serializability? No: because T1 unlocks A before it locks B, another transaction can update A and/or B between T1’s two reads, so the printed value of A+B may not match any serial execution.
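
For instance (a sketch in the same notation; this T2 is the usual textbook companion to T1, not something we ran): a T2 that moves 50 from B to A can slip in between T1's two reads:

T1                    T2
lock_s(A)
read(A)
unlock(A)
                      lock_x(B)
                      read(B)
                      B = B - 50
                      write(B)
                      unlock(B)
                      lock_x(A)
                      read(A)
                      A = A + 50
                      write(A)
                      unlock(A)
lock_s(B)
read(B)
unlock(B)
print(A+B)

T1 read A before the transfer and B after it, so the printed total is 50 too low, a result no serial order of T1 and T2 could produce. This is exactly the kind of interference two-phase locking (next) rules out.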

Two-Phase locking (2PL)

This is an orderly way of acquiring and freeing locks.

SLIDE 10-16

Two phases

  • growing phase
    • a transaction acquires all the required locks without unlocking any data.
    • once all locks have been acquired, the transaction is in the locked state.
  • shrinking phase
    • a transaction releases its locks and cannot obtain any new ones.

In addition, the 2PL protocol has three rules:

  1. Two transactions cannot have conflicting locks.
  2. No unlock operation can precede a lock operation in the same transaction.
  3. No data are affected until all locks are obtained - until the transaction is in the locked state.
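
For example, the earlier T1 becomes 2PL-compliant just by holding both locks until after both reads (a sketch in the same notation):

T1 (two-phase version)
  lock_s(A)
  read(A)
  lock_s(B)     -- growing phase ends here: all locks held
  read(B)
  unlock(A)     -- shrinking phase: no new locks from this point on
  unlock(B)
  print(A+B)

Now the interfering T2 shown above cannot obtain its exclusive locks between T1's two reads, so T1 always prints a consistent total.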

Deadlocks

Now, just because you use locks doesn’t mean the resulting schedule is serializable … and locks can sometimes create DEADLOCKS or a deadly embrace.

A deadlock occurs when two transactions are each waiting for the other to unlock data before proceeding:

T1                    T2
lock_x(B)
read(B)
B = B - 1
write(B)
                      lock_s(A)
                      read(A)
                      lock_x(B)     -- T2 now waits for T1 to release B
lock_x(A)             -- T1 now waits for T2 to release A: deadlock

A related challenge is starvation, as in T1 waiting for a lock_x(A) while many lock_s(A) requests are granted to a long line of other transactions.

SLIDE 10-17

We have various techniques for preventing deadlocks.

  • avoidance schemes, in which a transaction acquires every lock it needs before doing any work (as in the 2PL growing phase above), work well.
  • detection schemes that periodically check the DB for deadlocks, abort one of the deadlocked transactions (the victim), and reschedule it.
  • prevention schemes wherein a transaction requesting a new lock is aborted if there is a chance that a deadlock could occur. All its previously obtained locks are freed and the transaction is rescheduled.
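
MySQL's InnoDB engine, for example, takes the detection route: it spots the cycle as soon as it forms, picks one of the transactions as the victim, and rolls it back. A sketch (assuming InnoDB with its default settings; the order numbers are made up, and which session gets the error depends on which transaction InnoDB picks):

-- -- -- Session A -- -- --
START TRANSACTION;
UPDATE orders SET orderDate = '2022-11-05' WHERE orderNumber = 10100;

-- -- -- Session B -- -- --
START TRANSACTION;
UPDATE orders SET orderDate = '2022-11-05' WHERE orderNumber = 10101;
UPDATE orders SET orderDate = '2022-11-06' WHERE orderNumber = 10100;  -- blocks, waiting on A

-- -- -- Session A -- -- --
UPDATE orders SET orderDate = '2022-11-06' WHERE orderNumber = 10101;  -- closes the cycle
-- one of the two sessions immediately gets:
-- ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
-- its transaction is rolled back; the other session's UPDATE then proceeds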

Time Stamping methods of concurrency control

A time-stamping approach uses system-wide, guaranteed-unique time stamps to order transactions. These timestamps have two required properties:

  • unique
  • monotonically increasing

All the DB operations of a transaction must have the same timestamp. The DB executes transactions in strict timestamp order, ensuring serializability.

The disadvantage is that they introduce more overhead in both storage and computation, since each database item must carry two timestamps:

  • the last time it was read
  • the last time it was written
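
To see why both stamps matter, here is the standard timestamp-ordering check (a textbook formulation, not spelled out on the slides): before each operation the scheduler compares the transaction’s timestamp with the item’s two stamps.

  • A transaction asking to read an item that a younger transaction has already written (the item’s write stamp is newer than the transaction’s timestamp) is rolled back and rescheduled.
  • A transaction asking to write an item that a younger transaction has already read or written is likewise rolled back and rescheduled.
  • Otherwise the operation proceeds and the item’s read or write stamp is updated.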

wait/die and wound/wait schemes

SLIDE 10-18

In the wait/die scheme, the older transaction (lower timestamp) waits for the younger one to complete and release its locks.

  1. If the T requesting the lock is older, it will wait for the younger T to complete and release its locks.
  2. If the T requesting the lock is younger, it will ROLLBACK, giving up its locks, and be rescheduled.

In the wound/wait scheme, the older transaction rolls back the younger transaction and reschedules it.

  1. If the T requesting the lock is the older of the two, it will preempt (wound) the younger transaction by rolling it back. The younger T is rescheduled using the same time stamp.
  2. If the T requesting the lock is the younger of the two, it will wait until the other T has completed and its locks are released.
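
A compact summary of the two schemes (older means the lower, earlier timestamp):

Requesting T              wait/die                                 wound/wait
older than lock holder    waits for the younger T                  wounds (rolls back) the younger T
younger than lock holder  dies: rolls back and is rescheduled      waits for the older T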

Other factors affecting recovery

  • Write-ahead logging (used with the deferred-write, or deferred-update, technique).
    • Ensures that transaction logs are always written before the data are updated.
  • Redundant transaction logs.
    • Ensure that a physical disk failure will not impair the DBMS’s ability to recover data.
  • Buffers.
    • Temporary storage areas in primary memory.
  • Checkpoints.
    • Allow the DBMS to write all its updated buffers in memory to disk.

Checkpointing

Periodically a DB will stop processing transactions and write everything that’s pending to the disk. This is checkpointing, and it’s noted in the log.

Thus, after a failure the DB recovers as follows:

  1. Find the previous checkpoint in the log.
  2. Assume everything before that point made it to disk.
  3. Go through the log and redo everything after that checkpoint (for the transactions that committed).

Recovery examples from the text

SLIDE 10-19

  1. Find the last checkpoint (TRL ID 423).
  2. Review the transactions recorded since the second-to-last checkpoint.
  3. T101 started and finished before the checkpoint at 423, so it’s fine.
  4. For each transaction committed after the last checkpoint, the DB will use the transaction log to write its changes to storage, using the “after” values. Example for T106:
    a. Find the associated COMMIT (TRL ID 457).
    b. Use the back-link to find the beginning of the transaction (TRL ID 397).
    c. Work through the steps to recreate the transaction’s effects.
    d. Do this for the remaining transactions committed after the last checkpoint (T155).
  5. Any other transactions that ended with a ROLLBACK, or were still active when the crash occurred, are ignored.