Advanced Data Modeling
admin
The majority of the notes for this class came from the authors of the textbook.
Topics
- How entity clusters are used to represent multiple entities and relationships
- The characteristics of good primary keys and how to select them
- How to use flexible solutions for special data-modeling cases
Entity cluster
SLIDE 1
- Virtual entity type used to represent multiple entities and relationships in ERD
- Avoid the display of attributes to eliminate complications that result when the inheritance rules change
Primary keys
-
Single attribute or a combination of attributes, which uniquely identifies each entity instance
- Guarantees entity integrity
- Works with foreign keys to implement relationships
-
Natural keys
- Real-world identifier used to uniquely identify real-world objects
- Familiar to end users and forms part of their day-to-day business vocabulary
- Also known as natural identifier
- Used as the primary key of the entity being modeled
-
SLIDE 2 - Desirable Primary Key Characteristics
-
Unique
- The PK must uniquely define each entity instance.
- A PK must be able to guarantee unique values.
- It cannot contain nulls
-
Non-Intelligent
- The PK should not have embedded semantic meaning other than to uniquely
identify each entity instance.
- An attribute with embedded semantic meaning is better used as a descriptive characteristic of the entity than as an identifier.
- For example, a student ID of 650973 would be preferred over Smith, Martha L. As a primary key identifier.
- The PK should not have embedded semantic meaning other than to uniquely
identify each entity instance.
-
Doesn’t change over time
- if an attribute has semantic meaning, it might be subject to updates, which is why names do not make good primary keys.
- If Vicki Smith is the primary key, what happens if she changes her name to Jasmine?
- The primary key is subject to change, the foreign key values must be updated, thus adding to the database workload.
- Furthermore, changing a primary key value means that you are changing the identity of an entity
- In short, the primary key should be permanent and unchangeable.
-
Preferably numeric
- Unique values can be better managed when they are numeric, because the database can use internal routines to implement a counter-style attribute that automatically increments values with the addition of each new row.
- Database systems include the ability to use special constructs to obtain unique sequential identifiers.
-
Security and privacy compliant
- The selected primary key must not be composed of any attribute that might be considered a security risk or privacy violation.
- For example, using a Social Security number as a primary key and then EMPLOYEE table is not a good idea.
-
-
Using composite (multi-attribute) primary keys
- Identifiers of composite entities
- Each primary key combination is allowed once in M:N relationship
- Identifiers of weak entities
- a weak entity has a strong identifying relationship with the parent entity
- When used as identifiers of weak entities, represent a real-world object
that is:
- Existence-dependent on another real-world object
- Represented in the data model as two separate entities in a strong identifying relationship
- SLIDE 3 - The M:N Relationship between STUDENT and CLASS
- Identifiers of composite entities
-
Using Surrogate keys
- Primary key used to simplify the identification of entity instances are
useful when:
- There is no natural key
- Selected candidate key has embedded semantic contents, has security/privacy concerns, or is too long
- Require ensuring that the candidate key of entity in question performs
properly
- Use unique index and not null constraints
- SLIDE 4 - Data Used to Keep Track of Events
- Primary key used to simplify the identification of entity instances are
useful when:
Design Case 1: Implementing 1:1 Relationships
- Foreign keys work with primary keys to properly implement relationships in relational model
-
Rule
- Put primary key of the parent entity on the dependent entity as foreign key
-
Options for selecting and placing the foreign key:
- Place a foreign key in both entities
- Place a foreign key in one of the entities
- Slide 5 - Selection of Foreign Key in a 1:1 Relationship
- Slide 6 - The 1:1 Relationship between Department and Employee
Design Case 2: Maintaining History of Time-Variant Data
- Time-variant data:
- Data whose values change over time and for which a history of the data changes must be retained
- Requires creating a new entity in a 1:M relationship with the original entity
- New entity contains the new value, date of the change, and other pertinent attribute
- Examples from the text
- SLIDE 7 - Maintaining Salary History
- SLIDE 8 - Maintaining Manager History
- SLIDE 9 - Maintaining Job History
Design Case 3: Fan Traps
- These are design traps
- Occurs when a relationship is improperly or incompletely identified
- Represented in a way not consistent with the real world
- Most common example is a Fan trap:
- Occurs when one entity is in two 1:M relationships to other entities
- Produces an association among other entities not expressed in the model
- SLIDE 10 - Incorrect ERD with Fan Trap Problem
- SLIDE 11 - Corrected ERD After Removal of the Fan Trap
Design Case 4: Redundant Relationships
- SLIDE 12 - Redundant relationship
- Occur when there are multiple relationship paths between related entities
- Need to remain consistent across the model
- Help simplify the design