The key reason I see permanency discussed is to answer two core questions:
- How do we manage the life cycle of an entity/entity set?
- How do we reflect an entity/ entity set phases of maturity for integration with outside applications? In simpler terms: How are applications to validate the entities are applicable to whatever operation the application wishes to proceed with?
Database Design Theory related to Permanency
- Authorization/Ownership- Some type of ownership exists over an entity/entity set handled outside the database and may be modeled through relationships with other entities/entity sets. Often deletes are handled through a limited process.
- Discriminating field(s)- the entity has fields used to filter records based on some criteria. Discriminators may be used to get around requirements which state data is permanent or least long lived.
- Atomicity- An entity set is normalized so portions represented as entity records can be deleted based atomic nature of each entity within a set. That is data that should be deleted or updated together is stored in the same entity. This is different from using fields for discrimination because the application makes decisions based on the existence of a record in a particular entity related to a entity under action for describe stages within a particular life cycle.
- Transformation- As data moves through the life cycle it is transformed from one entity set to another. The schema design is not necessarily the same from one set to another. Some form of data transformation may occur during the transition or it maybe a reflection.
Options for Managing Permanency
I mention SQL Server syntax in all caps though other technologies are similar in concept if not syntax.
- Using built in ON CASCADE DELETE- This is a good default if you can manage Authorization and Ownership. My usually recommendation is to default with this and opt in to other strategies on a case by case basis. A domain may never want to delete anything for particular entities. The suggestion here is to work up a justification for not Cascading. If you never plan to delete than everything that references that entity is also severely limited.
- ON CASCADE SET NULL/SET DEFAULT- Setting an foreign key field to NULL or to some default can be valid in certain cases. Maybe you have a 'Unassigned' record that allows a user to reassign after a delete to the new valid record.
- Analyze Dependencies there being none Delete - When a delete screen is used in this scenario the screen allows delete unless the entity is referenced in a particular way. Essentially there can be varying levels of what if analysis. A user can be allowed to override the analysis depending the scenario.
- Polling- You notify Owners of a pending delete and the delete proceeds after some criteria is met. Examples of criteria include delete on expiration of some timeout after notification or another possibility is unanimous consent by owners of entities that reference said data.
- Custom Triggers/Delete Routine- Whether there is a standardized strategy across the database or not, eventually you will have delete something. Maybe you manage the delete routine/script on an entity by entity. There is danger here if you standard is custom always or if the stakeholders are not making the call if you do.
- Soft Deletes- Using Transformation or Discriminators as described above allows data to disappear without an actual delete occurring. Danger is also here if you do not have consistency.