Thom's Blog

Store-then-reference

Failure pattern – Prevent dangling references

Context

Some operations need to write data to an external system and write a reference to that data in another system (often a local database). A reference that points to data that does not exist usually represents an invalid system state.

Prerequisites

It is acceptable to end up with unreferenced “garbage” data. The data must only ever be accessed through a reference, so unreferenced data is never visible to readers.

Examples

Uploading a profile image on a social network.

Problem

How do we prevent dangling references if writing the data fails?

Solution

First store the data, then store the reference. Write any update to the data as a new, separate object rather than overwriting the original, so that the data store is append-only.
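As a minimal sketch of that ordering, assuming an in-memory stand-in for each system: `BlobStore`, `ReferenceStore` and `upload_profile_image` are hypothetical names made up for illustration, not part of any real library. A real external store and database would fail in more interesting ways, but the order of the two writes is the point.

```python
import uuid


class BlobStore:
    """Hypothetical stand-in for the external system (e.g. an object store)."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        # Append-only: every write gets a fresh key, nothing is overwritten.
        key = uuid.uuid4().hex
        self.blobs[key] = data
        return key


class ReferenceStore:
    """Hypothetical stand-in for the local database that holds references."""

    def __init__(self):
        self.refs = {}

    def set(self, owner: str, key: str) -> None:
        self.refs[owner] = key


def upload_profile_image(blobs, refs, user_id: str, image: bytes) -> str:
    # Step 1: store the data. If this raises, no reference has been written.
    key = blobs.put(image)
    # Step 2: store the reference. If this raises, the blob is left behind
    # as unreferenced garbage, but no reference points at missing data.
    refs.set(user_id, key)
    return key
```

Uploading a second image for the same user writes a new blob under a new key and then repoints the reference; the old blob stays where it is until something cleans it up.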

This is similar to Multiversion Concurrency Control (MVCC) in databases, where instead of updating a row in place, a new version is written along with the associated transaction ID. Other transactions do not see this new version until that transaction ID is marked as committed.

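This operation is naturally resumable: if it fails partway through, it can be retried from the start, because an interrupted attempt leaves behind at most some unreferenced data. Garbage collection can be used to clean up stale, unreferenced data.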
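The cleanup could look something like the sketch below. It assumes each stored blob carries a creation timestamp and that the full set of referenced keys can be listed; the `collect_garbage` function and the `min_age_seconds` grace period are my own illustrative choices. The grace period is there so that a blob which has just been written, but whose reference has not been stored yet, is not deleted out from under an in-flight operation.

```python
import time


def collect_garbage(blobs, referenced_keys, min_age_seconds, now=None):
    """Delete unreferenced blobs older than min_age_seconds.

    blobs: dict mapping key -> (data, created_at timestamp in seconds)
    referenced_keys: set of keys that some reference currently points to
    Returns the sorted list of deleted keys.
    """
    now = time.time() if now is None else now
    deleted = []
    # Iterate over a snapshot so we can delete from the dict while looping.
    for key, (_data, created_at) in list(blobs.items()):
        if key in referenced_keys:
            continue  # still in use
        if now - created_at < min_age_seconds:
            continue  # too young: its reference may not be written yet
        del blobs[key]
        deleted.append(key)
    return sorted(deleted)
```

Because readers only ever reach data through a reference, deleting an old unreferenced blob cannot break anything a reader can see.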

See also