Data Consistency Models

A distinguishing factor between database providers and architectures can be the approach to data consistency.

Data consistency properties often fall into two models, known as ACID and BASE.

Each model offers a different set of guarantees.

Relational database management systems (RDBMS) are often built with ACID principles at their core. Adherence to these principles guarantees consistency across relational data. However, as the user base grows, users become geographically dispersed, and availability demands increase, a distributed data store becomes necessary.

While distributed data stores address requirements such as availability, they make data consistency harder to guarantee and ACID principles harder to uphold. This difficulty is best explained by the CAP theorem, which states that a distributed data store cannot guarantee both consistency and availability while a network partition is in effect.

Distributed data stores that choose to remain available during a network partition cannot fully adhere to ACID principles. Many instead follow a different set of principles known as BASE, which primarily focuses on availability.

Therefore, ACID models favor consistency over availability, while BASE models favor availability over consistency.
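
The trade-off can be illustrated with a minimal sketch. It assumes a toy node that either refuses to answer during a partition (favoring consistency) or answers from its local, possibly stale copy (favoring availability). The names Node, Unavailable, and the "CP"/"AP" mode labels are hypothetical and exist only for this illustration; they do not correspond to any real database API.

```python
class Unavailable(Exception):
    """Raised when a node refuses to answer rather than risk a stale reply."""


class Node:
    def __init__(self, mode, value):
        self.mode = mode            # "CP" or "AP" (hypothetical labels)
        self.local_value = value    # last value this node saw
        self.partitioned = False    # True when peers cannot be reached

    def read(self):
        # A consistency-first node gives up availability during a partition.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm latest value during partition")
        # An availability-first node answers, accepting that the value may be stale.
        return self.local_value


cp_node = Node("CP", "v1")
ap_node = Node("AP", "v1")
cp_node.partitioned = ap_node.partitioned = True

ap_result = ap_node.read()
try:
    cp_node.read()
    cp_result = "answered"
except Unavailable:
    cp_result = "unavailable"
```

During the simulated partition, the availability-first node still returns its local value, while the consistency-first node declines to answer.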

Database systems built around the BASE model are often referred to as non-relational database management systems (DBMS) or Not Only SQL (NoSQL) databases.

RDBMS achieve ACID compliance through transactions, which are sequences of database operations bundled into a single unit of work.

ACID properties are defined as:

Atomicity: All operations within a transaction must succeed for the transaction as a whole to succeed. If any operation fails, the entire transaction is rolled back.

Consistency: A transaction can only succeed if it adheres to database-level business rules, such as those imposed by constraints.

Isolation: Transactions are isolated from one another to maintain integrity, preventing one transaction from interfering with another.

Durability: Data committed via a transaction is guaranteed to be permanently preserved, even in the event of a system failure.
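
Atomicity and consistency can be seen together in a small sketch that uses Python's built-in sqlite3 module with an in-memory database. The accounts table, its CHECK constraint, and the transfer and balances helpers are assumptions made up for this example. The transfer deliberately credits the destination first, so that a failed debit shows the earlier credit being rolled back.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as a single unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)         # both updates are committed
rejected = transfer(conn, "alice", "bob", 500)  # constraint fails, credit is undone
```

After the rejected transfer, the balances are unchanged from the state left by the successful one: the constraint enforces consistency, and the rollback of the already-applied credit demonstrates atomicity.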

BASE properties are defined as:

Basic Availability: Achieved via data replication across multiple storage systems. It is highly unlikely that all replicas will fail at once, so others are likely to remain available to serve requests.

Soft State: As data is spread out across multiple replicas, it cannot be guaranteed that a read or write operation is performed on the most up-to-date state of the data.

Eventual Consistency: One replica may not contain the same data as others. However, BASE requires data to be synchronized at some point. Once synchronization is complete, data becomes consistent across replicas, a process known as convergence. That consistency is likely to be temporary, since new writes can cause replicas to diverge again until the next synchronization.
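
Soft state and eventual consistency can be sketched with two in-memory replicas. This is a simplified illustration, not how any particular data store works: the Replica class, the synchronize function, and the use of a single integer version with a "highest version wins" rule are all assumptions made for the example. Real systems typically use more elaborate conflict-resolution schemes.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def synchronize(replicas):
    """Every replica adopts the highest-versioned value for each key."""
    merged = {}
    for replica in replicas:
        for key, (version, value) in replica.data.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    for replica in replicas:
        replica.data = dict(merged)


a, b = Replica(), Replica()
a.write("cart", ["book"], version=1)
b.write("cart", ["book"], version=1)
a.write("cart", ["book", "pen"], version=2)  # this write only reaches replica a

stale = b.read("cart")   # soft state: b still serves the older value
synchronize([a, b])      # convergence
fresh = b.read("cart")   # b now agrees with a
```

Before synchronization, a read from the second replica returns the older cart. After synchronization, both replicas hold the same data until the next write arrives at only one of them.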

The importance of data consistency depends on the nature of the data stored. For example, consistency of financial data may matter more than consistency of statistical data.