Inferensys

Glossary

Conflict-Free Replicated Data Type (CRDT)

A Conflict-Free Replicated Data Type (CRDT) is a data structure designed for eventual consistency, allowing replicas to be updated independently and concurrently without coordination, and to be merged automatically.
Data scientist building training data pipeline on laptop, data preprocessing visible, technical workspace.
SELF-HEALING SOFTWARE SYSTEMS

What is a Conflict-Free Replicated Data Type (CRDT)?

A foundational data structure for building autonomous, resilient systems that can synchronize state without central coordination.

A Conflict-Free Replicated Data Type (CRDT) is a specialized data structure designed for eventual consistency in distributed systems, allowing independent replicas to be updated concurrently without coordination and to be merged automatically into a deterministic final state. This property is achieved through mathematical commutativity, associativity, and idempotence, ensuring that all operations converge regardless of the order they are received. CRDTs are a core enabler of self-healing software systems, providing the underlying state synchronization for autonomous agents and collaborative applications.

Common CRDT implementations include G-Counters for increment-only operations, PN-Counters for positive and negative increments, G-Sets for add-only collections, and LWW-Registers that use timestamps for last-write-wins resolution. These structures are fundamental to peer-to-peer applications, collaborative editing, and fault-tolerant agent design, as they allow subsystems to operate independently during network partitions and heal automatically upon reconnection. Their deterministic merge logic provides a robust foundation for agentic rollback strategies and iterative refinement protocols within autonomous systems.

SELF-HEALING SOFTWARE SYSTEMS

Key Properties of CRDTs

Conflict-Free Replicated Data Types (CRDTs) are foundational data structures for building eventually consistent, fault-tolerant distributed systems. Their design ensures replicas can operate independently and merge automatically, enabling self-healing architectures.

01

Commutativity

The core mathematical property that enables conflict-free merging. An operation is commutative if the order of application does not affect the final result. For CRDTs, all concurrent operations must be commutative, ensuring that replicas converge to the same state regardless of the order in which they receive updates.

  • Example: In a G-Counter (Grow-only Counter), two replicas each increment locally. The merge operation is addition (+), which is commutative: merge(1, 2) = 1 + 2 = 3 and merge(2, 1) = 2 + 1 = 3.
  • Non-Example: A standard list append operation is not commutative. If Replica A adds X and Replica B adds Y concurrently, the final sequence [X, Y] vs. [Y, X] depends on merge order, leading to conflict.
02

Idempotence

The property that applying the same operation multiple times has the same effect as applying it once. This is crucial for handling duplicate messages in unreliable networks. State-based CRDTs (CvRDTs) achieve this by designing merge functions that are idempotent, associative, and commutative.

  • Mechanism: The merge function compares two states and computes a least upper bound (LUB) in a partially ordered set. Applying the merge function with the same state repeatedly yields an identical result: merge(state, state) = state.
  • Practical Impact: It allows safe retransmission of updates without corrupting the final state, a key requirement for fault-tolerant communication in distributed systems.
03

Eventual Consistency

The guarantee that if all updates cease, all replicas will eventually converge to the same semantically correct value. CRDTs provide strong eventual consistency (SEC), a stronger guarantee than basic eventual consistency, as convergence is deterministic and does not require conflict resolution.

  • Convergence Time: Convergence is bounded only by network latency. In practice, with a stable network, replicas synchronize in < 1 second to a few seconds.
  • Contrast with Strong Consistency: Unlike strongly consistent systems (e.g., using Raft or Paxos), CRDTs do not require immediate coordination or a leader. This provides high availability and partition tolerance, aligning with the CAP theorem trade-off where Availability and Partition tolerance are prioritized over immediate Consistency.
04

Operation-Based vs. State-Based

The two principal designs for CRDTs, differing in what is transmitted between replicas.

  • Operation-Based CRDTs (CmRDTs): Transmit the operations (e.g., increment, add(element)). They require a reliable, ordered broadcast for non-commutative ops, which can be complex. Example: A PN-Counter (Positive-Negative Counter) sending increment() and decrement() messages.
  • State-Based CRDTs (CvRDTs): Transmit the entire state (or a delta). The merge function on the receiving side reconciles states. This is simpler as it only requires an unreliable channel (messages can be lost or duplicated, but convergence is still guaranteed). Example: Transmitting the full set of a G-Set (Grow-only Set).

Hybrid approaches (delta-state CRDTs) send compressed state changes to optimize bandwidth.

05

Monotonic Semilattice

The formal algebraic structure underpinning state-based CRDTs. The set of all possible states forms a join-semilattice: a partially ordered set where every pair of elements has a least upper bound (LUB).

  • Partial Order: Defines a "happened-before" relationship between states (e.g., a counter state of 5 is greater than a state of 3).
  • Merge Function as Join: The merge function computes the LUB (join) of two states, moving the state monotonically upward in the lattice. This guarantees convergence.
  • Example - Max-Value Register: The state is a single integer. The partial order is standard numeric order (). The merge function is max(a, b). The LUB of 3 and 5 is 5. The state can only increase, ensuring monotonic growth and conflict-free merges.
06

Common Types and Use Cases

Specific CRDT implementations solve common distributed data problems.

  • Counters:
    • G-Counter (Grow-only): For counting likes, views. Only increments.
    • PN-Counter (Positive-Negative): For scores, balances that can go up and down.
  • Registers:
    • LWW-Register (Last-Write-Wins): Uses timestamps to resolve concurrent writes to a single value. Common but can lose data.
  • Sets:
    • G-Set (Grow-only): For collaborative allow-lists. Elements only added.
    • 2P-Set (Two-Phase Set): Has an add set and a remove set for safe deletion.
    • OR-Set (Observed-Removed Set): The most practical general-purpose set. Tracks unique "tokens" for each add to allow safe removal of only observed elements.
  • Sequences & Text: (e.g., RGA - Replicated Growable Array, Logoot). Used in real-time collaborative editors like those in Google Docs or Figma, where concurrent inserts and deletes must be merged seamlessly.
SELF-HEALING SOFTWARE SYSTEMS

How Do CRDTs Work?

Conflict-Free Replicated Data Types (CRDTs) are foundational data structures for building resilient, eventually consistent distributed systems without centralized coordination.

A Conflict-Free Replicated Data Type (CRDT) is a data structure designed for eventual consistency, allowing independent replicas to be updated concurrently without coordination and merged automatically into a deterministic final state. This is achieved through mathematical properties like commutativity, associativity, and idempotence, ensuring that all operations can be applied in any order and merged without conflict. CRDTs are a core enabler of self-healing software systems, allowing autonomous agents to maintain synchronized state across partitions.

CRDTs operate through two primary designs: state-based (or convergent) and operation-based (or commutative). State-based CRDTs periodically transmit their entire state, merging via a monotonic join semilattice function that computes a least upper bound. Operation-based CRDTs transmit only the operations, which must be commutative to guarantee convergence. This architecture provides the fault tolerance and partition resilience required for systems like collaborative editors, distributed databases, and agentic memory backends where strict coordination is impossible.

SELF-HEALING DATA STRUCTURES

Common CRDT Examples and Use Cases

CRDTs are foundational to building eventually consistent, self-healing systems. Below are key types and their practical applications in distributed and autonomous software.

01

G-Counter (Grow-Only Counter)

A state-based CRDT that implements a distributed counter which can only be incremented. Each replica maintains a vector of counts (one per replica). The merge function takes the element-wise maximum, and the total count is the sum of all vector entries.

  • Key Property: Commutative and idempotent merges.
  • Use Case: Counting website page views, likes, or any monotonic metric across data centers where counts can't decrease.
  • Example: Replica A increments to [3,0], Replica B increments to [0,2]. Merge yields [3,2] for a total of 5.
02

PN-Counter (Positive-Negative Counter)

A state-based CRDT that supports both increments and decrements. It is implemented as two G-Counters: one for increments (P) and one for decrements (N). The total value is the sum of the P counter minus the sum of the N counter.

  • Key Property: Achieves a replicated up/down counter.
  • Use Case: Shopping cart item quantities, voting systems (upvotes/downvotes), or inventory tracking where items are added and removed concurrently.
  • Example: Replica A: P=[2,0], N=[1,0] (net=1). Replica B: P=[0,3], N=[0,1] (net=2). Merge yields P=[2,3], N=[1,1] for a total net of (5-2)=3.
03

G-Set (Grow-Only Set)

The simplest CRDT: a set that only supports add operations. The merge function is the set union, which is commutative, associative, and idempotent.

  • Key Property: Elements can never be removed, ensuring convergence.
  • Use Case: Logging unique events (e.g., user IDs that have seen a notification), collecting tags, or building an immutable append-only collection of items in a distributed system.
04

2P-Set (Two-Phase Set)

A set that supports add and remove operations without conflicts. It consists of two G-Sets: a add-set (A) for all added elements and a remove-set (R) for all removed elements. An element is present in the 2P-Set if it is in A and not in R.

  • Key Property: An element can only be removed once, and cannot be re-added after removal (tombstone).
  • Use Case: Distributed unique user session management, where a session is added on login and removed on logout, and re-adding the same session ID is not required.
05

OR-Set (Observed-Removed Set)

A op-based CRDT that supports add and remove operations and allows re-adding removed elements. Each element is tagged with a unique identifier (e.g., a UUID). To remove an element, you remove only the specific instances (tags) you have observed.

  • Key Property: Enables correct removal in the face of concurrent adds. This is the most commonly used set CRDT.
  • Use Case: Collaborative document editing (managing a list of characters or objects), distributed shopping carts where items can be re-added, and real-time collaborative to-do lists.
06

LWW-Register (Last-Write-Wins Register)

A state-based CRDT that implements a register (holds a single value) using timestamps or version vectors to resolve conflicts. The value with the greatest timestamp wins. Requires synchronized clocks or logical timestamps for determinism.

  • Key Property: Simple but requires a monotonic, globally meaningful timestamp source.
  • Use Case: Storing user profile fields (e.g., display name, status) where the latest update is acceptable, or caching a single configuration value across replicas.
  • Note: Can lose updates if clocks are skewed, making Multi-Value Register (MV-Register) a safer alternative when preserving all concurrent writes is necessary.
DATA CONSISTENCY MODELS

CRDTs vs. Other Consistency Approaches

A comparison of data consistency models and their trade-offs in distributed, self-healing systems, focusing on coordination, availability, and merge behavior.

Feature / CharacteristicConflict-Free Replicated Data Types (CRDTs)Strong Consistency (e.g., ACID, 2PC)Eventual Consistency (e.g., Dynamo-style)

Consistency Guarantee

Strong Eventual Consistency (SEC)

Immediate Consistency

Eventual Consistency

Coordination Required for Writes

Availability During Network Partitions

Automatic Merge/Conflict Resolution

Merge Operation Semantics

Deterministic, based on mathematical properties (commutative, associative, idempotent)

N/A - Requires serializable order

Last Write Wins (LWW) or requires application-level resolution

Typical Latency for Converged State

< 1 sec after message receipt

High (requires quorum/coordination)

Seconds to minutes, depends on anti-entropy

Data Type Flexibility

Specific mathematically-defined types (counters, sets, registers, graphs)

Any type (enforced by transactions)

Any type, but conflicts are likely

Ideal Use Case

Collaborative apps, offline-first systems, real-time sync, autonomous agent state

Financial transactions, inventory systems where correctness is paramount

User profile data, session caches, non-critical metadata

SELF-HEALING SOFTWARE SYSTEMS

Frequently Asked Questions About CRDTs

Conflict-Free Replicated Data Types (CRDTs) are foundational data structures for building eventually consistent, fault-tolerant distributed systems. This FAQ addresses common technical questions about their operation, guarantees, and practical applications in autonomous and self-healing architectures.

A Conflict-Free Replicated Data Type (CRDT) is a data structure designed for eventual consistency in distributed systems, allowing independent replicas to be updated concurrently without coordination and to be merged automatically into a deterministic, conflict-free final state.

CRDTs achieve this through mathematical properties that guarantee convergence. There are two primary families: State-based CRDTs (CvRDTs), which exchange their full state and merge it using a commutative, associative, and idempotent join function; and Operation-based CRDTs (CmRDTs), which broadcast commutative operations that are applied in a delivery-order-agnostic manner. This design makes them a cornerstone for fault-tolerant and self-healing systems where network partitions are expected, as replicas can continue operating offline and sync seamlessly upon reconnection.

Prasad Kumkar

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.