Vector clock

Algorithm for partial ordering of events and detecting causality in distributed systems

title: "Vector clock" type: doc version: 1 created: 2026-02-28 author: "Wikipedia contributors" status: active scope: public tags: ["logical-clock-algorithms"] description: "Algorithm for partial ordering of events and detecting causality in distributed systems" topic_path: "philosophy" source: "https://en.wikipedia.org/wiki/Vector_clock" license: "CC BY-SA 4.0" wikipedia_page_id: 0 wikipedia_revision_id: 0

::summary Algorithm for partial ordering of events and detecting causality in distributed systems ::

A vector clock is a data structure used for determining the partial ordering of events in a distributed system and detecting causality violations. Just as in Lamport timestamps, inter-process messages contain the state of the sending process's logical clock. A vector clock for a system of N processes is an array (vector) of N logical clocks, one clock per process; each process keeps a local copy of this global clock array holding the largest values it has learned so far.

::figure[src="https://upload.wikimedia.org/wikipedia/commons/5/55/Vector_Clock.svg" caption="Example of a system of vector clocks. Events in the blue region are the causes leading to event B4, whereas those in the red region are the effects of event B4."] ::

Let VC_i denote the vector clock maintained by process i. The clock updates proceed as follows:

Initially all clocks are zero.
Each time a process experiences an internal event, it increments its own logical clock in the vector by one. For instance, upon an event at process i, it updates VC_{i}[i] \leftarrow VC_{i}[i] + 1.
Each time a process sends a message, it increments its own logical clock in the vector by one (as in the rule above, but not twice for the same event), then pairs the message with a copy of its own vector, and finally sends the pair.
Each time a process receives a message-vector clock pair, it increments its own logical clock in the vector by one and then updates each element of its vector to the maximum of its own value and the corresponding value in the received vector. For example, if process P_i receives a message (m, VC_{j}) from P_j, it first increments its own logical clock by one, VC_{i}[i]\leftarrow VC_{i}[i]+1, and then updates its entire vector by setting VC_{i}[k]\leftarrow \max(VC_{i}[k], VC_{j}[k]), \forall k.
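
The update rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the class name VectorClockProcess and its method names are hypothetical, and the sketch assumes a fixed number of processes n that is known in advance, with processes identified by the indices 0 to n-1.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VectorClockProcess:
    """One process in a fixed-size system of n processes (hypothetical helper)."""
    pid: int                                   # index of this process, 0 <= pid < n
    n: int                                     # total number of processes (assumed fixed)
    clock: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Initially all clocks are zero.
        if not self.clock:
            self.clock = [0] * self.n

    def internal_event(self) -> List[int]:
        # Internal event: increment this process's own component by one.
        self.clock[self.pid] += 1
        return list(self.clock)

    def send(self, message: str) -> Tuple[str, List[int]]:
        # Send: increment own component, then pair the message with a copy of the vector.
        self.clock[self.pid] += 1
        return message, list(self.clock)

    def receive(self, pair: Tuple[str, List[int]]) -> str:
        # Receive: increment own component, then take the element-wise maximum.
        message, other = pair
        self.clock[self.pid] += 1
        self.clock = [max(a, b) for a, b in zip(self.clock, other)]
        return message


# Usage: process 0 performs an internal event, then sends to process 1.
p0 = VectorClockProcess(pid=0, n=3)
p1 = VectorClockProcess(pid=1, n=3)
p0.internal_event()        # p0.clock is now [1, 0, 0]
pair = p0.send("hello")    # p0.clock is now [2, 0, 0]
p1.receive(pair)           # p1.clock is now [2, 1, 0]
```

The sender attaches a copy of its vector rather than the list itself, so later local events do not alter a message that is already in flight.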

History

Lamport originated the idea of logical clocks in 1978. However, the logical clocks in that paper were scalars, not vectors. The generalization to vector time was developed several times, apparently independently, by different authors in the early 1980s. At least six papers contain the concept.

The papers canonically cited in reference to vector clocks are Colin Fidge’s and Friedemann Mattern’s 1988 works, as they (independently) established the name "vector clock" and the mathematical properties of vector clocks.

Partial ordering property

Vector clocks allow for the partial causal ordering of events. Defining the following:

VC(x) denotes the vector clock of event x, and VC(x)_z denotes the component of that clock for process z.
VC(x) < VC(y) \iff \forall z\, [VC(x)_z \le VC(y)_z] \land \exists z'\, [VC(x)_{z'} < VC(y)_{z'}]
- In English: VC(x) is less than VC(y) if and only if VC(x)_z is less than or equal to VC(y)_z for all process indices z, and at least one of those relationships is strictly smaller (that is, VC(x)_{z'} < VC(y)_{z'} for some z').
x \to y denotes that event x happened before event y. It is defined as: if x \to y, then VC(x) < VC(y).
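
The comparison described in English above can be sketched as two small Python functions. The names vc_less_than and vc_concurrent are hypothetical, and the sketch assumes both clocks are sequences of equal length indexed by process.

```python
from typing import Sequence


def vc_less_than(x: Sequence[int], y: Sequence[int]) -> bool:
    """True if x < y: every component is <=, and at least one is strictly <."""
    if len(x) != len(y):
        raise ValueError("vector clocks must have the same length")
    pairs = list(zip(x, y))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def vc_concurrent(x: Sequence[int], y: Sequence[int]) -> bool:
    """True if the clocks differ and neither is less than the other."""
    return list(x) != list(y) and not vc_less_than(x, y) and not vc_less_than(y, x)


# Usage
vc_less_than([2, 0, 0], [2, 1, 0])    # True: the second clock dominates the first
vc_less_than([2, 1, 0], [2, 0, 0])    # False
vc_concurrent([2, 0, 0], [0, 1, 0])   # True: neither clock dominates the other
```

Because the order is only partial, two distinct clocks may be incomparable; the sketch reports that case as concurrent rather than forcing an order.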

Properties:

Antisymmetry: if VC(a) < VC(b), then ¬(VC(b) < VC(a))
Transitivity: if VC(a) < VC(b) and VC(b) < VC(c), then VC(a) < VC(c); or, if a \to b and b \to c, then a \to c

Relation with other orders

Let RT(x) be the real time when event x occurs. If VC(a) < VC(b), then RT(a) < RT(b)
Let C(x) be the Lamport timestamp of event x. If VC(a) < VC(b), then C(a) < C(b)

Limitations under Byzantine failures

Vector clocks can reliably detect causality in distributed systems subject to crash failures. However, when processes behave arbitrarily or maliciously, as in the Byzantine failure model, causality detection becomes fundamentally impossible (Misra and Kshemkalyani, "Detecting Causality in the Presence of Byzantine Processes: There is No Holy Grail", 2022 IEEE 21st International Symposium on Network Computing and Applications (NCA), IEEE, 2022, pp. 73–80, doi:10.1109/NCA57778.2022.10013644), rendering vector clocks ineffective in such environments. This impossibility result holds for all variants of vector clocks, as it stems from core limitations inherent to the problem of causality detection under Byzantine faults.

Other mechanisms

In 1984, Wuu and Bernstein described a technique extended from vector clocks known as Matrix Clocks. By maintaining a matrix where each row corresponds to the vector clock of a peer, processes can estimate the minimum knowledge held by all other nodes. This allows for the calculation of a "lower bound" on global progress, enabling the safe truncation of operation logs (garbage collection) in replicated databases.
In 1999, Torres-Rojas and Ahamad developed Plausible Clocks (Francisco Torres-Rojas and Mustaque Ahamad, "Plausible clocks: constant size logical clocks for distributed systems", Distributed Computing 12 (4), 1999, pp. 179–195, doi:10.1007/s004460050065), a mechanism that takes less space than vector clocks but that, in some cases, will totally order events that are causally concurrent.
In 2005, Agarwal and Garg created Chain Clocks, a system that tracks dependencies using vectors smaller than the number of processes and that adapts automatically to systems with a dynamic number of processes.
In 2008, Almeida et al. introduced Interval Tree Clocks (Paulo Almeida, Carlos Baquero and Victor Fonte, "Interval Tree Clocks: A Logical Clock for Dynamic Systems", in Principles of Distributed Systems, Lecture Notes in Computer Science, vol. 5401, Springer-Verlag, 2008, pp. 259–274, doi:10.1007/978-3-540-92221-6). This mechanism generalizes vector clocks and allows operation in dynamic environments where the identities and number of processes in the computation are not known in advance.
In 2019, Lum Ramabaja proposed Bloom Clocks, a probabilistic data structure based on Bloom filters. Compared to a vector clock, the space used per node is fixed and does not depend on the number of nodes in a system. Comparing two clocks either produces a true negative (the clocks are not comparable), or else a suggestion that one clock precedes the other, with the possibility of a false positive where the two clocks are unrelated. The false positive rate decreases as more storage is allowed.
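
As an illustration of the matrix clock idea described above, the "lower bound" on global progress can be sketched as a column-wise minimum over the rows of the matrix. This is a simplified sketch under stated assumptions, not the algorithm as published: the function name stable_lower_bound is hypothetical, and matrix[j][k] is assumed to hold this process's view of how many events of process k are known to process j.

```python
from typing import List


def stable_lower_bound(matrix: List[List[int]]) -> List[int]:
    """Column-wise minimum over the rows of a matrix clock.

    Row j is this process's estimate of process j's vector clock, so the
    minimum of column k bounds from below what every process is believed
    to know about process k.
    """
    if not matrix or any(len(row) != len(matrix[0]) for row in matrix):
        raise ValueError("matrix clock must be non-empty and rectangular")
    return [min(column) for column in zip(*matrix)]


# Usage: three processes, each row is the estimated vector clock of one peer.
matrix = [
    [3, 1, 2],
    [2, 4, 2],
    [1, 1, 5],
]
bound = stable_lower_bound(matrix)   # [1, 1, 2]
```

Under these assumptions, a log entry produced by process k whose sequence number is at most bound[k] is believed to be known to every process, which is the condition under which it could be truncated.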

Applications

Modern distributed systems utilize variations of vector clocks to enforce causal ordering of transactions without relying on a central wall-clock time. For instance, the Cerberus protocol uses logical clocks to track the state "version" of each shard. When a transaction spans multiple shards, the vector of these logical clocks allows the network to validate that the transaction is interacting with the most current state of all involved assets, enabling atomic composability in an adversarial environment.

References

"Distributed Systems 3rd edition (2017)".
(1978). "Time, clocks, and the ordering of events in a distributed system". Communications of the ACM.
(March 1994). "Detecting causal relationships in distributed computations: In search of the holy grail". Distributed Computing.
Fidge, Colin J. (February 1988). "Timestamps in message-passing systems that preserve the partial ordering".
(1984). "Proceedings of the third annual ACM symposium on Principles of distributed computing - PODC '84".
(17 July 2005). "Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing". Association for Computing Machinery.
(2008). "Interval Tree Clocks: A Logical Clock for Dynamic Systems".
(2014). "Background Preliminaries: Interval Tree Clock Results".
(1 April 2021). "Resettable Encoded Vector Clock for Causality Analysis With an Application to Dynamic Race Detection". IEEE Transactions on Parallel and Distributed Systems.
(2019). "The Bloom Clock".
(4 January 2022). "Proceedings of the 23rd International Conference on Distributed Computing and Networking".
(2021). "Cerberus: Minimalistic Multi-shard Byzantine-resilient Transaction Processing". Proceedings of the VLDB Endowment.

::callout[type=info title="Wikipedia Source"] This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page. ::