Conflict-free Replicated Data Types
Disclaimer: I'm thinking about an encrypted data store where CRDTs might be useful.
Notes from the paper
State-based solution: the update is applied locally. The state is then transferred to the other replicas, where a merge function is applied.
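A minimal sketch of the state-based idea, assuming a grow-only counter as the data type. The counter, the `GCounter` name and the replica ids are illustrative choices, not something the notes prescribe. Each replica updates only its own slot, ships its whole state, and the receiver merges with an element-wise maximum.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""
    replica_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # The update is applied to the local state only.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so the order and number of state exchanges does not matter.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)   # b's state is transferred to a and merged
b.merge(a)   # and the other way round
b.merge(a)   # merging the same state again changes nothing
```

After the exchange both replicas hold the same `counts`, without either of them having coordinated with the other before updating.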
State-based Convergent Replicated Data Types (CvRDT)
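🤔 Differentiating Eventual Consistency and Strong Eventual Consistency: Eventual Consistency allows a replica to roll back and later replay operations in order to reach the same state as the others. Strong Eventual Consistency does not need this: replicas that have received the same updates are already in the same state. The rollback is a very interesting concept, but I think it is also a kind of merge behaviour, one where the merger can look back in time.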
Operation-based Commutative Replicated Data Types (CmRDT)
Each operation can be broken up into two parts: a side-effect-free _prepare-update_ part, denoted by $t$, and an _effect-update_ part, denoted by $u$.
$$op = t \bullet u$$
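To make the split concrete, here is a hedged sketch that assumes an observed-remove set as the data type. The `ORSet` class and its method names are hypothetical. `prepare_*` plays the role of $t$: it only reads the source replica's state and returns a message. `effect` plays the role of $u$: it is what every replica actually applies.

```python
import uuid


class ORSet:
    """Observed-remove set, op-based. Payload: a set of (element, tag) pairs."""

    def __init__(self):
        self.pairs = set()

    # prepare (t): reads the source replica's state, changes nothing
    def prepare_add(self, element):
        return ("add", element, uuid.uuid4().hex)

    def prepare_remove(self, element):
        observed = frozenset(tag for (e, tag) in self.pairs if e == element)
        return ("remove", element, observed)

    # effect (u): applied at every replica, including the source
    def effect(self, op):
        kind, element, arg = op
        if kind == "add":
            self.pairs.add((element, arg))
        else:
            # Remove only the tags the source had observed.
            self.pairs -= {(element, tag) for tag in arg}

    def value(self):
        return {e for (e, _) in self.pairs}


r1, r2 = ORSet(), ORSet()
add1 = r1.prepare_add("x")
r1.effect(add1)
r2.effect(add1)

# Concurrently: r1 removes "x" (it has seen only add1), r2 adds "x" again.
rem = r1.prepare_remove("x")
add2 = r2.prepare_add("x")

r1.effect(rem)
r1.effect(add2)   # r1 applies remove, then add
r2.effect(add2)
r2.effect(rem)    # r2 applies add, then remove
```

Both replicas end up with the single pair created by `add2`, because the remove message carries exactly the tags that its prepare step observed. The sketch still assumes that a remove is delivered after the adds it observed.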
🤔 $t$ checks the state. This determines the dependencies of the update part $u$ of the operation $op$, so the whole serialization depends on $t$. The question is what we can do about it:
1. We can lock the affected part of the state. That is what we do in the traditional SQL world, where we lock tables, rows and indices. This requires first evaluating $t$ and getting agreement that the relevant part of the state is not being touched by anyone else. Here the CAP theorem comes in, and it is not going to let this work properly when the network is partitioned.
2. We can maintain a dependency list and use it to determine whether the operation $op$ can still be executed on the other side. This solution needs some merging: what if the state diverges in another partition?
3. We can define $t$ (prepare) and $u$ (update) operations that can run in any partition, in any order. In this case there is no need to execute $t$ on a remote replica.
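The dependency-list option can be sketched roughly as follows. This is an assumption about how such a list might be used, not a design from the paper: every operation carries the ids of the operations it depends on, and a replica buffers it until all of them have been applied. The `Replica` class and the operation ids are hypothetical.

```python
class Replica:
    """Applies an operation only once all of its dependencies are applied."""

    def __init__(self):
        self.applied = set()   # ids of operations already applied here
        self.pending = []      # operations still waiting for dependencies
        self.log = []          # order in which effects ran on this replica

    def receive(self, op_id, deps, effect):
        if op_id in self.applied:
            return  # duplicate delivery, nothing to do
        self.pending.append((op_id, frozenset(deps), effect))
        self._drain()

    def _drain(self):
        # Keep applying buffered operations until none of them is ready.
        progress = True
        while progress:
            progress = False
            for item in list(self.pending):
                op_id, deps, effect = item
                if deps <= self.applied:
                    effect(self)
                    self.applied.add(op_id)
                    self.pending.remove(item)
                    progress = True


r = Replica()
# "rename" arrives before the "create" it depends on, so it is buffered.
r.receive("rename", {"create"}, lambda rep: rep.log.append("rename"))
held = list(r.log)
r.receive("create", set(), lambda rep: rep.log.append("create"))
```

This only fixes the order of dependent operations. It says nothing about two concurrent operations that conflict, which is exactly the divergence question raised in point 2.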
Notes for myself
🤔 Vector clocks³ are actually what we are looking for. The question is how we can arrange the state transitions so that every system arrives at the same state. There are different solutions for that:
- We can have a merge solution
- We can limit the operations so that they always arrive at the same state
- We can serialize the operations and not allow anything else (somebody needs to be master)
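For reference, a small sketch of the vector clocks mentioned above, assuming a clock is represented as a plain dict from replica id to counter. The function names are illustrative. A vector clock can tell whether one state happened before another or whether the two are concurrent. It does not decide what to do with concurrent states, which is where the three options in the list come in.

```python
def tick(clock: dict, replica: str) -> dict:
    """Return a copy of `clock` with `replica`'s entry advanced by one."""
    out = dict(clock)
    out[replica] = out.get(replica, 0) + 1
    return out


def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum: the smallest clock that dominates both."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def compare(a: dict, b: dict) -> str:
    """Classify a relative to b: 'equal', 'before', 'after' or 'concurrent'."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in keys)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


start = tick({}, "a")         # replica a makes the first update
left = tick(start, "a")       # a continues on its own
right = tick(start, "b")      # b continues from the same start
joined = merge(left, right)   # clock after the two histories are merged
```

Here `compare(start, left)` yields `"before"`, while `compare(left, right)` yields `"concurrent"`: neither replica has seen the other's update, so some merge rule is needed.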
I'm thinking about how this could be described better. I think the CRDT paper is very good work, but I need a better way to describe the dependencies and a merge resolution algorithm.