The "evo" project, which has been discontinued by its creator, represents an ambitious attempt to replace Git with a CRDT (Conflict-Free Replicated Data Type) based version control system. While the vision is compelling—automatic merge conflict resolution and stable file identities—the project faces several insurmountable technical limitations that prevent it from scaling to compete with Git.[1]
The most critical limitation is unbounded memory growth. CRDTs require storing extensive metadata to enable automatic conflict resolution, and this overhead becomes prohibitive at scale.[2][3]
The Tombstone Problem: When text is deleted in a CRDT, the characters don't truly disappear—they become "tombstones" that persist forever in the data structure. This is necessary to correctly order incoming changes that reference deleted positions. For a version control system tracking thousands of commits across years, these tombstones accumulate indefinitely. As one researcher documented, "the size of the local memory in each node of the CRDT-based system grows continuously...this accumulation of historical data includes deleted content".[4][2]
For evo's RGA (Replicated Growable Array) CRDT specifically, each line operation requires storing:
- A Lamport timestamp
- A node ID
- A line ID
- Parent dependencies
- The actual content (or old content for reverts)
This metadata overhead is typically 64 bytes minimum per operation, meaning a simple keystroke can consume 10KB or more once the full CRDT structure is accounted for. In contrast, Git's delta compression can represent changes in just a few bytes.[3][5]
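To make the overhead concrete, here is a minimal sketch of what one line-level RGA operation has to carry, per the metadata list above (field names are illustrative, not evo's actual binary layout):

```python
from dataclasses import dataclass

@dataclass
class RgaOp:
    """One line-level RGA operation carrying the metadata listed above.
    Field names are illustrative, not evo's actual binary layout."""
    lamport: int           # Lamport timestamp for causal ordering
    node_id: str           # replica that produced the operation
    line_id: str           # stable identity of the line
    parent_id: str         # dependency: the line this one was inserted after
    content: str           # line text (retained even after deletion)
    deleted: bool = False  # a delete only flips this flag: a tombstone

log = [RgaOp(1, "nodeA", "L1", "ROOT", "hello")]
log[0].deleted = True   # "deleting" keeps the op and all its metadata
assert len(log) == 1    # nothing is ever freed from the log
```

Because a delete only flips a flag, the log never shrinks; storage cost tracks total edit history rather than current file size.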
CRDTs suffer from non-linear performance degradation as repositories grow. Benchmarks show that while CRDT systems may start fast, they slow dramatically over time:[5][6]
- Automerge (a popular CRDT implementation) processes roughly 900 edits per second initially, but in large documents a single edit can stall V8 for 1.8 seconds[5]
- Peak memory usage can reach 2.6GB for just 260,000 edits[5]
- The "every operation matters forever" model means performance continually degrades
Git, by contrast, maintains relatively constant performance through:
- Packfiles with delta compression
- Shallow clones that don't require full history
- Efficient indexing using SHA-1 hashing
- Local operations that don't traverse entire history
The Linux kernel repository—with 1.4 million commits—occupies only 5.5GB while maintaining excellent performance. A CRDT-based system would struggle immensely at this scale.[7]
Git's snapshot-based model is fundamentally more efficient than operation-based CRDTs:[8][9][7]
Git's Advantages:
- Commits are immutable snapshots, not accumulated operations. Git doesn't need to replay every keystroke; it just stores compressed versions of file states
- Delta compression in packfiles dramatically reduces storage. Git only stores differences between similar objects
- SHA-1 content addressing provides efficient deduplication and integrity checking
- Garbage collection can truly remove unreferenced data
- Shallow operations allow working with recent history without loading everything
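The storage gap can be illustrated with a toy delta: storing a new revision as a compressed diff against its predecessor (the idea behind Git's packfiles, though real Git uses its own binary delta format) is far smaller than storing the revision outright:

```python
import difflib
import zlib

old = "\n".join(f"line {i}" for i in range(1000))
new = old.replace("line 500", "line 500 changed", 1)

# Snapshot-plus-delta: store the new revision as a diff against the old
# one, then compress it (the packfile idea, simplified).
delta = "".join(difflib.unified_diff(old.splitlines(True),
                                     new.splitlines(True)))
packed_delta = zlib.compress(delta.encode())

# Storing the whole new revision, even compressed, costs far more.
packed_full = zlib.compress(new.encode())
assert len(packed_delta) < len(packed_full)
```

An operation-based CRDT, by contrast, must keep every individual edit that produced `new`, so its storage grows with edit count rather than with the size of the change.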
Evo's Disadvantages:
- Operation logs grow unbounded. Every single change ever made must be kept
- No garbage collection possible without breaking CRDT convergence guarantees
- All operations must be traversable to compute current state
- Metadata overhead of 64+ bytes per operation vs. Git's ~50 bytes per commit snapshot
- Binary format complexity doesn't solve the fundamental mathematical constraints
Evo's CRDT approach also faces challenges with distributed replication:[3]
- Naïve reconciliation requires multiple network roundtrips to fetch missing parent dependencies
- No efficient negotiation protocol like Git's sophisticated push/pull algorithms
- Each replica needs complete operation history rather than just commits
- Network overhead from transmitting verbose CRDT metadata
Git's negotiation protocol, refined over 20 years, efficiently determines what objects need transfer using algorithms like bitmap indexing and multi-ack protocols.
The promise of "zero merge conflicts" is somewhat misleading. CRDTs don't eliminate conflicts—they hide them through automatic resolution.[10][11][12][1]
For text editing, this might work (characters inserted concurrently simply appear in some order). But for source code:
- Automatic merges can create syntactically invalid code
- Semantic conflicts (two changes that individually work but together break functionality) are invisible
- Developers lose the opportunity to consciously resolve conflicts
As one developer noted: "you don't actually want your repo to be a CRDT because a CRDT resolves all conflicts and that would mean merge conflicts get resolved in an arbitrary way (leading to unexpected results and bad code)".[11]
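A toy illustration of the point, assuming a naive line-level auto-merge (not evo's actual merge code): each side's edit is individually valid, the merge reports no conflict, and the result is broken:

```python
# Base version both developers start from.
base = ["def greet():", "    pass", "", "greet()"]

# Dev A renames the function; Dev B adds another call to the old name.
dev_a = ["def say_hello():", "    pass", "", "say_hello()"]
dev_b = ["def greet():", "    pass", "", "greet()", "greet()"]

# A CRDT-style merge keeps both sides' changes with no conflict markers:
# A's edits to the existing lines, plus B's appended line.
merged = dev_a + dev_b[len(base):]

# The merge "succeeded", but the result calls a name that no longer exists.
assert "greet()" in merged
assert "def greet():" not in merged
```

Nothing in the merge machinery can flag this: both replicas converge to the same, semantically broken, document.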
Beyond technical limitations, evo faces adoption barriers:
- Incomplete implementation: The README itself notes features like "server-based PR flows," "performance optimizations," and "packfiles" were never completed[1]
- Binary compatibility: Git's ecosystem (GitHub, GitLab, Bitbucket, CI/CD tools) is deeply entrenched
- No migration path: Converting Git repositories to evo would be extremely difficult
- Team coordination: Git's branching model, while complex, is well-understood by millions of developers
Even more mature CRDT-based version control systems like Pijul face similar challenges:[13][11]
- Still in alpha/experimental status after years of development
- Format changes and repository corruption issues damaged credibility
- Performance problems with certain merge scenarios
- Small ecosystem compared to Git's massive tooling infrastructure
Evo cannot scale versus Git because:
- Memory grows unbounded due to tombstones and operation logs
- Performance degrades non-linearly as history accumulates
- Network sync is inefficient compared to Git's protocols
- Storage overhead is 10-100x worse than Git's compressed snapshots
- No true garbage collection is possible without breaking CRDT guarantees
- The metadata-to-content ratio becomes increasingly unfavorable
Git's 20 years of optimization have produced a system finely tuned for version control specifically, with features like packfiles, partial clones, shallow history, and efficient network protocols. CRDTs solve a different problem (real-time collaborative editing) and force compromises that make them unsuitable for version control at scale.[14][15][23]
The creator's decision to discontinue the project reflects these fundamental technical barriers rather than just the harassment mentioned in the README. While academically interesting, CRDT-based version control remains impractical for production use.[1]
Yes, theoretically, every challenge identified can be overcome, but doing so requires moving beyond the "naive" CRDT implementation used in Evo.
The technical solutions exist in state-of-the-art research and newer projects (like Loro, Automerge 2.0, and Pijul), but implementing them essentially requires building a completely different system than what Evo currently is.
Here is how the specific limitations of Evo could be solved:
The Challenge: In Evo, every deleted character leaves a "tombstone" (metadata) behind so future merges can be placed correctly. This causes the repo to grow forever, even if you delete all files.
The Solution: Garbage Collection (GC) with Consensus. You can delete tombstones, but only if you can prove that every other copy of the repository has also seen the deletion.
- How it works: The system tracks a "stability vector"—a list of which versions every user has synced. Once a deletion is "stable" (synced to everyone), the tombstone can be safely purged from disk.
- The Trade-off: This breaks the "pure offline" promise. You effectively need a "garbage collection synchronization" phase where peers talk to each other (or a central server) to agree on what can be deleted.
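A minimal sketch of the idea, with illustrative names (not from any real implementation): a tombstone may be purged only once every replica's entry in the stability vector has advanced past the deletion's timestamp:

```python
def stable(delete_ts: int, stability_vector: dict) -> bool:
    """True if every replica has acknowledged all ops up to delete_ts."""
    return all(synced >= delete_ts for synced in stability_vector.values())

def purge_tombstones(tombstones, stability_vector):
    """Keep only tombstones that some replica might still need."""
    return [t for t in tombstones if not stable(t["ts"], stability_vector)]

tombs = [{"id": "L1", "ts": 5}, {"id": "L2", "ts": 12}]
# Replica C has only synced up to op 9, so the deletion at ts=12 must stay.
vector = {"A": 20, "B": 15, "C": 9}
remaining = purge_tombstones(tombs, vector)
assert [t["id"] for t in remaining] == ["L2"]
```

Note the trade-off the text describes: a single lagging (or permanently offline) replica pins every tombstone newer than its last sync.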
The Challenge: Evo's RGA (Replicated Growable Array) structure requires traversing a linked list or tree for every operation, which gets slower as history grows.
The Solution: Columnar Compression & RLE. Newer libraries like Loro (written in Rust) and Automerge 2.0 use techniques from high-performance databases:
- Columnar Storage: Instead of storing objects like `{id: A, value: "a"}, {id: B, value: "b"}`, they store arrays: `ids: [A, B]`, `values: ["a", "b"]`. This allows for massive compression.
- Run-Length Encoding (RLE): If you type "hello", instead of 5 separate operations, the system compresses them into a single "insert 'hello' at position X" block.
- Result: Loro can load documents with millions of operations in milliseconds, whereas naive CRDTs (like Evo's) would hang for seconds or minutes.
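The columnar-plus-RLE idea can be sketched in a few lines (a simplification of what Loro and Automerge 2.0 actually do):

```python
# Row-oriented: one record per character insert, each repeating its keys.
rows = [{"id": ("A", i), "value": ch} for i, ch in enumerate("hello")]

# Columnar + run-length encoding: the five inserts share an origin and
# have consecutive counters, so the whole run collapses to one record.
columnar = {
    "origin": "A",        # shared by every op in the run
    "start_counter": 0,   # first id in the run
    "length": 5,          # RLE: five consecutive ops
    "values": "hello",    # contiguous payload as one string
}

def expand(col):
    """Recover the row-oriented form from the compressed columnar run."""
    return [{"id": (col["origin"], col["start_counter"] + i), "value": ch}
            for i, ch in enumerate(col["values"])]

assert expand(columnar) == rows  # lossless round trip
```

The compression works precisely because typing produces long runs of ops with near-identical metadata; only the payload differs.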
Solving the "Hidden Conflict" Problem
The Challenge: Evo merges line-by-line. If you change a function name and I change a call to that function, Evo merges both without error, but the code won't compile.
The Solution: Semantic Conflict Detection. Instead of treating code as "lines of text" (like Evo/Git), the VCS must treat code as an Abstract Syntax Tree (AST).
- Tree CRDTs: Systems like Pijul don't just merge text; they track "patches" and dependencies. If a patch depends on a specific context that has changed, Pijul flags a conflict instead of silently merging it.
- Syntactic Awareness: A "solved" Evo would need to parse the language (Go, JS, Rust) and understand that `func foo()` is a definition, preventing merges that break the syntax tree.
The Challenge: Evo stores raw CRDT logs, which are 10-100x larger than the actual source code.
The Solution: "Squashing" or Snapshotting. You can adopt a hybrid model (similar to how Git uses packfiles):
- Hot State: Keep the recent changes (last 2 weeks) in the full CRDT format to allow easy merging and time-travel.
- Cold State: "Squash" older history into immutable snapshots (like Git commits).
- The Trade-off: You lose the ability to easily "un-merge" or mathematically reorder commits from 5 years ago, but you gain massive storage efficiency. This is a compromise most production systems accept.
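A toy version of the hot/cold split, assuming timestamp-ordered line operations (illustrative, not evo's format): everything before the cutoff is squashed into a plain snapshot, tombstones included, while recent ops keep their CRDT form:

```python
def squash(ops, cutoff_ts):
    """Split a timestamp-ordered op log into (cold snapshot, hot tail)."""
    doc = {}
    hot = []
    for op in ops:
        if op["ts"] < cutoff_ts:          # cold: apply and discard
            if op["kind"] == "insert":
                doc[op["line_id"]] = op["content"]
            else:                         # delete drops line AND tombstone
                doc.pop(op["line_id"], None)
        else:                             # hot: keep full CRDT form
            hot.append(op)
    return doc, hot

ops = [
    {"ts": 1, "kind": "insert", "line_id": "a", "content": "x = 1"},
    {"ts": 2, "kind": "delete", "line_id": "a"},
    {"ts": 3, "kind": "insert", "line_id": "b", "content": "y = 2"},
    {"ts": 9, "kind": "insert", "line_id": "c", "content": "z = 3"},
]
snapshot, hot = squash(ops, cutoff_ts=5)
assert snapshot == {"b": "y = 2"}   # the deleted line is truly gone
assert len(hot) == 1                # only recent ops stay mergeable
```

This mirrors the trade-off above: the squashed region can no longer be re-merged or reordered, but its tombstones and metadata are genuinely reclaimed.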
If someone were to reboot Evo today to make it viable, it would need to look like this:
| Feature | Evo (Current) | "Viable Evo" (Hypothetical) |
|---|---|---|
| Data Structure | Naive RGA (Line-based) | Columnar Loro/Automerge v2 |
| Storage | Infinite Log Growth | GC + Snapshotting |
| Conflict Model | "No conflicts" (dangerous) | Explicit Conflict States (Pijul-style) |
| Network | Raw Log Transfer | Bitmap Indexing (Git-style) |
Can it work? Yes. Pijul is the closest real-world attempt at this. It is mathematically sound, fast, and solves the merge conflict problem properly. The fact that Pijul exists proves the concept is valid, even if the Evo implementation failed.
Footnotes
1. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
2. https://www.bartoszsypytkowski.com/crdt-authentication/
3. https://www.geeksforgeeks.org/git/git-vs-other-version-control-systems-why-git-stands-out/
4. https://www.designveloper.com/blog/git-concepts-architecture/
5. https://initialcommit.com/blog/pijul-version-control-system
6. https://www.codecentric.de/en/knowledge-hub/blog/concurrency-and-automatic-conflict-resolution
7. https://highscalability.com/paper-crdts-consistency-without-concurrency-control/
8. https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
9. https://www.reddit.com/r/SoftwareEngineering/comments/18naw7m/making_crdts_98_more_efficient/
10. https://www.reddit.com/r/rust/comments/1msk8n1/need_help_exploring_crdts_and_implementing_them/
11. https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/
12. https://www.reddit.com/r/rust/comments/kh706i/commutation_and_scalability_in_pijul_the_new/
13. https://www.reddit.com/r/datascience/comments/1gjh3w6/how_do_you_handle_version_control_at_a_high_level/
14. https://www.reddit.com/r/rust/comments/1gb3pdp/announcing_loro_10_a_highperformance_crdts/
15. https://discourse.pijul.org/t/patch-based-version-control/513
16. https://rhodecode.com/blog/156/version-control-systems-popularity-in-2025
17. https://munin.uit.no/handle/10037/34244?locale-attribute=en
18. https://stackoverflow.blog/2023/01/09/beyond-git-the-other-version-control-systems-developers-use/
19. https://forum.solidproject.org/t/request-for-comments-crdtish-approach-to-solid/4211
20. https://en.wikipedia.org/wiki/Comparison_of_version-control_software
21. https://docs.gitlab.com/administration/reference_architectures/
22. https://www.bartoszsypytkowski.com/operation-based-crdts-arrays-2/
23. https://stackoverflow.com/questions/6804443/how-much-overhead-time-and-learning-time-to-use-version-control
24. https://mattweidner.com/2025/05/21/text-without-crdts.html
25. https://stackoverflow.com/questions/4994772/ways-to-improve-git-status-performance
26. https://stackoverflow.com/questions/53856808/when-using-2p-crdt-data-structures-for-example-2p-set-how-do-you-free-up-spac
27. https://www.harness.io/harness-devops-academy/optimizing-git-performance-at-scale
28. https://graphite.com/guides/git-monorepo-best-practices-for-scalability
29. https://www.specbee.com/blogs/git-best-practices-how-make-most-git
30. https://www.reddit.com/r/programming/comments/ov6ov3/5000x_faster_crdts_an_adventure_in_optimization/
31. https://garagehq.deuxfleurs.fr/documentation/design/internals/
32. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
33. https://www.reddit.com/r/rust/comments/kr4smo/reflecting_on_two_months_of_pijul_10alpha_pijul/
34. https://initialcommit.com/blog/pijul-version-control-system
35. https://www.reddit.com/r/programming/comments/1dtdx6v/pijul_is_a_distributed_version_control_system/
36. https://stackoverflow.com/questions/52131250/how-does-git-perform-on-compressed-files
37. https://www.linkedin.com/pulse/gits-delta-compression-algorithm-technical-deep-dive-maheshwari
38. https://inkandswitch.github.io/automerge-rs/post/towards-production/
39. https://pages.lip6.fr/Marc.Shapiro/papers/icdcs09-treedoc.pdf
40. http://archagon.net/blog/2018/03/24/data-laced-with-history/
The Evo project has been discontinued due to fundamental scalability and technical challenges inherent to CRDT-based version control, combined with community issues. While the vision was ambitious, several critical limitations prevent it from competing with Git at scale.
CRDTs suffer from continuous memory growth because they must retain all historical metadata to ensure convergence. In Evo's line-based RGA (Replicated Growable Array) implementation, each character or line requires a unique identifier and tombstone markers for deletions. This metadata accumulates indefinitely—even for deleted content—causing memory usage to balloon proportionally with edit history. Git avoids this by using delta compression and packfiles that efficiently store only meaningful diffs.[1][2]
CRDT operations have inherent performance penalties that worsen with repository size:[2]
- Upstream complexity (local edits): Linear O(N) time, requiring identifier generation and retrieval for every modification
- Position calculation: Must traverse tombstones to determine correct insertion points, degrading responsiveness below the critical 50ms threshold for interactive editing[3]
- Merge operations: While CRDTs eliminate merge conflicts conceptually, the computational cost of converging large documents with extensive edit histories becomes prohibitive
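The traversal cost is easy to see in a sketch: mapping a user-visible position onto the internal sequence means walking past every tombstone that precedes it, so the work scales with total history rather than with visible document size (illustrative code, not Evo's):

```python
def internal_index(seq, visible_index):
    """Translate a user-visible index into an index in the internal
    sequence, skipping tombstones along the way (O(N) in history)."""
    seen = -1
    for i, item in enumerate(seq):
        if not item["deleted"]:
            seen += 1
            if seen == visible_index:
                return i
    raise IndexError(visible_index)

seq = [
    {"ch": "a", "deleted": False},
    {"ch": "b", "deleted": True},   # tombstone
    {"ch": "c", "deleted": True},   # tombstone
    {"ch": "d", "deleted": False},
]
# The user sees "ad"; visible index 1 maps to internal index 3.
assert internal_index(seq, 1) == 3
```

A document whose history is mostly deletions pays this tombstone tax on every single edit, which is why heavily edited files degrade even when their visible size stays small.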
Git's snapshot-based model with three-way merges scales far better for large codebases, as evidenced by projects like the Linux kernel.
CRDTs cannot enforce application-level constraints because they use hardcoded merge rules. For version control, this means Evo cannot prevent semantically invalid merges—like concurrent edits that break code compilation or violate invariants. Git's explicit conflict resolution allows developers to apply domain knowledge, whereas CRDT automatic merging may produce "converged" but broken code.[4][5]
The accumulation of CRDT metadata creates cascading scalability problems:[1]
- Limited support for large datasets or growing user bases
- Increased bandwidth consumption for synchronization
- Storage inefficiency compared to Git's object database
- No equivalent to Git's shallow clones or partial checkouts
While Evo attempted large file support through stubs, the core CRDT structure still faces these fundamental constraints that Git's architecture inherently avoids.
The project's discontinuation reflects the practical reality that CRDT-based version control, while theoretically elegant, cannot match Git's proven scalability for real-world software development.
Yes, many of these challenges can be partially overcome, though not completely eliminated. Active research has produced several promising mitigation strategies, though fundamental tradeoffs remain.
The memory growth problem has practical solutions. Researchers have demonstrated that safe garbage collection is possible when all replicas coordinate to prune obsolete metadata. Tombstone pruning algorithms can remove deleted content markers once all replicas acknowledge the deletion, reducing memory overhead by 40-60% in some implementations. However, this requires tracking replica states and introduces partial coordination—undermining CRDT's coordination-free advantage.[22][23][24]
A promising approach involves offloading historical data to disk storage rather than keeping everything in memory. By applying partial persistence techniques, CRDTs can maintain recent operations in fast memory while archiving older metadata to disk. This addresses scalability constraints for large datasets while preserving convergence guarantees. The tradeoff is increased complexity and slower access to historical states.[25]
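A toy sketch of the partial-persistence idea (the class and file layout are invented for illustration): recent operations stay in memory, older ones are appended to a disk archive, and full-history reads take the slow path through the archive:

```python
import json
import os
import tempfile

class OpLog:
    """Keep the newest ops in memory; spill older ones to a disk archive."""
    def __init__(self, archive_path, hot_limit=2):
        self.archive_path = archive_path
        self.hot_limit = hot_limit
        self.hot = []                  # recent ops, held in memory

    def append(self, op):
        self.hot.append(op)
        if len(self.hot) > self.hot_limit:
            with open(self.archive_path, "a") as f:
                while len(self.hot) > self.hot_limit:
                    f.write(json.dumps(self.hot.pop(0)) + "\n")

    def full_history(self):
        """Slow path: re-read the archive, then append the hot tail."""
        cold = []
        if os.path.exists(self.archive_path):
            with open(self.archive_path) as f:
                cold = [json.loads(line) for line in f]
        return cold + self.hot

path = os.path.join(tempfile.mkdtemp(), "ops.jsonl")
oplog = OpLog(path, hot_limit=2)
for ts in range(5):
    oplog.append({"ts": ts})
assert len(oplog.hot) == 2                                  # memory bounded
assert [op["ts"] for op in oplog.full_history()] == [0, 1, 2, 3, 4]
```

Memory stays bounded by the hot window, but note that nothing is deleted: the archive itself still grows without limit, which is exactly the tradeoff the paragraph describes.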
CRDTs can achieve significant memory savings through compressed representations of operations and metadata. Research shows that specialized encoding formats can reduce the space overhead substantially, though the optimal compression strategy remains an open problem.[26][27]
Effective garbage collection requires knowing when all replicas have processed certain operations—which necessitates coordination. This creates a paradox: solving CRDT's scalability problems reintroduces the very coordination overhead CRDTs aim to eliminate. Dynamic environments where nodes join and leave unpredictably make this especially challenging.[28][23][24]
The problem of enforcing semantic correctness remains fundamentally unsolvable with pure CRDTs. CRDTs guarantee convergence, not correctness. For version control, automatic merging cannot prevent code that compiles individually from becoming broken when combined. This isn't a performance issue—it's an inherent limitation of conflict-free approaches.[29]
While optimizations improve CRDT performance, tombstone traversal and metadata management still impose computational costs that grow with edit history. Real-time collaborative editing benchmarks show performance degradation over time, even with garbage collection.[30][22]
A hybrid approach would likely work better than pure CRDT-based version control: using CRDTs for specific workflows (like concurrent documentation editing) while preserving Git's explicit conflict resolution for code where semantic correctness matters. The ConcoRDanT research project acknowledges that CRDTs are not a universal solution and work best when combined appropriately with other techniques.[29]
The challenges are surmountable for specific use cases, but replacing Git entirely remains impractical given the fundamental tradeoffs between coordination-free operation and semantic guarantees.
Footnotes
1. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
2. https://pages.lip6.fr/Marc.Shapiro/papers/rgasplit-group2016-11.pdf
3. https://members.loria.fr/CIgnat/files/pdf/AhmedNacerDocEng11.pdf
4. https://www.sciencedirect.com/science/article/pii/S0167739X22004186
5. https://repositorio.inesctec.pt/bitstreams/c7fa58dc-20bd-4061-9437-e718b7686195/download
6. https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
7. https://ably.com/blog/crdts-distributed-data-consistency-challenges
8. https://highscalability.com/paper-crdts-consistency-without-concurrency-control/
9. https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/
10. http://www.diva-portal.org/smash/get/diva2:1441174/FULLTEXT01.pdf
11. http://soft.vub.ac.be/Publications/2019/vub-soft-poster-19-02.pdf
12. https://cris.vub.be/ws/portalfiles/portal/75725375/official_paper_final.pdf
13. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
14. https://nacrooks.github.io/bibliography/publications/2023-vldb-calm.pdf
15. https://jakelazaroff.com/words/an-interactive-intro-to-crdts/
16. https://members.loria.fr/CIgnat/files/pdf/AhmedNacerDocEng11.pdf
17. https://www.linkedin.com/pulse/understanding-garbage-collection-mark-sweep-algorithm-elham-moharrami-exhkf
18. https://christophermeiklejohn.com/erlang/lasp/2019/03/08/monotonicity.html
19. https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/
20. https://repositum.tuwien.at/bitstream/20.500.12708/198345/1/Goronjic Valentin - 2024 - CRDT-based serverless middleware for stateful objects...pdf
21. https://stackoverflow.com/questions/53856808/when-using-2p-crdt-data-structures-for-example-2p-set-how-do-you-free-up-spac
22. https://seed.hyper.media/blog/seed-hypermedia-bringing-scalable-collaboration-to-the-decentralized-web
23. https://www.synthesia.io/post/scalable-architecture-collaborative-video-editor
The Evo project has been discontinued by its developer due to community harassment, but examining its technical architecture reveals fundamental scalability challenges that would have emerged regardless.
The most severe limitation of Evo's CRDT-based approach is unbounded metadata accumulation. RGA (Replicated Growable Array) CRDTs must store metadata for every operation ever performed—each line insertion, deletion, and update maintains tombstones and positional information that never gets cleaned up. In practice, this means:[1][2]
- A 1MB text file edited 1000 times could require 5-16MB of metadata storage
- Memory consumption grows linearly with edit history, not just current file size
- Unlike Git's packfiles which compress delta chains efficiently, CRDT operation logs accumulate indefinitely
RGA CRDTs exhibit non-uniform performance characteristics that worsen as repositories grow:[3][4]
- Indexing operations require O(n) linear searches through operation histories
- Each merge operation must replay all CRDT operations from scratch
- The slowest operations in benchmarks can stall for 1.8+ seconds on moderately-sized documents
- Git's packfile compression and delta storage vastly outperform CRDT metadata for large histories
While Git repositories typically grow sub-linearly due to delta compression, Evo faces the opposite problem:[5]
- Every concurrent edit creates permanent operation records
- Binary format for `.evo/ops/<stream>/<fileID>.bin` stores full operational metadata
- Large file handling with stubs doesn't solve the metadata bloat problem
- A repository with millions of historical operations becomes unwieldy
Git stores snapshots with delta compression applied at the storage layer, not the data structure layer. This means:[5]
- Git can discard intermediate states and reconstruct files efficiently
- Packfiles compress similar objects together regardless of operational history
- Repository size grows with unique content, not edit count
Git benefits from decades of optimization for large-scale projects:
- Shallow clones, sparse checkouts, and partial clones reduce data transfer
- Pack files are highly optimized with sophisticated compression algorithms
- The Linux kernel (well over a million commits, 30+ years of history) demonstrates Git's scalability ceiling
The "zero merge conflicts" promise of CRDTs comes with unavoidable costs:[6]
- CAP theorem constraints: CRDTs choose availability over strong consistency, making them unsuitable for scenarios requiring precise control
- Semantic conflicts remain: While syntactic conflicts are eliminated, logical conflicts (like incompatible refactorings) still require human resolution
- Garbage collection complexity: Safely pruning historical CRDT operations in distributed systems is an unsolved research problem
Evo's CRDT-based approach fundamentally trades Git's efficient snapshot-and-compress model for automatic convergence properties. This results in unbounded memory growth, performance degradation over time, and repository bloat that make it unsuitable for large-scale or long-lived projects. While CRDTs excel at real-time collaborative editing (like Google Docs), version control demands different trade-offs where Git's battle-tested architecture remains superior.
Some of these challenges can be partially mitigated, but fundamental trade-offs remain unavoidable. Here's what's possible and what isn't:
The tombstone accumulation problem has proven solutions with specific constraints:[22][23]
- Coordinated GC: Systems like Garage successfully delete tombstones when all nodes confirm they've seen the deletion, preventing "zombie" values from reappearing
- Lossless compaction: Operations that no longer affect the final state can be dropped while preserving convergence properties[24]
- Time-based pruning: After a guaranteed synchronization window, old tombstones can be safely removed
However, these approaches require either centralized coordination (defeating CRDT's decentralized promise) or strict time guarantees that don't exist in truly offline-first scenarios.
The most promising direction involves combining CRDTs with snapshots:[25][26]
- Periodically create Git-style snapshots of converged state
- New replicas clone from snapshots rather than replaying full operation history
- Only recent operations need CRDT metadata
- Once all replicas sync past a snapshot, earlier operations can be discarded
This approach essentially makes CRDTs an optimization layer on top of traditional version control, gaining conflict-free merging for recent work while maintaining Git's scalability.
Modern CRDT implementations achieve significant speedups:[27]
- Specialized data structures can make sequence CRDTs roughly 5000x faster than naive implementations
- Run-length encoding compresses consecutive operations
- Efficient indexing reduces O(n) searches
These optimizations make CRDTs practical for documents with thousands of edits, but they don't eliminate the fundamental metadata overhead.
CRDTs face an inescapable mathematical constraint: to guarantee convergence without coordination, they must retain enough information to correctly order any future operation. This creates fundamental limits:[28][29]
- True decentralization requires permanent metadata retention
- Any pruning strategy either requires coordination or risks divergence
- A CAP-style trade-off applies: a system cannot offer partition tolerance, availability, and globally coordinated garbage collection simultaneously
"Zero merge conflicts" is misleading—CRDTs only eliminate syntactic conflicts:[30]
- If two developers refactor the same function differently, CRDT will merge both changes, producing broken code
- Logical inconsistencies (incompatible API changes, race conditions) require human resolution
- Git's merge conflicts often surface semantic issues that CRDT silently combines into incorrect code
Unlike Git where repository size scales with unique content, CRDT-based systems face history-dependent growth:
- A file rewritten 1000 times carries metadata for all 1000 versions
- Git's packfiles compress similar files regardless of lineage
- Compaction techniques help but cannot match snapshot-based compression ratios
The most viable approach combines both paradigms:[24][25]
For Active Development:
- Use CRDTs for conflict-free merging during active collaboration
- Provide automatic convergence for concurrent edits
- Maintain full operation history for recent work
For Long-term Storage:
- Periodically snapshot converged state using Git-style compression
- Prune old CRDT operations once all replicas sync
- Keep traditional branching/merging for released versions
This hybrid approach acknowledges that CRDTs and Git solve different problems: CRDTs excel at real-time convergence while Git excels at efficient long-term storage. Trying to replace Git entirely with CRDTs fights against fundamental computer science trade-offs, but using CRDTs as a collaboration layer on top of Git-style storage could offer the best of both worlds.
Footnotes
1. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
2. https://blog.kevinjahns.de/are-crdts-suitable-for-shared-editing
3. https://www.sciencedirect.com/science/article/pii/S0167739X22004186
4. https://repositorio.inesctec.pt/bitstreams/c7fa58dc-20bd-4061-9437-e718b7686195/download
5. https://www.reddit.com/r/AskProgramming/comments/ndtjge/is_using_a_version_control_system_other_than_git/
6. https://dev.to/foxgem/crdts-achieving-eventual-consistency-in-distributed-systems-296g
7. https://highscalability.com/paper-crdts-consistency-without-concurrency-control/
8. https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/
9. https://garagehq.deuxfleurs.fr/documentation/design/internals/
10. http://archagon.net/blog/2018/03/24/data-laced-with-history/
11. https://mwhittaker.github.io/papers/html/maddox2016decibel.html
12. https://github.com/ipfs-inactive/dynamic-data-and-capabilities/issues/2
13. https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/
14. https://decomposition.al/CMPS290S-2018-09/2018/11/12/implementing-a-garbage-collected-graph-crdt-part-1-of-2.html
15. https://blog.helsing.ai/dson-a-delta-state-crdt-for-resilient-peer-to-peer-communication-7823349a042c
16. https://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-2.pdf
17. https://docs.riak.com/riak/cs/2.1.1/cookbooks/garbage-collection/index.html
18. https://members.loria.fr/CIgnat/files/pdf/AhmedNacerDocEng11.pdf
19. https://p2panda.org/2025/08/27/notes-convergent-access-control-crdt.html
20. https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2&isAllowed=y
Evo is a discontinued project that attempted to use CRDT (Conflict-Free Replicated Data Type) technology for version control, but it faces fundamental scalability and performance problems that make it unviable compared to Git.
Memory Overhead Growth
CRDTs suffer from inherent memory inflation because they must retain metadata for all operations to ensure convergence. Evo's RGA (Replicated Growable Array) CRDT assigns each line a unique identifier with a Lamport timestamp and node ID. As repositories grow, this metadata accumulates indefinitely—every line ever inserted, even if later deleted, leaves permanent tombstones in the operation log. Git avoids this by storing snapshots rather than preserving full operation history.[1][2][3]
Performance Degradation
RGA-based CRDTs have linear upstream complexity, meaning edit operations slow down proportionally with document size. While Evo claims to use a binary format for speed, research shows RGA implementations perform poorly under certain edit patterns—particularly when content is prepended rather than appended. In large repositories with hundreds of thousands of operations, individual edits can stall for multiple seconds.[4][5][6]
Storage Inefficiency
Evo stores operation logs per file (`.evo/ops/<stream>/<fileID>.bin`) rather than Git's efficient pack files. Each line operation requires storing identifiers, timestamps, node IDs, and content. Research indicates CRDT-based collaborative editing systems create 16+ bytes of metadata per content string, causing storage to grow significantly faster than Git's compressed deltas.[5][4]
Repository Size
Git repositories with millions of commits and files remain performant because git uses content-addressable storage with delta compression and packed references. Evo's CRDT logs grow unbounded with every edit to every file across all streams, making Linux-kernel-sized repositories (10+ million commits) effectively impossible.23
Merge Performance
While Evo promises "zero merge conflicts," this comes at severe cost. Merging in Evo requires replaying all CRDT operations from the source stream into the target, with O(N) complexity where N is the total number of operations. Git's three-way merge algorithm operates on snapshots, making merge cost independent of total repository history size.4
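The replay-based merge can be sketched as a deduplicating scan over the source stream's entire operation log. Everything here (the op shape, the `(node, lamport)` identity key) is an assumption for illustration; the point is that cost is proportional to total operations, not to the size of the divergence.

```python
def merge_by_replay(target_ops: list[dict], source_ops: list[dict]) -> int:
    """Replay-style merge sketch: walk every source op and apply the
    ones the target hasn't seen. Cost is O(N) in total operations."""
    seen = {(op["node"], op["lamport"]) for op in target_ops}
    replayed = 0
    for op in source_ops:  # must scan the ENTIRE source log
        key = (op["node"], op["lamport"])
        if key not in seen:
            target_ops.append(op)
            seen.add(key)
            replayed += 1
    return replayed

target = [{"node": "a", "lamport": i} for i in range(3)]
source = [{"node": "a", "lamport": i} for i in range(5)]  # shares first 3
print(merge_by_replay(target, source))  # 2 new ops replayed
print(len(target))                      # 5
```

Git's three-way merge instead diffs three snapshots (base and both tips), so a one-line change merges in time proportional to the files touched, regardless of how many commits the repository contains.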
Garbage Collection Challenges
CRDTs fundamentally cannot discard historical operation metadata without breaking convergence guarantees, whereas git regularly packs, prunes, and optimizes its history. Evo's promise of "renames made simple" via stable UUIDs means file IDs persist forever, even for files deleted years ago.12
The project was discontinued not only due to community issues, but because the CRDT approach creates insurmountable scalability problems for version control at git's scale.37891011
Some challenges can be partially addressed, but fundamental tradeoffs remain that make pure CRDT approaches difficult for version control at git's scale.
The memory growth problem from tombstones can be mitigated through stability-based garbage collection. Once all replicas have received an operation, tombstones become unnecessary and can be pruned using vector clocks to track synchronization. Projects like Yorkie have implemented this successfully for collaborative editing. However, this requires knowing when operations are "stable" across all replicas—problematic for offline-first version control where contributors may remain disconnected for months.121314
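The stability check behind this kind of garbage collection can be sketched with per-replica vector clocks: a tombstone is prunable only once every known replica has acknowledged the deleting operation. This is a simplified model in the spirit of Yorkie's design, with hypothetical replica names; the offline-contributor problem is visible directly in the code.

```python
def prunable(op_stamp: int, op_node: str,
             acked: dict[str, dict[str, int]]) -> bool:
    """A tombstone is stable (safe to prune) once EVERY replica's
    acknowledged vector clock covers the operation that created it."""
    return all(clock.get(op_node, 0) >= op_stamp
               for clock in acked.values())

acked = {
    "alice": {"alice": 10, "bob": 7},
    "bob":   {"alice": 9,  "bob": 7},
    "carol": {"alice": 4,  "bob": 2},  # offline for months
}
print(prunable(3, "alice", acked))  # True: everyone has seen alice's op 3
print(prunable(8, "alice", acked))  # False: carol is still behind
```

One stale replica blocks pruning of every operation it has not acknowledged, which is why indefinite offline work and tombstone garbage collection are in direct tension.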
Memory overhead can be reduced by offloading historical operations to disk storage rather than keeping everything in RAM. This "partial persistence" approach maintains recent operations in memory while archiving older ones, similar to how databases handle transaction logs. This addresses memory exhaustion but doesn't solve the underlying storage growth problem.15
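A minimal sketch of this partial-persistence idea: keep a bounded window of recent operations in memory and spill older ones to an append-only file. File names and the window size are illustrative assumptions.

```python
import json, os, tempfile

class OpLog:
    """Partial-persistence sketch: recent ops stay in RAM, older ops
    are archived to an append-only file so memory stays bounded."""
    def __init__(self, path: str, keep_in_ram: int = 100):
        self.path = path
        self.keep = keep_in_ram
        self.recent: list[dict] = []

    def append(self, op: dict):
        self.recent.append(op)
        if len(self.recent) > self.keep:
            # Archive the overflow to disk; RAM stays at `keep` entries
            overflow = self.recent[:-self.keep]
            with open(self.path, "a") as f:
                for old in overflow:
                    f.write(json.dumps(old) + "\n")
            self.recent = self.recent[-self.keep:]

log_path = os.path.join(tempfile.mkdtemp(), "ops.log")
log = OpLog(log_path, keep_in_ram=10)
for i in range(25):
    log.append({"lamport": i})
print(len(log.recent))                 # 10 ops in RAM
print(sum(1 for _ in open(log_path)))  # 15 ops archived on disk
```

Note that total storage still grows with every operation ever made; the spill only caps resident memory, which is exactly the limitation the paragraph above points out.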
Performance can be improved through delta-state CRDTs that transmit only changes rather than full state, and by using efficient encoding like bitmaps for tombstone sets. The Collabs framework demonstrates that optimized CRDT implementations can scale to 100+ simultaneous collaborative editors.131617
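The delta-state idea can be shown with a grow-only counter: increments are buffered as a small delta, and only that delta crosses the wire instead of the full state map. This is a generic textbook sketch, not the Collabs API.

```python
class DeltaGCounter:
    """Delta-state grow-only counter sketch: ship only what changed."""
    def __init__(self, node: str):
        self.node = node
        self.state: dict[str, int] = {}  # per-node counts (full state)
        self.delta: dict[str, int] = {}  # entries changed since last sync

    def incr(self, n: int = 1):
        self.state[self.node] = self.state.get(self.node, 0) + n
        self.delta[self.node] = self.state[self.node]

    def take_delta(self) -> dict[str, int]:
        d, self.delta = self.delta, {}
        return d

    def merge(self, delta: dict[str, int]):
        # Per-entry max is idempotent, so re-delivered deltas are harmless
        for node, count in delta.items():
            self.state[node] = max(self.state.get(node, 0), count)

    def value(self) -> int:
        return sum(self.state.values())

a, b = DeltaGCounter("a"), DeltaGCounter("b")
a.incr(3); b.incr(2)
b.merge(a.take_delta())  # only {"a": 3} crosses the wire
print(b.value())         # 5
```

The bandwidth win is real, but note the full state map still lives on every replica, so delta shipping does not address the storage-growth problem either.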
CRDTs mathematically guarantee convergence by preserving all operation metadata. Aggressive garbage collection breaks this guarantee—you cannot safely discard tombstones without knowing all replicas have synchronized. Version control systems must support indefinite offline work, making stability detection impossible. This creates an unsolvable tension between scalability and CRDT guarantees.1418
While optimizations help, they don't change the fundamental complexity class. Git repositories with 10+ million commits succeed because git stores snapshots and computes diffs on-demand. CRDTs must maintain operation logs proportional to every edit ever made. No amount of optimization changes this asymptotic difference for massive, long-lived repositories.15
The most promising path forward involves hybrid systems that combine CRDT benefits with traditional version control. GenericVC demonstrates unifying MVCC concurrency control with Git-style versioning, getting benefits of both. A practical approach might use CRDTs for real-time collaborative editing sessions while falling back to snapshot-based storage for long-term history—essentially treating CRDTs as a short-term conflict resolution layer rather than the fundamental storage model.191820
The core insight: CRDTs excel at resolving merge conflicts during active collaboration, but version control requires archival storage spanning decades. These goals conflict fundamentally.181321
Footnotes
-
https://highscalability.com/paper-crdts-consistency-without-concurrency-control/ ↩ ↩2
-
https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2\&isAllowed=y ↩ ↩2 ↩3
-
https://pages.lip6.fr/Marc.Shapiro/papers/rgasplit-group2016-11.pdf ↩ ↩2
-
https://blog.kevinjahns.de/are-crdts-suitable-for-shared-editing ↩
-
https://www.dpss.inesc-id.pt/~ler/reports/hugoguerreiro-midterm.pdf ↩
-
https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/ ↩
-
https://github.com/yorkie-team/yorkie/blob/main/design/garbage-collection.md ↩
-
https://decomposition.al/CMPS290S-2018-09/2018/11/12/implementing-a-garbage-collected-graph-crdt-part-1-of-2.html ↩ ↩2 ↩3
-
https://archive.casouri.cc/note/2025/practical-intro-ot/ ↩ ↩2
-
https://munin.uit.no/bitstream/handle/10037/33830/thesis.pdf?sequence=2\&isAllowed=y ↩ ↩2
-
https://dev.to/foxgem/crdts-achieving-eventual-consistency-in-distributed-systems-296g ↩
-
https://dev.to/nyxtom/introduction-to-crdts-for-realtime-collaboration-2eb1 ↩ ↩2 ↩3
-
https://gitprotect.io/blog/git-and-why-version-control-systems-are-now-more-critical-than-ever/ ↩
-
https://iankduncan.com/engineering/2025-11-27-crdt-dictionary/ ↩

Automatic Conflict Resolution vs. Manual Merges
The most fundamental advantage CRDTs have over Git is automatic, deterministic conflict resolution. While Git requires manual intervention when concurrent edits create merge conflicts, CRDTs are mathematically designed to resolve conflicts automatically without human input. This is achieved through commutative operations that produce the same result regardless of the order they're applied.[1][2][3]
In Git, when two developers edit the same lines of code in different branches, Git pauses the merge and asks humans to resolve the conflict. CRDTs eliminate this entirely—all concurrent operations automatically converge to a consistent state across all replicas. This makes CRDTs "conflict-free" by design, whereas Git is fundamentally not a CRDT because its merge operations are order-dependent and require manual resolution.[2][3][4][1]
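Order-independent resolution can be demonstrated with a last-writer-wins register, one of the simplest CRDTs: the merge is a deterministic maximum over `(timestamp, node_id)` pairs, so it is commutative by construction. The tuple encoding here is an illustrative sketch, not any particular library's API.

```python
def lww_merge(a: tuple, b: tuple) -> tuple:
    """Last-writer-wins merge: deterministic tie-break on
    (lamport, node_id), commutative by construction."""
    return max(a, b)  # tuples compare lexicographically

# (lamport_timestamp, node_id, value) -- concurrent edits, same timestamp
x = (5, "alice", "let x = 1")
y = (5, "bob",   "let x = 2")

print(lww_merge(x, y) == lww_merge(y, x))  # True: merge order is irrelevant
print(lww_merge(x, y)[2])                  # every replica converges on this
```

No human ever resolves this conflict; the trade-off, discussed later in this piece, is that the deterministic winner may not be the semantically correct one.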
Peer-to-Peer Synchronization Without Central Coordination
CRDTs excel at decentralized, peer-to-peer collaboration where Git typically requires a central repository. While Git can technically operate in a peer-to-peer manner, the practical reality is that teams use centralized workflows with a canonical "main" repository to coordinate work.[5][6][7][8][9][10]
CRDTs remove the need for a central coordinator entirely. Each replica can accept updates independently and asynchronously, then sync with other replicas in any topology—peer-to-peer, hub-and-spoke, or mesh networks. This architectural flexibility makes CRDTs ideal for applications where a central server is impractical due to scale, network partitions, or the need for true decentralization.[3][9][2]
Offline-First Applications with Guaranteed Convergence
CRDTs provide superior offline-first support with strong eventual consistency guarantees. When users work offline, CRDTs allow local updates that automatically merge when connectivity returns, without conflicts. This property is baked into the data structure itself.[6][11][12][2][3]
Git does support offline work, but merging offline changes often requires manual conflict resolution. More critically, Git's merge behavior isn't guaranteed to be consistent—two different people might resolve the same conflict in different ways, breaking the convergence guarantee that CRDTs provide. With CRDTs, all replicas are mathematically guaranteed to converge to identical states once they've received the same set of updates, regardless of order.[13][14][1][2][3]
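The convergence guarantee can be checked exhaustively for a small grow-only set: because set union is commutative and associative, every permutation of the same updates yields an identical state. This is a generic sketch of strong eventual consistency, not tied to any specific library.

```python
import itertools

def apply_all(ops: list[str]) -> set[str]:
    """Grow-only set sketch: the state after applying a batch of adds.
    Union is commutative and associative, so delivery order is irrelevant."""
    state: set[str] = set()
    for op in ops:
        state |= {op}
    return state

ops = ["add:a", "add:b", "add:c"]
# Apply the SAME updates in every possible order and collect the results
states = {frozenset(apply_all(list(p))) for p in itertools.permutations(ops)}
print(len(states))  # 1: all 6 orderings converge to the same state
```

With Git, by contrast, two people resolving the same textual conflict by hand can legitimately produce two different trees, so no such order-independence property holds.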
Real-Time Collaborative Editing
CRDTs enable real-time collaborative editing in applications like Google Docs, Figma, and Notion. Users can simultaneously edit the same document, and changes appear instantly across all participants without coordination delays.[15][16][17][6]
Git's model is fundamentally asynchronous and batch-oriented—developers work on local copies, then explicitly push and pull changes. While this works well for code development, it's unsuitable for real-time collaboration where users expect immediate feedback. CRDTs naturally support this use case through their conflict-free merging properties.[18][16][7][6][15]
Low-Latency Local Operations
Every operation in a CRDT system can be processed locally with low latency, since replicas don't need to coordinate with a central authority or wait for locks. Users can make changes immediately, and those changes propagate in the background.[19][20][3]
Git also allows local commits, but the merge step—where changes from different sources combine—may require coordination and manual resolution. With CRDTs, the merge is automatic and immediate, maintaining the low-latency experience even during synchronization.[21][1][3][19]
Better for Distributed Systems at Scale
CRDTs are designed for high availability and partition tolerance in distributed systems. They continue functioning correctly even during network partitions, ensuring that each node remains operational and accepts writes independently.[22][2][3][13]
While Git is distributed in architecture, it doesn't provide the same partition tolerance guarantees. Git assumes eventual connectivity and human intervention to resolve divergence. CRDTs, by contrast, are built from the ground up to handle network failures gracefully with automatic convergence.[23][1][3][13]
Use Cases Where CRDTs Shine
CRDTs excel in scenarios that Git wasn't designed for: real-time collaborative editing, offline-first applications, and peer-to-peer synchronization without a central server.[24][9]
The Trade-Offs
CRDTs aren't universally better than Git—they make different trade-offs. CRDTs typically require more memory and bandwidth to store metadata for conflict resolution. For text editing, CRDTs store additional information about insertion positions, version vectors, and causal relationships.[5][18][6]
Additionally, CRDTs resolve conflicts deterministically but without semantic awareness. While they guarantee convergence, the merged result may not match human intent; for example, with text edits, a CRDT might interleave words in unexpected ways. Git's manual conflict resolution allows humans to make semantic decisions about which changes to keep.[1][23]
For version control of source code with thoughtful, reviewed changes, Git's model of explicit commits and manual merge resolution remains highly appropriate. CRDTs shine when automatic merging, real-time collaboration, and offline-first operation are paramount.[11][27][23][6][9]