Following the High Availability milestones in v0.6.0, I’m excited to announce the release of ncps v0.9.0 (and the subsequent v0.9.1 stabilization release). This version introduces a fundamental shift in how ncps handles data: Content-Defined Chunking (CDC) support.
The Headliner: Content-Defined Chunking (CDC)
The biggest change in this release is the introduction of CDC for storage. Previously, ncps relied on simpler storage mechanisms that could face challenges with data consistency and latency in complex high-availability environments.
By implementing CDC, I’ve redesigned ncps to break down NAR files into unique, content-addressable chunks. This brings several key benefits:
- Real-time Streaming in HA: This is the big one. If Instance A is pulling a NAR, all other instances can now start streaming that data to clients almost immediately (within ~150ms). Without CDC, other instances would have to wait for the entire NAR to be fully downloaded and available.
- Deduplication: Identical chunks across different store paths are only stored once, significantly reducing storage overhead.
- Enhanced HA Reliability: CDC keeps multiple ncps instances consistent with each other when they read from and write to shared storage.
A “Lot of Changes” Under the Hood
While CDC is the highlight, v0.9.0 represents a massive internal overhaul. I’ve touched almost every part of the codebase to improve performance and reliability:
- NAR Streaming and Decompression: I’ve refined how NAR files are handled to prevent bottlenecks. A significant part of this involved optimizing how ncps deals with upstream NARs. For a deep dive into why I chose to decompress and the performance trade-offs involved, check out my blog post: CDC: Why Decompression is worth the cost.
- Architectural Cleanup: I significantly restructured the internal state management and locking logic to support the new storage backend.
- v0.9.1 Stability: Following the v0.9.0 release, v0.9.1 was quickly shipped to address initial edge cases and refine the CDC migration path.
Getting Started
If you are running a High Availability setup, please review the updated documentation, because enabling CDC is now required for multi-instance deployments. For single-instance users, CDC offers a more robust and storage-efficient way to manage your local cache.
- Release Notes: v0.9.0 | v0.9.1
- GitHub: kalbasit/ncps
As always, thank you to the community for the feedback and testing. Happy building!