I've been thinking up a lot of ideas lately; with the advent of AI, it's an idea frenzy. I never really thought about the fact that I gave away a Schema-less Hierarchical Data Store to the world (C5Store). It's actually an incredible piece of software that I never fully appreciated, because I used it solely as a configuration store, which it does super well. I don't have to think about what kind of configuration store to use because I have C5Store to unify it all.
So at Uber, I used to work with a piece of software called Clusto, which was used for many things, including inventory management by the data center teams. It was a great place to look up information on any host, device, server group, application group, whatever you wanted. It was good, except for that time the Python client kept making unnecessary RPC calls and putting needless load on it (I figured out this was happening and slashed query times for my use case from 50+ minutes to 10 seconds!!!). I digress. Clusto was good at what it was used for: data entry and hierarchical structure.
Well well, turns out C5Store could do just that, but an order of magnitude more efficiently. I realized this week that, holy crap, you could build a service on C5Store and have strongly typed models deserialized from the information retrieved from the hierarchical data store. Clients define their own schema and put it through a code generator to generate client-side structs and functions: basically a whole client on top of the extremely efficient hierarchical store, while the store itself stays a Schema-less Hierarchical Data Store.
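A minimal sketch of that idea, in Python. `TinyTreeStore`, `Server`, and `load` are hypothetical names made up for illustration; this is not C5Store's API, just a flat dict keyed by dotted paths standing in for the store, and a hand-written dataclass standing in for what a code generator might emit.

```python
from dataclasses import dataclass, fields


class TinyTreeStore:
    """Hypothetical stand-in for a hierarchical store (NOT C5Store's API)."""

    def __init__(self):
        self._data = {}  # dotted path -> value

    def set(self, path, value):
        self._data[path] = value

    def subtree(self, prefix):
        """Return {relative_key: value} for every entry under prefix."""
        start = prefix + "."
        return {k[len(start):]: v for k, v in self._data.items() if k.startswith(start)}


# The kind of struct a code generator could emit from a client-side schema.
@dataclass
class Server:
    hostname: str
    rack: str
    cores: int


def load(store, prefix, model):
    """Deserialize the subtree at prefix into a typed model, checking field types."""
    raw = store.subtree(prefix)
    kwargs = {}
    for f in fields(model):
        value = raw[f.name]  # KeyError if the store lacks a field the client needs
        if not isinstance(value, f.type):
            raise TypeError(f"{prefix}.{f.name}: expected {f.type.__name__}")
        kwargs[f.name] = value
    return model(**kwargs)


store = TinyTreeStore()
store.set("servers.web01.hostname", "web01.example.internal")
store.set("servers.web01.rack", "r12")
store.set("servers.web01.cores", 32)
store.set("servers.web01.notes", "a field this client never asked for")

web01 = load(store, "servers.web01", Server)
```

The store never learns what a `Server` is. The client's model decides which keys to read and what types to insist on, and anything else under the same path (like `notes`) is simply ignored.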
It opens up a range of software you can EASILY create on top of it. You can walk through the tree as if it were a graph if you want to. The possibilities are endless. Core Zanzibar functions, an order of magnitude more efficient? BOOM. Datacenter inventory tracking, server definitions, network interface definitions? Check. AI metadata operations? Check. Inventory management is used everywhere. Whatever you want to do with sub-millisecond create, read, delete, traversal, and hierarchical query operations, you can. This is insane.
You are the client; you control what you want from the database. Let code gen wire up the operations needed to give you what you want from the data store. Fuck yes, I am working on this right now, for my own use of course. I'd like to make a SaaS product out of it: ready-to-use clients for different use cases, plus the ability for users to create their own schemas and code gen structs from them. As far as I know, it's the world's first TRULY Schemaless Hierarchical Database.
Again, this came about because I was upgrading my current custom infrastructure software. I just needed a place to store metadata for servers and, hot damn, wham bam, I upgraded my metadata v1 to a metadata v2 and saw the flexibility. Crazy!
User settings will always be a common use case for a schemaless database. Personally, I think user accounts are too. It's just easier to deal with JSON, which is why JSON support has been shunted into all these databases. It's still a shit feature though.
Use Cases among Popular Companies
1. 🧠 Google’s Zanzibar (Authorization Systems)
HierarchicalDB drastically simplifies relationship graph storage, especially when embedded into a service like Zanzibar.
- Efficient descendant/ancestor queries
- Path resolution and scoped lookups
- Natural fit for representing resource or subject hierarchies
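To make the ancestor-query point concrete, here is a rough sketch of an inherited-access check over a resource hierarchy. It is not how Zanzibar itself evaluates permissions (Zanzibar works on relation tuples and rewrite rules); the `grants` table, the slash-separated paths, and the `relation` helper are all assumptions for illustration.

```python
# Hypothetical data: grants attached to nodes of a slash-separated resource tree.
grants = {
    "org/acme": {"alice": "owner"},
    "org/acme/docs/roadmap": {"bob": "viewer"},
}


def ancestors(path):
    """Yield the path itself, then each ancestor up to the top-level segment."""
    parts = path.split("/")
    for end in range(len(parts), 0, -1):
        yield "/".join(parts[:end])


def relation(user, path):
    """Return the nearest relation the user holds on path or an ancestor, else None."""
    for node in ancestors(path):
        rel = grants.get(node, {}).get(user)
        if rel is not None:
            return rel
    return None


alice_rel = relation("alice", "org/acme/docs/roadmap")  # inherited from org/acme
bob_rel = relation("bob", "org/acme/docs/roadmap")      # granted directly
bob_parent_rel = relation("bob", "org/acme/docs")       # grants do not flow upward
```

Each check costs one lookup per level of the path, which is the kind of operation a hierarchical store is built to make cheap.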
2. 🛒 Amazon’s Ion / Meta’s Ent (Metadata Systems for Commerce & Content)
Commerce platforms rely on massive object graphs to represent products, SKUs, versions, and vendor metadata.
- Hierarchical representation of category, asset, or lineage trees
- Simplifies relational modeling for fast traversal
- Useful for pricing inheritance, localization, or variant management
3. 🎮 Roblox’s Distributed Game State (Hierarchical Asset Trees)
Games and engines like Roblox store asset hierarchies including meshes, textures, scripts, and ownership trees.
- Runtime graph traversal of scenes and object states
- In-editor resolution of logical hierarchies
- Efficient copy/move and revision-aware subtree operations
4. 📦 AWS Systems Manager Parameter Store / Secrets Manager
These systems already use hierarchical key structures to manage configuration and secrets.
- Fast lookups by path or ID
- Scoped environment overrides
- Built-in TTL support for auto-expiring secrets
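Scoped overrides and expiring entries could look something like the sketch below. `ParamStore`, the `/default` fallback scope, and the explicit `now` argument are assumptions chosen to keep the example deterministic; none of this mirrors the AWS APIs or an existing C5Store feature.

```python
import time


class ParamStore:
    """Hypothetical path-keyed parameter store with optional per-entry TTL."""

    def __init__(self):
        self._items = {}  # path -> (value, expires_at or None)

    def put(self, path, value, ttl=None, now=None):
        now = time.time() if now is None else now
        self._items[path] = (value, None if ttl is None else now + ttl)

    def get(self, path, now=None):
        now = time.time() if now is None else now
        item = self._items.get(path)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and now >= expires_at:
            del self._items[path]  # lazily drop expired entries on read
            return None
        return value

    def resolve(self, env, key, now=None):
        """Prefer the environment-scoped value, then fall back to /default."""
        for path in (f"/{env}/{key}", f"/default/{key}"):
            value = self.get(path, now)
            if value is not None:
                return value
        return None


params = ParamStore()
params.put("/default/db/pool_size", 10, now=0)
params.put("/prod/db/pool_size", 50, now=0)
params.put("/prod/db/token", "s3cret", ttl=60, now=0)

prod_pool = params.resolve("prod", "db/pool_size", now=1)
staging_pool = params.resolve("staging", "db/pool_size", now=1)
```

Here `prod` gets its own override while `staging` falls through to the default, and the token stops resolving once its 60 seconds are up.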
5. 🏗️ Google’s Borg / Meta’s Twine (Cluster Management / Deployment)
Cluster and infrastructure systems are naturally hierarchical:
- Datacenter → Rack → Server → Container → Task
- Resource inventory traversal
- Structured rollouts and placement logic
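Inventory traversal over that kind of hierarchy can be sketched with a plain nested dict. The `inventory` data and the `walk` and `at_depth` helpers are invented for illustration; a real store would do the traversal on its side rather than in client code.

```python
# Hypothetical inventory: datacenter -> rack -> server -> container.
inventory = {
    "dc1": {
        "rack-a": {"web01": {"web": {}, "cache": {}}, "web02": {"web": {}}},
        "rack-b": {"db01": {"postgres": {}}},
    }
}


def walk(tree, prefix=()):
    """Depth-first traversal yielding every node as a tuple of path segments."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from walk(children, path)


def at_depth(tree, depth):
    """Return every node exactly `depth` levels down, as slash-joined paths."""
    return ["/".join(p) for p in walk(tree) if len(p) == depth]


servers = at_depth(inventory, 3)
containers = at_depth(inventory, 4)
```

Depth 3 gives the servers and depth 4 the containers, so "every server in dc1" or "every container on rack-a" becomes a prefix-plus-depth query instead of a join.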
6. 🗂️ Dropbox / Google Drive Metadata Trees
File systems are trees. HierarchicalDB supports:
- Deep/shallow hierarchy traversal
- Move/copy subtree operations
- Lookup by human-readable path or stable ID
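One way to get both lookups at once is to give each node a stable ID and derive its path from parent links, so moving a subtree is a single re-parent. The `Node` and `Tree` classes below are a hypothetical in-memory sketch, not an existing API.

```python
import itertools


class Node:
    def __init__(self, node_id, name, parent=None):
        self.id, self.name, self.parent = node_id, name, parent
        self.children = {}  # child name -> Node


class Tree:
    """Hypothetical metadata tree with stable IDs and derived paths."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.root = Node(0, "")
        self.by_id = {0: self.root}

    def add(self, parent, name):
        node = Node(next(self._ids), name, parent)
        parent.children[name] = node
        self.by_id[node.id] = node
        return node

    def find(self, path):
        """Resolve a human-readable path such as /docs/report.txt."""
        node = self.root
        for part in (p for p in path.split("/") if p):
            node = node.children[part]
        return node

    def path_of(self, node):
        parts = []
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def move(self, node, new_parent):
        """Re-parent a whole subtree; descendants are not touched."""
        cursor = new_parent
        while cursor is not None:  # refuse to move a node under itself
            if cursor is node:
                raise ValueError("cannot move a node into its own subtree")
            cursor = cursor.parent
        del node.parent.children[node.name]
        node.parent = new_parent
        new_parent.children[node.name] = node


tree = Tree()
docs = tree.add(tree.root, "docs")
archive = tree.add(tree.root, "archive")
report = tree.add(docs, "report.txt")
tree.move(docs, archive)
```

After the move, `report` keeps its ID while its path becomes `/archive/docs/report.txt`, without any of the descendants being rewritten.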
7. 🎯 Meta’s Business Graph / LinkedIn’s Org Charts
Modeling real-world orgs with departments, roles, and reporting lines:
- Traverse upward ("who’s my director?")
- Enumerate teams or divisions
- Support onboarding/offboarding flows with subtree operations
8. 🔒 AWS IAM Policy Trees / Azure Role Graphs
IAM systems are deeply hierarchical:
- Model roles, scopes, and services with clarity
- Evaluate permission inheritance through ancestry
- Optimize permission resolution through structural queries
9. 🧾 GitHub’s Internal Dependency Graphs (Code, Packages, Builds)
Build systems and dependency managers often require hierarchical modeling.
- Resolve downstream dependents efficiently
- Traverse impacted components for rebuilds
- Represent monorepo structures naturally
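Resolving downstream dependents boils down to inverting the "depends on" edges and walking outward from the changed package. The package names and the `dependents_of` helper are made up for this sketch. Note that a dependency map like this is a DAG rather than a strict tree, so the walk tracks what it has already visited.

```python
from collections import deque

# Hypothetical monorepo: package -> packages it depends on.
depends_on = {
    "app": ["ui", "api"],
    "ui": ["design-system"],
    "api": ["auth", "db-client"],
    "auth": ["db-client"],
}


def dependents_of(target, depends_on):
    """Return every package that must rebuild when target changes (breadth-first)."""
    reverse = {}  # dependency -> packages that use it directly
    for pkg, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(pkg)

    seen, queue, order = {target}, deque([target]), []
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


impacted = dependents_of("db-client", depends_on)
```

Changing `db-client` flags `api`, `auth`, and `app` for rebuild, while `ui` and `design-system` are left alone.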
10. 🧬 Knowledge Graph Indexing (Tree-Constrained)
While not a full graph engine, HierarchicalDB works well for constrained, tree-like knowledge graphs:
- Scoped concept resolution (e.g., AI > Foundation Models)
- Efficient category traversal
- Lightweight taxonomy modeling