We want a single Ceramic node to maintain state and receive updates for a large number of documents simultaneously, so that many documents can be pinned by one node without memory usage growing linearly with the document count.
While doing so we want to preserve the update semantics of doctypes: a doctype instance maintains its own copy of the document state, and if it subscribes, that state is continuously updated, whether or not the document is pinned.
In any case, we maintain an LRU cache of documents to keep memory consumption constant. This constrains us: a document in the cache is ephemeral and can be evicted at any time. The big decision is which concurrency model to adopt for document operations so that it plays nicely with cache eviction.
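To make the eviction behavior concrete, here is a minimal sketch of such a bounded LRU document cache, built on `Map`'s insertion-order iteration. The class and parameter names are illustrative, not the actual Ceramic implementation:

```typescript
// Minimal LRU cache sketch. All names are illustrative.
class DocCache<V> {
  private entries = new Map<string, V>();

  constructor(private readonly capacity: number) {}

  get(docId: string): V | undefined {
    const value = this.entries.get(docId);
    if (value !== undefined) {
      // Re-insert to mark the document as most recently used.
      this.entries.delete(docId);
      this.entries.set(docId, value);
    }
    return value;
  }

  set(docId: string, value: V): void {
    this.entries.delete(docId);
    this.entries.set(docId, value);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used document. Eviction can happen
      // at any time, so cached documents must be treated as ephemeral.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  has(docId: string): boolean {
    return this.entries.has(docId);
  }
}
```

The cache holds at most `capacity` documents; touching a document with `get` protects it from eviction for a while, which matches the "recently used documents stay hot" behavior we rely on.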
Document operations can be invoked from a few places. One is external API commands: the HTTP endpoints of a Ceramic node. Another is the JS API: calling an in-process Ceramic node. The third is IPFS pubsub. Finally, there is the anchoring mechanism, which continuously polls the anchor service. Any of these can start operations concurrently.
We want to maintain document state consistency in the presence of concurrent operations: under no circumstances may two operations change the same document at the same time.
Option 1: On cache eviction wait for the operations on a document to finish.
With the LRU semantics, this means calling an asynchronous function on an evicted document (await document.close()). Since we do not want two instances of the same document running in memory at the same time (a potential conflict), or more documents in memory than the configured cache size, cache eviction effectively blocks further documents from being processed until the evicted one has finished.
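A sketch of this option makes the blocking visible. The `Document` interface, `close` method, and cache shape below are hypothetical, chosen only to illustrate where the stall happens:

```typescript
// Sketch of option 1: eviction awaits in-flight operations.
// `Document`, `close`, and `BlockingCache` are hypothetical names.
interface Document {
  id: string;
  // Resolves only after all pending operations on the document finish.
  close(): Promise<void>;
}

class BlockingCache {
  private docs = new Map<string, Document>();

  constructor(private readonly capacity: number) {}

  async add(doc: Document): Promise<void> {
    this.docs.set(doc.id, doc);
    while (this.docs.size > this.capacity) {
      const oldestId = this.docs.keys().next().value as string;
      const oldest = this.docs.get(oldestId)!;
      this.docs.delete(oldestId);
      // The whole cache stalls here until the evicted document's
      // operations drain: no other document can be admitted meanwhile.
      await oldest.close();
    }
  }
}
```

If the evicted document has a long-running operation in flight, every subsequent `add` waits behind it, which is exactly the drawback described above.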
Option 2: Separate operation on a document from its in-memory lifecycle.
The separation is achieved by introducing an execution queue. The execution queue ensures that operations on the same docId are executed sequentially. Before an operation starts, the queue loads the document into memory. When a document is loaded into a full cache, the oldest document is evicted. This way we keep a bounded number of documents in memory without tying their lifecycle to document processing. We further limit this vector of memory growth by queuing the processing tasks.
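The core of such an execution queue can be sketched as promise chaining keyed by docId. The class and method names are illustrative, not the actual Ceramic code:

```typescript
// Sketch of option 2: a per-docId execution queue.
// Operations on the same docId run sequentially; operations on
// different docIds can run concurrently. Names are illustrative.
class ExecutionQueue {
  private tails = new Map<string, Promise<unknown>>();

  // Chain `task` after the previous task for the same docId.
  run<T>(docId: string, task: () => Promise<T>): Promise<T> {
    const tail = this.tails.get(docId) ?? Promise.resolve();
    // Run the task whether or not its predecessor failed, so one
    // failed operation does not wedge the queue for that docId.
    const next = tail.then(task, task);
    this.tails.set(docId, next);
    return next;
  }
}
```

Loading the document into memory (and triggering LRU eviction) would happen at the head of each queued task, so a document's in-memory lifetime is decoupled from how long its operations take to drain.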
We chose option 2. It limits memory growth to a constant factor and enables tighter control over the different vectors of memory growth:
- CeramicConfig.docCacheLimit: a ceiling on memory growth when processing load is low.
- CeramicConfig.concurrentRequestsLimit: a ceiling on memory growth when processing load is high.
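As a sketch, the configuration shape might look like the following. Only the two field names come from this document; the interface layout and the default values are made up for illustration:

```typescript
// Illustrative configuration shape; defaults are invented for the sketch.
interface CeramicConfig {
  // Maximum number of documents kept in the LRU cache at once:
  // bounds memory when processing load is low.
  docCacheLimit: number;
  // Maximum number of concurrently queued/processing tasks:
  // bounds memory when processing load is high.
  concurrentRequestsLimit: number;
}

const config: CeramicConfig = {
  docCacheLimit: 500,
  concurrentRequestsLimit: 100,
};
```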