Cuprate Architecture

WIP

Cuprate's architecture book.

Sections are notated with colors indicating how complete they are:

Color	Meaning
⚪️	Empty
🔴	Severely lacking information
🟠	Lacking some information
🟡	Almost ready
🟢	OK

Continue to the next chapter by clicking the right > button, or by selecting it on the left side.

All chapters are viewable by clicking the top-left ☰ button.

The entire book can searched by clicking the top-left 🔍 button.

Foreword

Monero¹ is a large software project, coming in at 329k lines of C++, C, headers, and make files.² It is directly responsible for 2.6 billion dollars worth of value.³ It has had over 400 contributors, more if counting unnamed contributions.⁴ It has over 10,000 node operators and a large active userbase.⁵

The project wasn't always this big, but somewhere in the midst of contributors coming and going, various features being added, bugs being fixed, and celebrated cryptography being implemented - there was an aspect that was lost by the project that it could not easily gain again: maintainability.

Within large and complicated software projects, there is an important transfer of knowledge that must occur for long-term survival. Much like an organism that must eventually pass the torch onto the next generation, projects must do the same for future contributors.

However, newcomers often lack experience, past contributors might not be around, and current maintainers may be too busy. For whatever reason, this transfer of knowledge is not always smooth.

There is a solution to this problem: documentation.

The activity of writing the what, where, why, and how of the solutions to technical problems can be done in an author's lonesome.

The activity of reading these ideas can be done by future readers at any time without permission.

These readers may be new prospective contributors, it may be the current maintainers, it may be researchers, it may be users of various scale. Whoever it may be, documentation acts as the link between the past and present; a bottle of wisdom thrown into the river of time for future participants to open.

This book is the manifestation of this will, for Cuprate⁶, an alternative Monero node. It documents Cuprate's implementation from head-to-toe such that in the case of a contributor's untimely disappearance, the project can continue.

People come and go, documentation is forever.

— hinto-janai

monero-project/monero

2024-05-24: $143.55 USD * 18,151,608 XMR = $2,605,663,258

⁴

git log --all --pretty="%an" | sort -u | wc -l on cc73fe7

⁵

https://monero.fail/map

⁶

https://github.com/Cuprate/cuprate

Intro

Cuprate is an alternative Monero node implementation.

This book describes Cuprate's architecture, ranging from small things like database pruning to larger meta-components like the networking stack.

A brief overview of some aspects covered within this book:

Component designs
Implementation details
File location and purpose
Design decisions and tradeoffs
Things in relation to monerod
Dependency usage

Source code

The source files for this book can be found on at: https://github.com/Cuprate/architecture-book.

Who this book is for

Maintainers

As mentioned in Foreword, the group of people that benefit from this book's value the most by far are the current and future Cuprate maintainers.

Cuprate's system design is documented in this book such that if you were ever to build it again from scratch, you would have an excellent guide on how to do such, and also where improvements could be made.

Practically, what that means for maintainers is that it acts as the reference. During maintenance, it is quite valuable to have a book that contains condensed knowledge on the behavior of components, or how certain code works, or why it was built a certain way.

Contributors

Contributors also have access to the inner-workings of Cuprate via this book, which helps when making larger contributions.

Design decisions and implementation details notated in this book helps answer questions such as:

Why is it done this way?
Why can it not be done this way?
Were other methods attempted?

Cuprate's testing and benchmarking suites, unknown to new contributors, are also documented within this book.

Researchers

This book contains the why, where, and how of the implementation of formal research.

Although it is an informal specification, this book still acts as a more accessible overview of Cuprate compared to examining the codebase itself.

Operators & users

This book is not a practical guide for using Cuprate itself.

For configuration, data collection (also important for researchers), and other practical usage, see Cuprate's user book.

Observers

Anyone curious enough is free to learn the inner-workings of Cuprate via this book, and maybe even contribute someday.

Required knowledge

General

Rust
Monero
System design

Components

Storage

Embedded databases
LMDB
redb

RPC

axum
tower
async
JSON-RPC 2.0
Epee

Networking

tower
tokio
async
Levin

Instrumentation

tracing

How to use this book

Maintainers

Contributors

Researchers

⚪️ Bird's eye view

⚪️ Map

⚪️ Components

⚪️ Formats, protocols, types

⚪️ monero_serai

⚪️ cuprate_types

⚪️ cuprate_helper

⚪️ Epee

⚪️ Levin

Storage

This section covers all things related to the on-disk storage of data within Cuprate.

Overview

The quick overview is that Cuprate has a database abstraction crate that handles "low-level" database details such as key and value (de)serialization, tables, transactions, etc.

This database abstraction crate is then used by all crates that need on-disk storage, i.e. the

Service

The interface provided by all crates building on-top of the database abstraction is a tower::Service, i.e. database requests/responses are sent/received asynchronously.

As the interface details are similar across crates (threadpool, read operations, write operations), the interface itself is abstracted in the cuprate_database_service crate, which is then used by the crates.

Diagram

This is roughly how database crates are set up.

                                                           ┌─────────────────┐
┌──────────────────────────────────┐                       │                 │
│ Some crate that needs a database │  ┌────────────────┐   │                 │
│                                  │  │     Public     │   │                 │
│ ┌──────────────────────────────┐ │─►│ tower::Service │◄─►│ Rest of Cuprate │
│ │     Database abstraction     │ │  │      API       │   │                 │
│ └──────────────────────────────┘ │  └────────────────┘   │                 │
└──────────────────────────────────┘                       │                 │
                                                           └─────────────────┘

Database abstraction

cuprate_database is Cuprate’s database abstraction.

This crate abstracts various database backends with traits.

All backends have the following attributes:

Embedded
Multiversion concurrency control
ACID
Are (key, value) oriented and have the expected API (get(), insert(), delete())
Are table oriented ("table_name" -> (key, value))
Allows concurrent readers

The currently implemented backends are:

heed (LMDB)
redb

Said precicely, cuprate_database is the embedded database other Cuprate crates interact with instead of using any particular backend implementation. This allows the backend to be swapped and/or future backends to be implemented.

This section will go over cuprate_database details.

Abstraction

This next section details how cuprate_database abstracts multiple database backends into 1 API.

Diagram

A simple diagram describing the responsibilities/relationship of cuprate_database.

┌───────────────────────────────────────────────────────────────────────┐
│ cuprate_database                                                      │
│                                                                       │
│ ┌───────────────────────────┐     ┌─────────────────────────────────┐ │
│ │ Database traits           │     │ Backends                        │ │
│ │ ┌─────┐┌──────┐┌────────┐ │     │ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Env ││ TxRw ││ ...    │ ├─────┤ │ heed (LMDB) │ │ redb        │ │ │
│ │ └─────┘└──────┘└────────┘ │     │ └─────────────┘ └─────────────┘ │ │
│ └──────────┬─────────────┬──┘     └──┬──────────────────────────────┘ │
│            │             └─────┬─────┘                                │
│            │         ┌─────────┴──────────────┐                       │
│            │         │ Database types         │                       │
│            │         │ ┌─────────────┐┌─────┐ │                       │
│            │         │ │ ConcreteEnv ││ ... │ │                       │
│            │         │ └─────────────┘└─────┘ │                       │
│            │         └─────────┬──────────────┘                       │
│            │                   │                                      │
└────────────┼───────────────────┼──────────────────────────────────────┘
             │                   │
             └───────────────────┤
                                 │
                                 ▼
                     ┌───────────────────────┐
                     │ cuprate_database user │
                     └───────────────────────┘

Backend

First, we need an actual database implementation.

cuprate-database's traits allow abstracting over the actual database, such that any backend in particular could be used.

This page is an enumeration of all the backends Cuprate has, has tried, and may try in the future.

`heed`

The default database used is heed (LMDB). The upstream versions from crates.io are used. LMDB should not need to be installed as heed has a build script that pulls it in automatically.

heed's filenames inside Cuprate's data folder are:

Filename	Purpose
`data.mdb`	Main data file
`lock.mdb`	Database lock file

heed-specific notes:

There is a maximum reader limit. Other potential processes (e.g. xmrblocks) that are also reading the data.mdb file need to be accounted for
LMDB does not work on remote filesystem

`redb`

The 2nd database backend is the 100% Rust redb.

The upstream versions from crates.io are used.

redb's filenames inside Cuprate's data folder are:

Filename	Purpose
`data.redb`	Main data file

`redb-memory`

This backend is 100% the same as redb, although, it uses redb::backend::InMemoryBackend which is a database that completely resides in memory instead of a file.

All other details about this should be the same as the normal redb backend.

`sanakirja`

sanakirja was a candidate as a backend, however there were problems with maximum value sizes.

The default maximum value size is 1012 bytes which was too small for our requirements. Using sanakirja::Slice and sanakirja::UnsizedStorage was attempted, but there were bugs found when inserting a value in-between 512..=4096 bytes.

As such, it is not implemented.

`MDBX`

MDBX was a candidate as a backend, however MDBX deprecated the custom key/value comparison functions, this makes it a bit trickier to implement multimap tables. It is also quite similar to the main backend LMDB (of which it was originally a fork of).

As such, it is not implemented (yet).

`ConcreteEnv`

After a backend is selected, the main database environment struct is "abstracted" by putting it in the non-generic, concrete struct ConcreteEnv.

This is the main object used when handling the database directly.

This struct contains all the data necessary to operate the database. The actual database backend ConcreteEnv will use internally depends on which backend feature is used.

ConcreteEnv itself is not too important, what is important is that:

It allows callers to not directly reference any particular backend environment
It implements trait Env which opens the door to all the other database traits

The equivalent "database environment" objects in the backends themselves are:

Trait

cuprate_database provides a set of traits that abstract over the various database backends.

This allows the function signatures and behavior to stay the same but allows for swapping out databases in an easier fashion.

All common behavior of the backend's are encapsulated here and used instead of using the backend directly.

Examples:

For example, instead of calling heed or redb's get() function directly, DatabaseRo::get() is called.

Usage

With a ConcreteEnv and a particular backend selected, we can now start using it alongside these traits to start doing database operations in a generic manner.

An example:

#![allow(unused)]
fn main() {
use cuprate_database::{
    ConcreteEnv,
    config::ConfigBuilder,
    Env, EnvInner,
    DatabaseRo, DatabaseRw, TxRo, TxRw,
};

// Initialize the database environment.
let env = ConcreteEnv::open(config)?;

// Open up a transaction + tables for writing.
let env_inner = env.env_inner();
let tx_rw = env_inner.tx_rw()?;
env_inner.create_db::<Table>(&tx_rw)?;

// Write data to the table.
{
	let mut table = env_inner.open_db_rw::<Table>(&tx_rw)?;
	table.put(&0, &1)?;
}

// Commit the transaction.
TxRw::commit(tx_rw)?;
}

As seen above, there is no direct call to heed or redb. Their functionality is abstracted behind ConcreteEnv and the traits.

Syncing

cuprate_database's database has 5 disk syncing modes.

FastThenSafe
Safe
Async
Threshold
Fast

The default mode is Safe.

This means that upon each transaction commit, all the data that was written will be fully synced to disk. This is the slowest, but safest mode of operation.

Note that upon any database Drop, the current implementation will sync to disk regardless of any configuration.

For more information on the other modes, read the documentation here.

Resizing

cuprate_database itself does not handle memory map resizes automatically (for database backends that need resizing, i.e. heed/LMDB).

When a user directly using cuprate_database, it is up to them on how to resize. The database will return RuntimeError::ResizeNeeded when it needs resizing.

However, cuprate_database exposes some resizing algorithms that define how the database's memory map grows.

(De)serialization

All types stored inside the database are either bytes already or are perfectly bitcast-able.

As such, they do not incur heavy (de)serialization costs when storing/fetching them from the database. The main (de)serialization used is bytemuck's traits and casting functions.

Size and layout

The size & layout of types is stable across compiler versions, as they are set and determined with #[repr(C)] and bytemuck's derive macros such as bytemuck::Pod.

Note that the data stored in the tables are still type-safe; we still refer to the key and values within our tables by the type.

How

The main deserialization trait for database storage is Storable.

Before storage, the type is simply cast into bytes
When fetching, the bytes are simply cast into the type

When a type is casted into bytes, the reference is casted, i.e. this is zero-cost serialization.

However, it is worth noting that when bytes are casted into the type, it is copied. This is due to byte alignment guarantee issues with both backends, see:

Without this, bytemuck will panic with TargetAlignmentGreaterAndInputNotAligned when casting.

Copying the bytes fixes this problem, although it is more costly than necessary. However, in the main use-case for cuprate_database (tower::Service API) the bytes would need to be owned regardless as the Request/Response API uses owned data types (T, Vec<T>, HashMap<K, V>, etc).

Practically speaking, this means lower-level database functions that normally look like such:

#![allow(unused)]
fn main() {
fn get(key: &Key) -> &Value;
}

end up looking like this in cuprate_database:

#![allow(unused)]
fn main() {
fn get(key: &Key) -> Value;
}

Since each backend has its own (de)serialization methods, our types are wrapped in compatibility types that map our Storable functions into whatever is required for the backend, e.g:

Compatibility structs also exist for any Storable containers:

Again, it's unfortunate that these must be owned, although in the tower::Service use-case, they would have to be owned anyway.

Known issues and tradeoffs

cuprate_database takes many tradeoffs, whether due to:

Prioritizing certain values over others
Not having a better solution
Being "good enough"

This section is a list of the larger ones, along with issues that don't have answers yet.

Traits abstracting backends

Although all database backends used are very similar, they have some crucial differences in small implementation details that must be worked around when conforming them to cuprate_database's traits.

Put simply: using cuprate_database's traits is less efficient and more awkward than using the backend directly.

For example:

This is a tradeoff that cuprate_database takes, as:

The backend itself is usually not the source of bottlenecks in the greater system, as such, small inefficiencies are OK
None of the lost functionality is crucial for operation
The ability to use, test, and swap between multiple database backends is worth it

Hot-swappable backends

See also: https://github.com/Cuprate/cuprate/issues/209.

Using a different backend is really as simple as re-building cuprate_database with a different feature flag:

# Use LMDB.
cargo build --package cuprate-database --features heed

# Use redb.
cargo build --package cuprate-database --features redb

This is "good enough" for now, however ideally, this hot-swapping of backends would be able to be done at runtime.

As it is now, cuprate_database cannot compile both backends and swap based on user input at runtime; it must be compiled with a certain backend, which will produce a binary with only that backend.

This also means things like CI testing multiple backends is awkward, as we must re-compile with different feature flags instead.

Copying unaligned bytes

As mentioned in (De)serialization, bytes are copied when they are turned into a type T due to unaligned bytes being returned from database backends.

Using a regular reference cast results in an improperly aligned type T; such a type even existing causes undefined behavior. In our case, bytemuck saves us by panicking before this occurs.

Thus, when using cuprate_database's database traits, an owned T is returned.

This is doubly unfortunately for &[u8] as this does not even need deserialization.

For example, StorableVec could have been this:

#![allow(unused)]
fn main() {
enum StorableBytes<'a, T: Storable> {
    Owned(T),
    Ref(&'a T),
}
}

but this would require supporting types that must be copied regardless with the occasional &[u8] that can be returned without casting. This was hard to do so in a generic way, thus all [u8]'s are copied and returned as owned StorableVecs.

This is a tradeoff cuprate_database takes as:

bytemuck::pod_read_unaligned is cheap enough
The main API, service, needs to return owned value anyway
Having no references removes a lot of lifetime complexity

The alternative is somehow fixing the alignment issues in the backends mentioned previously.

Endianness

cuprate_database's (de)serialization and storage of bytes are native-endian, as in, byte storage order will depend on the machine it is running on.

As Cuprate's build-targets are all little-endian (big-endian by default machines barely exist), this doesn't matter much and the byte ordering can be seen as a constant.

Practically, this means cuprated's database files can be transferred across computers, as can monerod's.

Multimap

cuprate_database does not currently have an abstraction for multimap tables.

All tables are single maps of keys to values.

This matters as this means some of cuprate_blockchain's tables differ from monerod's tables - the primary key is stored for all entries, compared to monerod only needing to store it once:

#![allow(unused)]
fn main() {
// `monerod` only stores `amount: 1` once,
// `cuprated` stores it each time it appears.
struct PreRctOutputId { amount: 1, amount_index: 0 }
struct PreRctOutputId { amount: 1, amount_index: 1 }
}

This means cuprated's database will be slightly larger than monerod's.

The current method cuprate_blockchain uses will be "good enough" as the multimap keys needed for now are fixed, e.g. pre-RCT outputs are no longer being produced.

This may need to change in the future when multimap is all but required, e.g. for FCMP++.

Until then, multimap tables are not implemented as they are tricky to implement across all backends.

Common behavior

The crates that build on-top of the database abstraction (cuprate_database) share some common behavior including but not limited to:

Defining their specific database tables and types
Having an ops module
Exposing a tower::Service API (backed by a threadpool) for public usage

This section provides more details on these behaviors.

Types

POD types

Since all types in the database are POD types, we must often provide mappings between outside types and the types actually stored in the database.

A common case is mapping infallible types to and from bitflags and/or their raw integer representation. For example, the OutputFlag type or bool types.

As types like enums, bools and chars cannot be casted from an integer infallibly, bytemuck::Pod cannot be implemented on it safely. Thus, we store some infallible version of it inside the database with a custom type and map them when fetching the data.

Lean types

Another reason why database crates define their own types is to cut any unneeded data from the type.

Many of the types used in normal operation (e.g. cuprate_types::VerifiedBlockInformation) contain lots of extra pre-processed data for convenience.

This would be a waste to store in the database, so in this example, the much leaner "raw" BlockInfo type is stored.

`ops`

Both cuprate_blockchain and cuprate_txpool expose an ops module containing abstracted abstracted Monero-related database operations.

For example, cuprate_blockchain::ops::block::add_block.

These functions build on-top of the database traits and allow for more abstracted database operations.

For example, instead of these signatures:

#![allow(unused)]
fn main() {
fn get(_: &Key) -> Value;
fn put(_: &Key, &Value);
}

the ops module provides much higher-level signatures like such:

#![allow(unused)]
fn main() {
fn add_block(block: &Block) -> Result<_, _>;
}

Although these functions are exposed, they are not the main API, that would be next section: the tower::Service (which uses these functions).

tower::Service

Both cuprate_blockchain and cuprate_txpool provide async tower::Services that define database requests/responses.

The main API that other Cuprate crates use.

There are 2 tower::Services:

A read service which is backed by a rayon::ThreadPool
A write service which spawns a single thread to handle write requests

As this behavior is the same across all users of cuprate_database, it is extracted into its own crate: cuprate_database_service.

Diagram

As a recap, here is how this looks to a user of a higher-level database crate, cuprate_blockchain in this example. Starting from the lowest layer:

cuprate_database is used to abstract the database
cuprate_blockchain builds on-top of that with tables, types, operations
cuprate_blockchain exposes a tower::Service using cuprate_database_service
The user now interfaces with cuprate_blockchain with that tower::Service in a request/response fashion

                         ┌──────────────────┐
                         │ cuprate_database │
                         └────────┬─────────┘
┌─────────────────────────────────┴─────────────────────────────────┐
│ cuprate_blockchain                                                │
│                                                                   │
│ ┌──────────────────────┐  ┌─────────────────────────────────────┐ │
│ │ Tables, types        │  │ ops                                 │ │
│ │ ┌───────────┐┌─────┐ │  │ ┌─────────────┐ ┌──────────┐┌─────┐ │ │
│ │ │ BlockInfo ││ ... │ ├──┤ │ add_block() │ │ add_tx() ││ ... │ │ │
│ │ └───────────┘└─────┘ │  │ └─────────────┘ └──────────┘└─────┘ │ │
│ └──────────────────────┘  └─────┬───────────────────────────────┘ │
│                                 │                                 │
│                       ┌─────────┴───────────────────────────────┐ │
│                       │ tower::Service                          │ │
│                       │ ┌──────────────────────────────┐┌─────┐ │ │
│                       │ │ Blockchain{Read,Write}Handle ││ ... │ │ │
│                       │ └──────────────────────────────┘└─────┘ │ │
│                       └─────────┬───────────────────────────────┘ │
│                                 │                                 │
└─────────────────────────────────┼─────────────────────────────────┘
                                  │
		                    ┌─────┴─────┐
       ┌────────────────────┴────┐ ┌────┴──────────────────────────────────┐
       │ Database requests       │ │ Database responses                    │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ FindBlock([u8; 32]) │ │ │ │ FindBlock(Option<(Chain, usize)>) │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ ChainHeight         │ │ │ │ ChainHeight(usize, [u8; 32])      │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ ...                 │ │ │ │ ...                               │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       └─────────────────────────┘ └───────────────────────────────────────┘
                            ▲          │
                            │          ▼
                     ┌─────────────────────────┐
                     │ cuprate_blockchain user │
                     └─────────────────────────┘

Initialization

A database service is started simply by calling: init().

This function initializes the database, spawns threads, and returns a:

Read handle to the database
Write handle to the database
The database itself

These handles implement the tower::Service trait, which allows sending requests and receiving responses asynchronously.

Requests

Along with the 2 handles, there are 2 types of requests:

Read requests, e.g. BlockchainReadRequest
Write requests, e.g. BlockchainWriteRequest

Quite obviously:

Read requests are for retrieving various data from the database
Write requests are for writing data to the database

Responses

After sending a request using the read/write handle, the value returned is not the response, yet an asynchronous channel that will eventually return the response:

// Send a request.
//                                   tower::Service::call()
//                                          V
let response_channel: Channel = read_handle.call(BlockchainReadRequest::ChainHeight)?;

// Await the response.
let response: BlockchainReadRequest = response_channel.await?;

After awaiting the returned channel, a Response will eventually be returned when the Service threadpool has fetched the value from the database and sent it off.

Both read/write requests variants match in name with Response variants, i.e.

BlockchainReadRequest::ChainHeight leads to BlockchainResponse::ChainHeight
BlockchainWriteRequest::WriteBlock leads to BlockchainResponse::WriteBlockOk

Resizing

As noted in the cuprate_database resizing section, builders on-top of cuprate_database are responsible for resizing the database.

In cuprate_{blockchain,txpool}'s case, that means the tower::Service must know how to resize. This logic is shared between both crates, defined in cuprate_database_service: https://github.com/Cuprate/cuprate/blob/0941f68efcd7dfe66124ad0c1934277f47da9090/storage/service/src/service/write.rs#L107-L171.

By default, this uses a similar algorithm as monerod's:

If there's not enough space to fit a write request's data, start a resize
Each resize adds around 1,073,745,920 bytes to the current map size
A resize will be attempted 3 times before failing

There are other resizing algorithms that define how the database's memory map grows, although currently the behavior of monerod is closely followed (for no particular reason).

Thread model

The base database abstractions themselves are not concerned with parallelism, they are mostly functions to be called from a single-thread.

However, the cuprate_database_service API, does have a thread model backing it.

When a Service's init() function is called, threads will be spawned and maintained until the user drops (disconnects) the returned handles.

The current behavior for thread count is:

For example, on a system with 32-threads, cuprate_database_service will spawn:

1 writer thread
32 reader threads

whose sole responsibility is to listen for database requests, access the database (potentially in parallel), and return a response.

Note that the 1 system thread = 1 reader thread model is only the default setting, the reader thread count can be configured by the user to be any number between 1 .. amount_of_system_threads.

The reader threads are managed by rayon.

For an example of where multiple reader threads are used: given a request that asks if any key-image within a set already exists, cuprate_blockchain will split that work between the threads with rayon.

Shutdown

Once the read/write handles to the tower::Service are Droped, the backing thread(pool) will gracefully exit, automatically.

Note the writer thread and reader threadpool aren't connected whatsoever; dropping the write handle will make the writer thread exit, however, the reader handle is free to be held onto and can be continued to be read from - and vice-versa for the write handle.

Blockchain

This section contains storage information specific to cuprate_blockchain, the database built on-top of cuprate_database that stores the blockchain.

Schema

This section contains the schema of cuprate_blockchain's database tables.

Tables

See also: https://doc.cuprate.org/cuprate_blockchain/tables & https://doc.cuprate.org/cuprate_blockchain/types.

The CamelCase names of the table headers documented here (e.g. TxIds) are the actual type name of the table within cuprate_blockchain.

Note that words written within code blocks mean that it is a real type defined and usable within cuprate_blockchain. Other standard types like u64 and type aliases (TxId) are written normally.

Within cuprate_blockchain::tables, the below table is essentially defined as-is with a macro.

Many of the data types stored are the same data types, although are different semantically, as such, a map of aliases used and their real data types is also provided below.

Alias	Real Type
BlockHeight, Amount, AmountIndex, TxId, UnlockTime	u64
BlockHash, KeyImage, TxHash, PrunableHash	[u8; 32]

Table	Key	Value	Description
`BlockHeaderBlobs`	BlockHeight	`StorableVec<u8>`	Maps a block's height to a serialized byte form of its header
`BlockTxsHashes`	BlockHeight	`StorableVec<[u8; 32]>`	Maps a block's height to the block's transaction hashes
`BlockHeights`	BlockHash	BlockHeight	Maps a block's hash to its height
`BlockInfos`	BlockHeight	`BlockInfo`	Contains metadata of all blocks
`KeyImages`	KeyImage	()	This table is a set with no value, it stores transaction key images
`NumOutputs`	Amount	u64	Maps an output's amount to the number of outputs with that amount
`Outputs`	`PreRctOutputId`	`Output`	This table contains legacy CryptoNote outputs which have clear amounts. This table will not contain an output with 0 amount.
`PrunedTxBlobs`	TxId	`StorableVec<u8>`	Contains pruned transaction blobs (even if the database is not pruned)
`PrunableTxBlobs`	TxId	`StorableVec<u8>`	Contains the prunable part of a transaction
`PrunableHashes`	TxId	PrunableHash	Contains the hash of the prunable part of a transaction
`RctOutputs`	AmountIndex	`RctOutput`	Contains RingCT outputs mapped from their global RCT index
`TxBlobs`	TxId	`StorableVec<u8>`	Serialized transaction blobs (bytes)
`TxIds`	TxHash	TxId	Maps a transaction's hash to its index/ID
`TxHeights`	TxId	BlockHeight	Maps a transaction's ID to the height of the block it comes from
`TxOutputs`	TxId	`StorableVec<u64>`	Gives the amount indices of a transaction's outputs
`TxUnlockTime`	TxId	UnlockTime	Stores the unlock time of a transaction (only if it has a non-zero lock time)

Multimap tables

Outputs

When referencing outputs, Monero will use the amount and the amount index. This means 2 keys are needed to reach an output.

With LMDB you can set the DUP_SORT flag on a table and then set the key/value to:

#![allow(unused)]
fn main() {
Key = KEY_PART_1
}

#![allow(unused)]
fn main() {
Value = {
    KEY_PART_2,
    VALUE // The actual value we are storing.
}
}

Then you can set a custom value sorting function that only takes KEY_PART_2 into account; this is how monerod does it.

This requires that the underlying database supports:

multimap tables
custom sort functions on values
setting a cursor on a specific key/value

How `cuprate_blockchain` does it

Another way to implement this is as follows:

#![allow(unused)]
fn main() {
Key = { KEY_PART_1, KEY_PART_2 }
}

#![allow(unused)]
fn main() {
Value = VALUE
}

Then the key type is simply used to look up the value; this is how cuprate_blockchain does it as cuprate_database does not have a multimap abstraction (yet).

For example, the key/value pair for outputs is:

#![allow(unused)]
fn main() {
PreRctOutputId => Output
}

where PreRctOutputId looks like this:

#![allow(unused)]
fn main() {
struct PreRctOutputId {
    amount: u64,
    amount_index: u64,
}
}

⚪️ Transaction pool

⚪️ Pruning

RPC

monerod's daemon RPC has three kinds of RPC calls:

JSON-RPC 2.0 methods, called at the /json_rpc endpoint
JSON (but not JSON-RPC 2.0) methods called at their own endpoints, e.g. /get_height
Binary (epee) RPC methods called at their own endpoints ending in .bin, e.g. /get_blocks.bin

Cuprate's RPC aims to mirror monerod's as much as it practically can.

This includes, but is not limited to:

Using the same endpoints
Receiving the same request data
Sending the same response data
Responding with the same HTTP status codes
Following internal behavior (e.g. /pop_blocks)

Not all monerod behavior can always be followed, however.

Some are not followed on purpose, some cannot be followed due to technical limitations, and some cannot be due to the behavior being monerod specific such as the /set_log_categories endpoint which uses monerod's logging categories.

Both subtle and large differences between Cuprate's RPC and monerod's RPC are documented in the Differences with monerod section.

Main RPC components

The main components that make up Cuprate's RPC are noted below, alongside the equivalent monerod code and other notes.

Cuprate crate	`monerod` (rough) equivalent	Purpose	Notes
`cuprate-json-rpc`	`jsonrpc_structs.h`, `http_server_handlers_map2.h`	JSON-RPC 2.0 implementation	`monerod`'s JSON-RPC 2.0 handling is spread across a few files. The first defines some data structures, the second contains macros that (essentially) implement JSON-RPC 2.0.
`cuprate-rpc-types`	`core_rpc_server_commands_defs.h`	RPC request/response type definitions and (de)serialization
`cuprate-rpc-interface`	`core_rpc_server.h`	RPC interface, routing, endpoints
`cuprate-rpc-handler`	`core_rpc_server.cpp`	RPC request/response handling	These are the "inner handler" functions that turn requests into responses

JSON-RPC 2.0

Cuprate has a standalone crate that implements the JSON-RPC 2.0 specification, cuprate-json-rpc. The RPC methods at the /json_rpc endpoint use this crate's types, functions, and (de)serialization.

There is nothing too special about Cuprate's implementation. Any small notes and differences are noted in the crate documentation.

As such, there is not much to document here, instead, consider reading the very brief JSON-RPC 2.0 specification, and the cuprate-json-rpc crate documentation.

TODO: document method/params vs flattened base when figured out.

The types

Cuprate has a crate that defines all the types related to RPC: cuprate-rpc-types.

The main purpose of this crate is to port the types used in monerod's RPC and to re-implement (de)serialization for those types, whether that be JSON, epee, or a custom mix.

The bulk majority of these types are request & response types, i.e. the inputs Cuprate's RPC is expecting from users, and the output it will respond with.

Example

To showcase an example of the kinds of types defined in this crate, here is a request type:

#![allow(unused)]
fn main() {
#[serde(transparent)]
#[repr(transparent)]
struct OnGetBlockHashRequest {
	block_height: [u64; 1],
}
}

This is the input (params) expected in the on_get_block_hash method.

As seen above, the type itself encodes some properties, such as being (de)serialized transparently, and the input being an array with 1 length, rather than a single u64. This is to match the behavior of monerod.

An example JSON form of this type would be:

{
  "jsonrpc": "2.0",
  "id": "0",
  "method": "on_get_block_hash",
  "params": [912345] // <- This can (de)serialize as a `OnGetBlockHashRequest`
}

Misc types

Other than the main request/response types, this crate is also responsible for any miscellaneous types used within monerod's RPC.

For example, the status field within many RPC responses is defined within cuprate-rpc-types.

Types that aren't requests/responses but exist within request/response types are also defined in this crate, such as the Distribution structure returned from the get_output_distribution method.

Base RPC types

There exists a few "base" types that many types are built on-top of in monerod. These are also implemented in cuprate-rpc-types.

For example, many requests include these 2 fields:

{
  "status": "OK",
  "untrusted": false,
}

This is rpc_response_base in monerod, and ResponseBase in Cuprate.

These types are flattened into other types, i.e. the fields from these base types are injected into the given type. For example, get_block_count's response type is defined like such in Cuprate:

#![allow(unused)]
fn main() {
struct GetBlockCountResponse {
	// The fields of this `base` type are directly
	// injected into `GetBlockCountResponse` during
	// (de)serialization.
	//
	// I.e. it is as if this `base` field were actually these 2 fields:
	// status: Status,
	// untrusted: bool,
    base: ResponseBase,
	count: u64,
}
}

The JSON output of this type would look something like:

{
  "status": "OK",
  "untrusted": "false",
  "count": 993163
}

RPC payment

monerod also contains RPC base types for the RPC payment system. Although the RPC payment system is pseudo deprecated, monerod still generates these fields in responses, and thus, so does Cuprate.

The type generator macro

Request and response types make up the majority of cuprate-rpc-types.

Request types are the inputs expected from users
Response types are what will be outputted to users

Regardless of being meant for JSON-RPC, binary, or a standalone JSON endpoint, all request/response types are defined using the "type generator macro". This macro is important because it defines all request/response types.

This macro:

Defines a matching pair of request & response types
Implements many derive traits, e.g. Clone on those types
Implements both serde and epee on those types
Automates documentation, tests, etc.

See here for example usage of this macro.

Metadata

cuprate-rpc-types also provides some traits to access some metadata surrounding RPC data types.

For example, trait RpcCall allows accessing whether an RPC request is restricted or not.

monerod has a boolean permission system. RPC calls can be restricted or not. If an RPC call is restricted, it will only be allowed on un-restricted RPC servers (18081). If an RPC call is not restricted, it will be allowed on all RPC server types (18081 & 18089).

This metadata is used in crates that build upon cuprate-rpc-types, e.g. to know if an RPC call should be allowed through or not.

(De)serialization

A crucial responsibility of cuprate-rpc-types is to provide the correct (de)serialization of types.

The input/output of Cuprate's RPC should match monerod (as much as practically possible).

A simple example of this is that /get_height should respond with the exact same data for both monerod and Cuprate:

{
  "hash": "7e23a28cfa6df925d5b63940baf60b83c0cbb65da95f49b19e7cf0ce7dd709ce",
  "height": 2287217,
  "status": "OK",
  "untrusted": false
}

Behavior would be considered incompatible if any of the following were true:

Fields are missing
Extra fields exist
Field types are incorrect (string instead of number, etc)

JSON

(De)serialization for JSON is implemented using serde and serde_json.

cuprate-rpc-interface (the main crate responsible for the actual output) uses serde_json for JSON formatting. It is mostly the same formatting as monerod, although there are slight differences.

Technically, the formatting of the JSON output is not handled by cuprate-rpc-types, users are free to choose whatever formatting they desire.

Epee

(De)serialization for the epee binary format is handled by Cuprate's in-house cuprate-epee-encoding library.

Bitcasted `struct`s

TODO: https://github.com/monero-project/monero/issues/9422

Compressed data

TODO: https://github.com/monero-project/monero/issues/9422

The interface

This section is short as cuprate-rpc-interface contains detailed documentation.

The RPC interface, which includes:

Endpoint routing (/json_rpc, /get_blocks.bin, etc)
Route function signatures (async fn json_rpc(...) -> Response)
Type (de)serialization
Any miscellaneous handling (denying restricted RPC calls)

is handled by the cuprate-rpc-interface crate.

Essentially, this crate provides the API for the RPC.

cuprate-rpc-interface is built on-top of axum and tower, which are the crates doing the bulk majority of the work.

Request -> Response

The functions that map requests to responses are not implemented by cuprate-rpc-interface itself, they must be provided by the user, i.e. it can be customized.

In Rust terms, this crate provides you with:

#![allow(unused)]
fn main() {
async fn json_rpc(
	state: State,
	request: Request,
) -> Response {
	/* your handler here */
}
}

and you provide the function body.

The main handler crate is cuprate-rpc-handler. This crate implements the standard RPC behavior, i.e. it mostly mirrors monerod.

Although, it's worth noting that other implementations are possible, such as an RPC handler that caches blocks, or an RPC handler that only accepts certain endpoints, or any combination.

The handler

TODO: fill after cuprate-rpc-handler is created.

🔴 The server

TODO: fill after cuprate-rpc-server or binary impl is created.

Differences with `monerod`

As noted in the introduction, monerod's RPC behavior cannot always be perfectly followed by Cuprate.

The reasoning for the differences can vary from:

Technical limitations
Behavior being monerod-specific
Purposeful decision to not support behavior

This section lays out the details of the differences between monerod's and Cuprate's RPC system.

JSON field ordering

When serializing JSON, monerod has the behavior to order key fields within a scope alphabetically.

For example:

{
  "id": "0",
  "jsonrpc": "2.0",
  "result": {
    "blockhashing_blob": "...",
    "blocktemplate_blob": "...",
    "difficulty": 283305047039,
    "difficulty_top64": 0,
    "expected_reward": 600000000000,
    "height": 3195018,
    "next_seed_hash": "",
    "prev_hash": "9d648e741d85ca0e7acb4501f051b27e9b107d3cd7a3f03aa7f776089117c81a",
    "reserved_offset": 131,
    "seed_hash": "e2aa0b7b55042cd48b02e395d78fa66a29815ccc1584e38db2d1f0e8485cd44f",
    "seed_height": 3194880,
    "status": "OK",
    "untrusted": false,
    "wide_difficulty": "0x41f64bf3ff"
  }
}

In the main {}, id comes before jsonrpc, which comes before result.

The same alphabetical ordering is applied to the fields within result.

Cuprate uses serde for JSON serialization, which serializes fields based on the definition order, i.e. whatever order the fields are defined in the code, is the order they will appear in JSON.

Some struct fields within Cuprate's RPC types happen to be alphabetical, but this is not a guarantee.

As these are JSON maps, the ordering of fields should not matter, although this is something to note as the output will technically differ.

Example incompatibility

An example of where this leads to incompatibility is if specific line numbers are depended on to contain specific fields.

For example, this will print the 10th line:

curl http://127.0.0.1:18081/json_rpc -d '{"jsonrpc":"2.0","id":"0","method":"get_block_template","params":{"wallet_address":"44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns","reserve_size":60}' -H 'Content-Type: application/json' | sed -n 10p

It will be "height": 3195018 in monerod's case, but may not necessarily be for Cuprate.

By all means, this should not be relied upon in the first place, although it is shown as an example.

JSON formatting

In general, Cuprate's JSON formatting is very similar to monerod, but there are some differences.

This is a list of those differences.

Pretty vs compact

TODO: decide when handlers are created if we should allow custom formatting.

Cuprate's RPC (really, serde_json) can be configured to use either:

monerod uses something similar to pretty formatting.

As an example, pretty formatting:

{
  "number": 1,
  "array": [
    0,
    1
  ],
  "string": "",
  "array_of_objects": [
    {
      "x": 1.0,
      "y": -1.0
    },
    {
      "x": 2.0,
      "y": -2.0
    }
  ]
}

compact formatting:

{"number":1,"array":[0,1],"string":"","array_of_objects":[{"x":1.0,"y":-1.0},{"x":2.0,"y":-2.0}]}

Array of objects

monerod will format an array of objects like such:

{
  "array_of_objects": [{
    "x": 0.0,
    "y": 0.0,
  },{
    "x": 0.0,
    "y": 0.0,
  },{
    "x": 0.0,
    "y": 0.0
  }]
}

Cuprate will format the above like such:

{
  "array_of_objects": [
    {
      "x": 0.0,
      "y": 0.0,
    },
    {
      "x": 0.0,
      "y": 0.0,
    },
    {
      "x": 0.0,
      "y": 0.0
    }
  ]
}

Array of maps containing named objects

An method that contains outputs like this is the peers field in the sync_info method:

curl \
    http://127.0.0.1:18081/json_rpc \
    -d '{"jsonrpc":"2.0","id":"0","method":"sync_info"}' \
    -H 'Content-Type: application/json'

monerod will format an array of maps that contains named objects like such:

{
  "array": [{
    "named_object": {
      "field": ""
    }
  },{
    "named_object": {
      "field": ""
    }
  }]
}

Cuprate will format the above like such:

{
  "array": [
    {
      "named_object": {
        "field": ""
      }
    },
    {
      "named_object": {
        "field": ""
      }
    }
  ]
}

JSON strictness

This is a list of behavior that monerod's JSON parser allows, that Cuprate's JSON parser (serde_json) does not.

In general, monerod's parser is quite lenient, allowing invalid JSON in many cases. Cuprate's (really, serde_json) JSON parser is quite strict, essentially sticking to the JSON specification.

Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.

Missing closing bracket

monerod will accept JSON missing a final closing }.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count"' \
	-H 'Content-Type: application/json'

Trailing ending comma

monerod will accept JSON containing a final trailing ,.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count",}' \
	-H 'Content-Type: application/json'

Allowing `-` in fields

monerod allows - as a valid value in certain fields, not a string "-", but the character -.

The fields where this is allowed seems to be any field monerod does not explicitly look for, examples include:

jsonrpc
id
params (where parameters are not expected)
Any ignored field

The JSON-RPC 2.0 specification does state that the response id should be null upon errors in detecting the request id, although in this case, this is invalid JSON and should not make it this far. The response will contain the default id: 0 in this case.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":-,"id":-,"params":-,"IGNORED_FIELD":-,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

JSON-RPC strictness

This is a list of behavior that monerod's JSON-RPC implementation allows, that Cuprate's JSON-RPC implementation does not.

In general, monerod's JSON-RPC is quite lenient, going against the specification in many cases. Cuprate's JSON-RPC implementation is slightly more strict.

Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.

Allowing an incorrect `jsonrpc` field

The JSON-RPC 2.0 specification states that the jsonrpc field must be exactly "2.0".

monerod allows jsonrpc to:

Be any string
Be an empty array
Be null
Not exist at all

Examples:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"???","method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":[],"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":null,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Allowing `id` to be any type

JSON-RPC 2.0 responses must contain the same id as the original request.

However, the specification states:

An identifier established by the Client that MUST contain a String, Number, or NULL value if included

monerod does not check this and allows id to be any JSON type, for example, a map:

curl \
    http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":{"THIS":{"IS":"ALLOWED"}},"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

The response:

{
  "id": {
    "THIS": {
      "IS": "ALLOWED"
    }
  },
  "jsonrpc": "2.0",
  "result": {
    "count": 3210225,
    "status": "OK",
    "untrusted": false
  }
}

Responding with `id:0` on error

The JSON-RPC specification states:

If there was an error in detecting the id in the Request object (e.g. Parse error/Invalid Request), it MUST be Null.

Although, monerod will respond with id:0 in these cases.

curl \
    http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":asdf,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Response:

{
  "error": {
    "code": -32700,
    "message": "Parse error"
  },
  "id": 0,
  "jsonrpc": "2.0"
}

Responding to notifications

TODO: decide on Cuprate behavior https://github.com/Cuprate/cuprate/pull/233#discussion_r1704611186

Requests that have no id field are "notifications".

The JSON-RPC 2.0 specification states that requests without an id field must not be responded to.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Upper/mixed case fields

monerod will accept upper/mixed case fields on:

jsonrpc
id

method however, is checked.

The JSON-RPC 2.0 specification does not outright state what case to support, although, Cuprate only supports lowercase as supporting upper/mixed case is more code to add as serde by default is case-sensitive on struct fields.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsONrPc":"2.0","iD":0,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

HTTP methods

monerod endpoints supports multiple HTTP methods that do not necessarily make sense.

For example:

curl \
	http://127.0.0.1:18081/get_limit \
	-H 'Content-Type: application/json' \
	--request DELETE

This is sending an HTTP DELETE request, which should be a GET.

monerod will respond to this the same as GET, POST, PUT, and TRACE.

Cuprate's behavior

TODO: decide allowed HTTP methods for Cuprate https://github.com/Cuprate/cuprate/pull/233#discussion_r1700934928.

RPC payment

The RPC payment system in monerod is a pseudo-deprecated system that allows node operators to be compensated for RPC usage.

Although this system is pseudo-deprecated, monerod still generates related fields in responses. Cuprate follows this behavior.

However, the associated endpoints and actual functionality are not supported by Cuprate. The associated endpoints will return an error upon invocation.

TODO: decide on behavior and document https://github.com/Cuprate/cuprate/pull/233#discussion_r1700870051.

Custom strings

Many JSON response fields contain strings with custom messages.

This may be error messages, status, etc.

Although the field + string type will be followed, Cuprate will not always have the exact same message, particularly when it comes to error messages.

Unsupported RPC calls

TODO: compile unsupported RPC calls after handlers are created.

RPC calls with different behavior

TODO: compile RPC calls with different behavior after handlers are created.

⚪️ ZMQ

TODO

⚪️ Consensus

⚪️ Verifier

⚪️ TODO

⚪️ Networking

⚪️ P2P

⚪️ Dandelion++

⚪️ Proxy

⚪️ Tor

⚪️ i2p

⚪️ IPv4/IPv6

Instrumentation

Cuprate is built with instrumentation in mind.

⚪️ Logging

⚪️ Data collection

⚪️ Binary

⚪️ CLI

⚪️ Config

⚪️ Logging

Resources

⚪️ File system

Index of PATHs

This is an index of all of the filesystem PATHs Cuprate actively uses.

The cuprate_helper::fs module defines the general locations used throughout Cuprate.

dirs is used internally, which follows the PATH standards/conventions on each OS Cuprate supports, i.e.:

the XDG base directory and the XDG user directory specifications on Linux
the Known Folder system on Windows
the Standard Directories on macOS

Cache

Cuprate's cache directory.

OS	PATH
Windows	`C:\Users\Alice\AppData\Local\Cuprate\`
macOS	`/Users/Alice/Library/Caches/Cuprate/`
Linux	`/home/alice/.cache/cuprate/`

Config

Cuprate's config directory.

OS	PATH
Windows	`C:\Users\Alice\AppData\Roaming\Cuprate\`
macOS	`/Users/Alice/Library/Application Support/Cuprate/`
Linux	`/home/alice/.config/cuprate/`

Data

Cuprate's data directory.

OS	PATH
Windows	`C:\Users\Alice\AppData\Roaming\Cuprate\`
macOS	`/Users/Alice/Library/Application Support/Cuprate/`
Linux	`/home/alice/.local/share/cuprate/`

Blockchain

Cuprate's blockchain directory.

OS	PATH
Windows	`C:\Users\Alice\AppData\Roaming\Cuprate\blockchain\`
macOS	`/Users/Alice/Library/Application Support/Cuprate/blockchain/`
Linux	`/home/alice/.local/share/cuprate/blockchain/`

Transaction pool

Cuprate's transaction pool directory.

OS	PATH
Windows	`C:\Users\Alice\AppData\Roaming\Cuprate\txpool\`
macOS	`/Users/Alice/Library/Application Support/Cuprate/txpool/`
Linux	`/home/alice/.local/share/cuprate/txpool/`

Database

Cuprate's database location/filenames depend on:

Which database it is
Which backend is being used

cuprate_blockchain files are in the above mentioned blockchain folder.

cuprate_txpool files are in the above mentioned txpool folder.

If the heed backend is being used, these files will be created:

Filename	Purpose
`data.mdb`	Main data file
`lock.mdb`	Database lock file

For example: /home/alice/.local/share/cuprate/blockchain/lock.mdb.

If the redb backend is being used, these files will be created:

Filename	Purpose
`data.redb`	Main data file

For example: /home/alice/.local/share/cuprate/txpool/data.redb.

Sockets

Index of ports

This is an index of all of the network sockets Cuprate actively uses.

⚪️ Memory

Concurrency and parallelism

It is incumbent upon software like Cuprate to take advantage of today's highly parallel hardware as much as practically possible.

With that said, programs must setup guardrails when operating in a concurrent and parallel manner, for correctness and safety.

There are "synchronization primitives" that help with this, common ones being:

These tools are relatively easy to use in isolation, but trickier to do so when considering the entire system. It is not uncommon for the bottleneck to be the poor orchastration of these primitives.

Analogy

A common analogy for a parallel system is an intersection.

Like a parallel computer system, an intersection contains:

Parallelism: multiple individual units that want to move around (cars, pedestrians, etc)
Synchronization primitives: traffic lights, car lights, walk signals

In theory, the amount of "work" the units can do is only limited by the speed of the units themselves, but in practice, the slow cascading reaction speeds between all units, the frequent hiccups that can occur, and the synchronization primitives themselves become bottlenecks far before the maximum speed of any unit is reached.

A car that hogs the middle of the intersection on the wrong light is akin to a system thread holding onto a lock longer than it should be - it degrades total system output.

Unlike humans however, computer systems at least have the potential to move at lightning speeds, but only if the above synchronization primitives are used correctly.

Goal

To aid the long-term maintenance of highly concurrent and parallel code, this section documents:

All system threads spawned and maintained
All major sections where synchronization primitives are used
The asynchronous behavior of some components

and how these compose together efficiently in Cuprate.

⚪️ Map

⚪️ The RPC server

⚪️ The database

⚪️ The block downloader

⚪️ The verifier

⚪️ Thread exit

Index of threads

This is an index of all of the system threads Cuprate actively uses.

⚪️ External Monero libraries

⚪️ Cryptonight

RandomX

https://github.com/tari-project/randomx-rs

monero_serai

https://github.com/serai-dex/serai/tree/develop/coins/monero

Benchmarking

Cuprate has 2 types of benchmarks:

Criterion benchmarks
cuprate-benchmark benchmarks

Criterion is used for micro benchmarks; they time single functions, groups of functions, and generally are small in scope.

cuprate-benchmark and cuprate-benchmark-lib are custom in-house crates Cuprate uses for macro benchmarks; these test sub-systems, sections of a sub-system, or otherwise larger or more complicated code that isn't well-suited for micro benchmarks.

File layout and purpose

All benchmarking related files are in the benches/ folder.

This directory is organized like such:

Directory	Purpose
`benches/criterion/`	Criterion (micro) benchmarks
`benches/criterion/cuprate-*`	Criterion benchmarks for the crate with the same name
`benches/benchmark/`	Cuprate's custom benchmarking files
`benches/benchmark/bin`	The `cuprate-benchmark` crate; the actual binary run that links all benchmarks
`benches/benchmark/lib`	The `cuprate-benchmark-lib` crate; the benchmarking framework all benchmarks plug into
`benches/benchmark/cuprate-*`	`cuprate-benchmark` benchmarks for the crate with the same name

Criterion

Each sub-directory in benches/criterion/ is a crate that uses Criterion for timing single functions and/or groups of functions.

They are generally be small in scope.

Creating

Creating a new Criterion-based benchmarking crate for one of Cuprate's crates is relatively simple, although, it requires knowledge of how to use Criterion first:

Read the Getting Started section of https://bheisler.github.io/criterion.rs/book
Copy benches/criterion/example as base
Get started

Naming

New benchmark crates using Criterion should:

Be in benches/criterion/
Be in the cuprate-criterion-$CRATE_NAME format

For a real example, see: cuprate-criterion-json-rpc.

Workspace

Finally, make sure to add the benchmark crate to the workspace Cargo.toml file.

Your benchmark is now ready to be ran.

Running

To run all Criterion benchmarks, run this from the repository root:

cargo bench

To run specific package(s), use:

cargo bench --package $CRITERION_BENCHMARK_CRATE_NAME

For example:

cargo bench --package cuprate-criterion-json-rpc

cuprate-benchmark

Cuprate has 2 custom crates for general benchmarking:

cuprate-benchmark; the actual binary crate ran
cuprate-benchmark-lib; the library that other crates hook into

The abstract purpose of cuprate-benchmark is very simple:

Set-up the benchmark
Start timer
Run benchmark
Output data

cuprate-benchmark runs the benchmarks found in benches/benchmark/cuprate-*.

cuprate-benchmark-lib defines the Benchmark trait that all benchmark crates implement to "plug-in" to the benchmarking harness.

Diagram

A diagram displaying the relation between cuprate-benchmark and related crates.

                    ┌─────────────────────┐
                    │ cuprate_benchmark   │
                    │ (actual binary ran) │
                    └──────────┬──────────┘
            ┌──────────────────┴───────────────────┐
            │ cuprate_benchmark_lib                │
            │ ┌───────────────────────────────────┐│
            │ │ trait Benchmark                   ││
            │ └───────────────────────────────────┘│
            └──────────────────┬───────────────────┘
┌───────────────────────────┐  │   ┌───────────────────────────┐
│ cuprate_benchmark_example ├──┼───┤ cuprate_benchmark_*       │
└───────────────────────────┘  │   └───────────────────────────┘
┌───────────────────────────┐  │   ┌───────────────────────────┐
│ cuprate_benchmark_*       ├──┴───┤ cuprate_benchmark_*       │
└───────────────────────────┘      └───────────────────────────┘

Creating

New benchmarks are plugged into cuprate-benchmark by:

Implementing cuprate_benchmark_lib::Benchmark
Registering the benchmark in the cuprate_benchmark binary

See benches/benchmark/example for an example.

Creating the benchmark crate

Before plugging into cuprate-benchmark, your actual benchmark crate must be created:

Create a new crate inside benches/benchmark (consider copying benches/benchmark/example as a base)
Pull in cuprate_benchmark_lib as a dependency
Create a benchmark
Implement cuprate_benchmark_lib::Benchmark

New benchmark crates using cuprate-database should:

Be in benches/benchmark/
Be in the cuprate-benchmark-$CRATE_NAME format

For a real example, see: cuprate-benchmark-database.

`cuprate_benchmark_lib::Benchmark`

This is the trait that standardizes all benchmarks ran under cuprate-benchmark.

It must be implemented by your benchmarking crate.

See cuprate-benchmark-lib crate documentation for a user-guide: https://doc.cuprate.org/cuprate_benchmark_lib.

Adding a feature to `cuprate-benchmark`

After your benchmark's behavior is defined, it must be registered in the binary that is actually ran: cuprate-benchmark.

If your benchmark is new, add a new crate feature to cuprate-benchmark's Cargo.toml file with an optional dependency to your benchmarking crate.

Please remember to edit the feature table in the README.md as well!

Adding to `cuprate-benchmark`'s `main()`

After adding your crate's feature, add a conditional line that run the benchmark if the feature is enabled to the main() function:

For example, if your crate's name is egg:

#![allow(unused)]
fn main() {
cfg_if! {
	if #[cfg(feature = "egg")] {
		run::run_benchmark::<cuprate_benchmark_egg::Benchmark>(&mut timings);
	}
}
}

Workspace

Finally, make sure to add the benchmark crate to the workspace Cargo.toml file.

Your benchmark is now ready to be ran.

Running

cuprate-benchmark benchmarks are ran with this command:

cargo run --release --package cuprate-benchmark --features $BENCHMARK_CRATE_FEATURE

For example, to run the example benchmark:

cargo run --release --package cuprate-benchmark --features example

Use the all feature to run all benchmarks:

# Run all benchmarks
cargo run --release --package cuprate-benchmark --features all

⚪️ Testing

⚪️ Monero data

⚪️ RPC client

⚪️ Spawning monerod

⚪️ Known issues and tradeoffs

⚪️ Networking

⚪️ RPC

⚪️ Storage

Monero oddities

This section is a list of any peculiar, interesting, or non-standard behavior that Monero has that is not planned on being changed or deprecated.

This section exists to hold all the small yet noteworthy knowledge in one place, instead of in any single contributor's mind.

These are usually behaviors stemming from implementation rather than protocol/cryptography.

Formatting

This is the markdown formatting for each entry in this section.

If applicable, consider using this formatting when adding to this section.

# <concise_title_of_the_behavior>

## What
A detailed description of the behavior.

## Expected
The norm or standard behavior that is usually expected.

## Why
The reasoning behind why this behavior exists and/or
any links to more detailed discussion on the behavior.

## Affects
A (potentially non-exhaustive) list of places that this behavior can/does affect.

## Example
An example link or section of code where the behavior occurs.

## Source
A link to original `monerod` code that defines the behavior.

Little-endian IPv4 addresses

What

Monero encodes IPv4 addresses in little-endian byte order.

Expected

In general, networking-related protocols/code use networking order (big-endian).

Why

TODO

Affects

Any representation and (de)serialization of IPv4 addresses must keep little endian in-mind, e.g. the P2P wire format or int encoded IPv4 addresses in RPC.

For example, the ip field in set_bans.

For Cuprate, this means Rust's Ipv4Addr::from_bits/from cannot be used in these cases as it assumes big-endian encoding.

Source

https://github.com/monero-project/monero/blob/893916ad091a92e765ce3241b94e706ad012b62a/contrib/epee/include/net/net_utils_base.h#L97

Appendix

Crates

This is an index of all of Cuprate's in-house crates it uses and maintains.

They are categorized into groups.

Crate documentation for each crate can be found by clicking the crate name or by visiting https://doc.cuprate.org. Documentation can also be built manually by running this at the root of the cuprate repository:

cargo doc --package $CRATE

For example, this will generate and open cuprate-blockchain documentation:

cargo doc --open --package cuprate-blockchain

Consensus

Crate	In-tree path	Purpose
`cuprate-consensus`	`consensus/`	TODO
`cuprate-consensus-context`	`consensus/context/`	TODO
`cuprate-consensus-rules`	`consensus/rules/`	TODO
`cuprate-fast-sync`	`consensus/fast-sync/`	Fast block synchronization

Networking

Crate	In-tree path	Purpose
`cuprate-epee-encoding`	`net/epee-encoding/`	Epee (de)serialization
`cuprate-fixed-bytes`	`net/fixed-bytes/`	Fixed byte containers backed by `byte::Byte`
`cuprate-levin`	`net/levin/`	Levin bucket protocol implementation
`cuprate-wire`	`net/wire/`	TODO

P2P

Crate	In-tree path	Purpose
`cuprate-address-book`	`p2p/address-book/`	TODO
`cuprate-async-buffer`	`p2p/async-buffer/`	A bounded SPSC, FIFO, asynchronous buffer that supports arbitrary weights for values
`cuprate-dandelion-tower`	`p2p/dandelion-tower/`	TODO
`cuprate-p2p`	`p2p/p2p/`	TODO
`cuprate-p2p-bucket`	`p2p/bucket/`	A collection data structure discriminating its items into "buckets" of limited size.
`cuprate-p2p-core`	`p2p/p2p-core/`	TODO

Storage

Crate	In-tree path	Purpose
`cuprate-blockchain`	`storage/blockchain/`	Blockchain database built on-top of `cuprate-database` & `cuprate-database-service`
`cuprate-database`	`storage/database/`	Pure database abstraction
`cuprate-database-service`	`storage/database-service/`	`tower::Service` + thread-pool abstraction built on-top of `cuprate-database`
`cuprate-txpool`	`storage/txpool/`	Transaction pool database built on-top of `cuprate-database` & `cuprate-database-service`

RPC

Crate	In-tree path	Purpose
`cuprate-json-rpc`	`rpc/json-rpc/`	JSON-RPC 2.0 implementation
`cuprate-rpc-types`	`rpc/types/`	Monero RPC types and traits
`cuprate-rpc-interface`	`rpc/interface/`	RPC interface & routing
`cuprate-rpc-handler`	`rpc/handler/`	RPC inner handlers

ZMQ

Crate	In-tree path	Purpose
`cuprate-zmq-types`	`zmq/types/`	Message types for ZMQ Pub/Sub interface

1-off crates

Crate	In-tree path	Purpose
`cuprate-constants`	`constants/`	Shared `const/static` data across Cuprate
`cuprate-cryptonight`	`cryptonight/`	CryptoNight hash functions
`cuprate-pruning`	`pruning/`	Monero pruning logic/types
`cuprate-helper`	`helper/`	Kitchen-sink helper crate for Cuprate
`cuprate-test-utils`	`test-utils/`	Testing utilities for Cuprate
`cuprate-types`	`types/`	Shared types across Cuprate

Benchmarks

Crate	In-tree path	Purpose
`cuprate-benchmark`	`benches/benchmark/bin/`	Cuprate benchmarking binary
`cuprate-benchmark-lib`	`benches/benchmark/lib/`	Cuprate benchmarking library
`cuprate-benchmark-*`	`benches/benchmark/cuprate-*`	Benchmark for a Cuprate crate that uses `cuprate-benchmark`
`cuprate-criterion-*`	`benches/criterion/cuprate-*`	Benchmark for a Cuprate crate that uses Criterion

Contributing

https://github.com/Cuprate/cuprate/blob/main/CONTRIBUTING.md

Build targets

x86
ARM64
Windows
Linux
macOS
FreeBSD(?)

Protocol book

https://monero-book.cuprate.org