Cuprate Architecture

WIP

Cuprate's architecture book.

Sections are notated with colors indicating how complete they are:

Color  Meaning
⚪️  Empty
🔴  Severely lacking information
🟠  Lacking some information
🟡  Almost ready
🟢  OK

Continue to the next chapter by clicking the right > button, or by selecting it on the left side.

All chapters are viewable by clicking the top-left ☰ button.

The entire book can be searched by clicking the top-left 🔍 button.

Last change: 2024-10-02, commit: a003e05

Foreword

Monero[1] is a large software project, coming in at 329k lines of C++, C, headers, and make files.[2] It is directly responsible for 2.6 billion dollars worth of value.[3] It has had over 400 contributors, more if counting unnamed contributions.[4] It has over 10,000 node operators and a large active userbase.[5]

The project wasn't always this big, but somewhere in the midst of contributors coming and going, various features being added, bugs being fixed, and celebrated cryptography being implemented - there was an aspect that was lost by the project that it could not easily gain again: maintainability.

Within large and complicated software projects, there is an important transfer of knowledge that must occur for long-term survival. Much like an organism that must eventually pass the torch onto the next generation, projects must do the same for future contributors.

However, newcomers often lack experience, past contributors might not be around, and current maintainers may be too busy. For whatever reason, this transfer of knowledge is not always smooth.

There is a solution to this problem: documentation.

The activity of writing the what, where, why, and how of the solutions to technical problems can be done in an author's lonesome.

The activity of reading these ideas can be done by future readers at any time without permission.

These readers may be new prospective contributors, it may be the current maintainers, it may be researchers, it may be users of various scale. Whoever it may be, documentation acts as the link between the past and present; a bottle of wisdom thrown into the river of time for future participants to open.

This book is the manifestation of this will, for Cuprate[6], an alternative Monero node. It documents Cuprate's implementation from head-to-toe such that in the case of a contributor's untimely disappearance, the project can continue.

People come and go, documentation is forever.

- hinto-janai


[2] git ls-files | grep "\.cpp$\|\.h$\|\.c$\|CMake" | xargs cat | wc -l on cc73fe7

[3] 2024-05-24: $143.55 USD * 18,151,608 XMR = $2,605,663,258

[4] git log --all --pretty="%an" | sort -u | wc -l on cc73fe7

[6] https://github.com/Cuprate/cuprate

Intro

Cuprate is an alternative Monero node implementation.

This book describes Cuprate's architecture, ranging from small things like database pruning to larger meta-components like the networking stack.

A brief overview of some aspects covered within this book:

  • Component designs
  • Implementation details
  • File location and purpose
  • Design decisions and tradeoffs
  • Things in relation to monerod
  • Dependency usage

Source code

The source files for this book can be found at: https://github.com/Cuprate/architecture-book.

Who this book is for

Maintainers

As mentioned in Foreword, the group of people that benefit from this book's value the most by far are the current and future Cuprate maintainers.

Cuprate's system design is documented in this book such that if you were ever to build it again from scratch, you would have an excellent guide on how to do such, and also where improvements could be made.

Practically, what that means for maintainers is that it acts as the reference. During maintenance, it is quite valuable to have a book that contains condensed knowledge on the behavior of components, or how certain code works, or why it was built a certain way.

Contributors

Contributors also have access to the inner-workings of Cuprate via this book, which helps when making larger contributions.

Design decisions and implementation details notated in this book help answer questions such as:

  • Why is it done this way?
  • Why can it not be done this way?
  • Were other methods attempted?

Cuprate's testing and benchmarking suites, unknown to new contributors, are also documented within this book.

Researchers

This book contains the why, where, and how of the implementation of formal research.

Although it is an informal specification, this book still acts as a more accessible overview of Cuprate compared to examining the codebase itself.

Operators & users

This book is not a practical guide for using Cuprate itself.

For configuration, data collection (also important for researchers), and other practical usage, see Cuprate's user book.

Observers

Anyone curious enough is free to learn the inner-workings of Cuprate via this book, and maybe even contribute someday.

Required knowledge

General

  • Rust
  • Monero
  • System design

Components

Storage

  • Embedded databases
  • LMDB
  • redb

RPC

  • axum
  • tower
  • async
  • JSON-RPC 2.0
  • Epee

Networking

  • tower
  • tokio
  • async
  • Levin

Instrumentation

  • tracing

How to use this book

Maintainers

Contributors

Researchers

⚪️ Bird's eye view

⚪️ Map

⚪️ Components

⚪️ Formats, protocols, types

⚪️ monero_serai

⚪️ cuprate_types

⚪️ cuprate_helper

⚪️ Epee

⚪️ Levin

Storage

This section covers all things related to the on-disk storage of data within Cuprate.

Overview

The quick overview is that Cuprate has a database abstraction crate that handles "low-level" database details such as key and value (de)serialization, tables, transactions, etc.

This database abstraction crate is then used by all crates that need on-disk storage, such as cuprate_blockchain and cuprate_txpool.

Service

The interface provided by all crates building on-top of the database abstraction is a tower::Service, i.e. database requests/responses are sent/received asynchronously.

As the interface details are similar across crates (threadpool, read operations, write operations), the interface itself is abstracted in the cuprate_database_service crate, which is then used by the crates.

Diagram

This is roughly how database crates are set up.

                                                           ┌─────────────────┐
┌──────────────────────────────────┐                       │                 │
│ Some crate that needs a database │  ┌────────────────┐   │                 │
│                                  │  │     Public     │   │                 │
│ ┌──────────────────────────────┐ │─►│ tower::Service │◄─►│ Rest of Cuprate │
│ │     Database abstraction     │ │  │      API       │   │                 │
│ └──────────────────────────────┘ │  └────────────────┘   │                 │
└──────────────────────────────────┘                       │                 │
                                                           └─────────────────┘

Database abstraction

cuprate_database is Cuprate's database abstraction.

This crate abstracts various database backends with traits.

All backends have the following attributes:

The currently implemented backends are heed (LMDB) and redb.

Said precisely, cuprate_database is the embedded database other Cuprate crates interact with instead of using any particular backend implementation. This allows the backend to be swapped and/or future backends to be implemented.

This section will go over cuprate_database details.

Abstraction

This next section details how cuprate_database abstracts multiple database backends into 1 API.

Diagram

A simple diagram describing the responsibilities/relationship of cuprate_database.

┌───────────────────────────────────────────────────────────────────────┐
│ cuprate_database                                                      │
│                                                                       │
│ ┌───────────────────────────┐     ┌─────────────────────────────────┐ │
│ │ Database traits           │     │ Backends                        │ │
│ │ ┌─────┐┌──────┐┌────────┐ │     │ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Env ││ TxRw ││ ...    │ ├─────┤ │ heed (LMDB) │ │ redb        │ │ │
│ │ └─────┘└──────┘└────────┘ │     │ └─────────────┘ └─────────────┘ │ │
│ └──────────┬─────────────┬──┘     └──┬──────────────────────────────┘ │
│            │             └─────┬─────┘                                │
│            │         ┌─────────┴──────────────┐                       │
│            │         │ Database types         │                       │
│            │         │ ┌─────────────┐┌─────┐ │                       │
│            │         │ │ ConcreteEnv ││ ... │ │                       │
│            │         │ └─────────────┘└─────┘ │                       │
│            │         └─────────┬──────────────┘                       │
│            │                   │                                      │
└────────────┼───────────────────┼──────────────────────────────────────┘
             │                   │
             └───────────────────┤
                                 │
                                 ▼
                     ┌───────────────────────┐
                     │ cuprate_database user │
                     └───────────────────────┘

Backend

First, we need an actual database implementation.

cuprate-database's traits allow abstracting over the actual database, such that any backend in particular could be used.

This page is an enumeration of all the backends Cuprate has, has tried, and may try in the future.

heed

The default database used is heed (LMDB). The upstream versions from crates.io are used. LMDB should not need to be installed as heed has a build script that pulls it in automatically.

heed's filenames inside Cuprate's data folder are:

Filename  Purpose
data.mdb  Main data file
lock.mdb  Database lock file

heed-specific notes:

redb

The 2nd database backend is the 100% Rust redb.

The upstream versions from crates.io are used.

redb's filenames inside Cuprate's data folder are:

Filename   Purpose
data.redb  Main data file

redb-memory

This backend is identical to redb, except that it uses redb::backend::InMemoryBackend, a database that resides entirely in memory instead of in a file.

All other details about this should be the same as the normal redb backend.

sanakirja

sanakirja was a candidate as a backend, however there were problems with maximum value sizes.

The default maximum value size is 1012 bytes which was too small for our requirements. Using sanakirja::Slice and sanakirja::UnsizedStorage was attempted, but there were bugs found when inserting a value in-between 512..=4096 bytes.

As such, it is not implemented.

MDBX

MDBX was a candidate as a backend; however, MDBX deprecated the custom key/value comparison functions, which makes it a bit trickier to implement multimap tables. It is also quite similar to the main backend LMDB (of which it was originally a fork).

As such, it is not implemented (yet).

ConcreteEnv

After a backend is selected, the main database environment struct is "abstracted" by putting it in the non-generic, concrete struct ConcreteEnv.

This is the main object used when handling the database directly.

This struct contains all the data necessary to operate the database. The actual database backend ConcreteEnv will use internally depends on which backend feature is used.

ConcreteEnv itself is not too important, what is important is that:

  1. It allows callers to not directly reference any particular backend environment
  2. It implements trait Env which opens the door to all the other database traits
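The feature-based selection described above can be sketched as follows. This is only an illustration of the general technique, not cuprate_database's actual code: HeedEnv, RedbEnv, Backend, and backend_name are hypothetical names, and the feature name "redb" is assumed for the example.

```rust
/// Hypothetical stand-ins for the backend environment types.
pub struct HeedEnv;
pub struct RedbEnv;

/// Hypothetical trait that lets callers ask which backend was compiled in.
pub trait Backend {
    const NAME: &'static str;
}

impl Backend for HeedEnv {
    const NAME: &'static str = "heed";
}

impl Backend for RedbEnv {
    const NAME: &'static str = "redb";
}

// Exactly one of these aliases is compiled, depending on the feature flag.
#[cfg(feature = "redb")]
pub type ConcreteEnv = RedbEnv;
#[cfg(not(feature = "redb"))]
pub type ConcreteEnv = HeedEnv;

/// Callers only ever name `ConcreteEnv`, never a specific backend type.
pub fn backend_name() -> &'static str {
    <ConcreteEnv as Backend>::NAME
}
```

With no feature enabled, backend_name() resolves to the heed stand-in; the calling code does not change when the feature does.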

The equivalent "database environment" objects in the backends themselves are:

Trait

cuprate_database provides a set of traits that abstract over the various database backends.

This allows the function signatures and behavior to stay the same but allows for swapping out databases in an easier fashion.

All common behavior of the backends is encapsulated here and used instead of using the backend directly.

Examples:

For example, instead of calling heed or redb's get() function directly, DatabaseRo::get() is called.
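As a rough, self-contained illustration of this idea, the sketch below puts two in-memory stand-in "backends" behind one trait. TableRo, HashBackend, TreeBackend, and lookup_or_zero are hypothetical names chosen for this example; the real traits in cuprate_database are richer than this.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical read-only table trait, loosely modeled on the idea of `DatabaseRo`.
pub trait TableRo {
    fn get(&self, key: &u64) -> Option<u64>;
}

/// Stand-in "backend" A, backed by a hash map.
pub struct HashBackend(pub HashMap<u64, u64>);

/// Stand-in "backend" B, backed by an ordered map.
pub struct TreeBackend(pub BTreeMap<u64, u64>);

impl TableRo for HashBackend {
    fn get(&self, key: &u64) -> Option<u64> {
        self.0.get(key).copied()
    }
}

impl TableRo for TreeBackend {
    fn get(&self, key: &u64) -> Option<u64> {
        self.0.get(key).copied()
    }
}

/// Generic code depends on the trait only, so either backend can be passed in.
pub fn lookup_or_zero<T: TableRo>(table: &T, key: u64) -> u64 {
    table.get(&key).unwrap_or(0)
}
```

The caller's code is identical for both stand-ins, which is the property the traits are meant to provide.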

Usage

With a ConcreteEnv and a particular backend selected, we can now start using it alongside these traits to start doing database operations in a generic manner.

An example:

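use cuprate_database::{
    ConcreteEnv,
    config::ConfigBuilder,
    Env, EnvInner,
    DatabaseRo, DatabaseRw, TxRo, TxRw,
};

// NOTE: `config` (the database configuration) and `Table` (the table
// definition) are assumed to be defined elsewhere, and this snippet is
// assumed to run inside a function that returns a `Result`.

// Initialize the database environment.
let env = ConcreteEnv::open(config)?;

// Open up a transaction + tables for writing.
let env_inner = env.env_inner();
let tx_rw = env_inner.tx_rw()?;
// Create the table before opening it.
env_inner.create_db::<Table>(&tx_rw)?;

// Write data to the table.
// The table handle is scoped so that it is dropped before the commit.
{
    let mut table = env_inner.open_db_rw::<Table>(&tx_rw)?;
    table.put(&0, &1)?;
}

// Commit the transaction.
TxRw::commit(tx_rw)?;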

As seen above, there is no direct call to heed or redb. Their functionality is abstracted behind ConcreteEnv and the traits.

Syncing

cuprate_database's database has 5 disk syncing modes.

  1. FastThenSafe
  2. Safe
  3. Async
  4. Threshold
  5. Fast

The default mode is Safe.

This means that upon each transaction commit, all the data that was written will be fully synced to disk. This is the slowest, but safest mode of operation.
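A minimal sketch of how such a set of modes with a Safe default could be modeled is shown below. The enum here is a hypothetical mirror of the list above, not the crate's actual type, and flush_on_commit encodes only the one behavior stated in this section (Safe syncs on every commit); treating every other mode as "no flush on commit" is a simplification made for the example.

```rust
/// Hypothetical mirror of the five modes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SyncMode {
    FastThenSafe,
    /// The default: fully sync to disk on every transaction commit.
    #[default]
    Safe,
    Async,
    Threshold,
    Fast,
}

/// Returns true if a commit must block until data is synced to disk.
/// Assumption for this sketch: only `Safe` does so on every commit.
pub fn flush_on_commit(mode: SyncMode) -> bool {
    matches!(mode, SyncMode::Safe)
}
```

Deriving Default with the #[default] attribute keeps the "safest unless told otherwise" choice in one place.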

Note that upon any database Drop, the current implementation will sync to disk regardless of any configuration.

For more information on the other modes, read the documentation here.

Resizing

cuprate_database itself does not handle memory map resizes automatically (for database backends that need resizing, i.e. heed/LMDB).

When a user is directly using cuprate_database, it is up to them how to resize. The database will return RuntimeError::ResizeNeeded when it needs resizing.

However, cuprate_database exposes some resizing algorithms that define how the database's memory map grows.
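The caller-side pattern this implies (attempt the write, resize on ResizeNeeded, retry) can be sketched with a toy in-memory model. ToyDb, its fields, and the doubling policy are all assumptions made for illustration; only the error name RuntimeError::ResizeNeeded comes from the text above, and it is redefined locally here rather than imported.

```rust
/// Local stand-in for the error case described above.
#[derive(Debug, PartialEq)]
pub enum RuntimeError {
    ResizeNeeded,
}

/// Toy database with a fixed-capacity "memory map" (hypothetical).
pub struct ToyDb {
    pub map_size: usize,
    pub used: usize,
}

impl ToyDb {
    /// Fails with `ResizeNeeded` when the write would not fit in the map.
    pub fn write(&mut self, len: usize) -> Result<(), RuntimeError> {
        if self.used + len > self.map_size {
            return Err(RuntimeError::ResizeNeeded);
        }
        self.used += len;
        Ok(())
    }

    /// Grows the map; the growth policy is decided by the caller.
    pub fn resize(&mut self, new_size: usize) {
        self.map_size = new_size;
    }
}

/// Caller-side policy: on `ResizeNeeded`, double the map and retry the write.
pub fn write_with_resize(db: &mut ToyDb, len: usize) -> Result<(), RuntimeError> {
    loop {
        match db.write(len) {
            Err(RuntimeError::ResizeNeeded) => {
                let new_size = db.map_size.max(1) * 2;
                db.resize(new_size);
            }
            other => return other,
        }
    }
}
```

The point of the sketch is the division of responsibility: the database only reports that it is full, and the layer above decides how much to grow.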

(De)serialization

All types stored inside the database are either bytes already or are perfectly bitcast-able.

As such, they do not incur heavy (de)serialization costs when storing/fetching them from the database. The main (de)serialization used is bytemuck's traits and casting functions.

Size and layout

The size & layout of types is stable across compiler versions, as they are set and determined with #[repr(C)] and bytemuck's derive macros such as bytemuck::Pod.

Note that the data stored in the tables are still type-safe; we still refer to the key and values within our tables by the type.

How

The main deserialization trait for database storage is Storable.

When a type is cast into bytes, the reference is cast, i.e. this is zero-cost serialization.

However, it is worth noting that when bytes are cast into the type, they are copied. This is due to byte alignment guarantee issues with both backends, see:

Without this, bytemuck will panic with TargetAlignmentGreaterAndInputNotAligned when casting.
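The alignment problem and the copying workaround can be shown with the standard library alone. The sketch below is not how cuprate_database or bytemuck implement it; is_aligned_for_u64 and read_u64_copy are hypothetical helpers that only demonstrate why a byte slice handed back by a backend may not be usable as a `&u64`, and why copying sidesteps the issue.

```rust
use std::mem::{align_of, size_of};

/// Returns true if `bytes` starts at an address suitably aligned for `u64`.
pub fn is_aligned_for_u64(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<u64>() == 0
}

/// Reads a native-endian `u64` by copying the bytes into an owned array.
/// This works for any alignment, at the cost of the copy.
pub fn read_u64_copy(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != size_of::<u64>() {
        return None;
    }
    let mut array = [0u8; size_of::<u64>()];
    array.copy_from_slice(bytes);
    Some(u64::from_ne_bytes(array))
}
```

A slice starting one byte into a buffer can never be aligned at the same time as the buffer's start, yet read_u64_copy handles both cases identically.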

Copying the bytes fixes this problem, although it is more costly than necessary. However, in the main use-case for cuprate_database (tower::Service API) the bytes would need to be owned regardless as the Request/Response API uses owned data types (T, Vec<T>, HashMap<K, V>, etc).

Practically speaking, this means lower-level database functions that normally look like such:

fn get(key: &Key) -> &Value;

end up looking like this in cuprate_database:

fn get(key: &Key) -> Value;

Since each backend has its own (de)serialization methods, our types are wrapped in compatibility types that map our Storable functions into whatever is required for the backend, e.g:

Compatibility structs also exist for any Storable containers:

Again, it's unfortunate that these must be owned, although in the tower::Service use-case, they would have to be owned anyway.

Known issues and tradeoffs

cuprate_database takes many tradeoffs, whether due to:

  • Prioritizing certain values over others
  • Not having a better solution
  • Being "good enough"

This section is a list of the larger ones, along with issues that don't have answers yet.

Traits abstracting backends

Although all database backends used are very similar, they have some crucial differences in small implementation details that must be worked around when conforming them to cuprate_database's traits.

Put simply: using cuprate_database's traits is less efficient and more awkward than using the backend directly.

For example:

This is a tradeoff that cuprate_database takes, as:

  • The backend itself is usually not the source of bottlenecks in the greater system, as such, small inefficiencies are OK
  • None of the lost functionality is crucial for operation
  • The ability to use, test, and swap between multiple database backends is worth it

Hot-swappable backends

See also: https://github.com/Cuprate/cuprate/issues/209.

Using a different backend is really as simple as re-building cuprate_database with a different feature flag:

# Use LMDB.
cargo build --package cuprate-database --features heed

# Use redb.
cargo build --package cuprate-database --features redb

This is "good enough" for now, however ideally, this hot-swapping of backends would be able to be done at runtime.

As it is now, cuprate_database cannot compile both backends and swap based on user input at runtime; it must be compiled with a certain backend, which will produce a binary with only that backend.

This also means things like CI testing multiple backends is awkward, as we must re-compile with different feature flags instead.

Copying unaligned bytes

As mentioned in (De)serialization, bytes are copied when they are turned into a type T due to unaligned bytes being returned from database backends.

Using a regular reference cast results in an improperly aligned type T; such a type even existing causes undefined behavior. In our case, bytemuck saves us by panicking before this occurs.

Thus, when using cuprate_database's database traits, an owned T is returned.

This is doubly unfortunate for &[u8] as this does not even need deserialization.

For example, StorableVec could have been this:

enum StorableBytes<'a, T: Storable> {
    Owned(T),
    Ref(&'a T),
}

but this would require supporting types that must be copied regardless alongside the occasional &[u8] that can be returned without casting. This was hard to do in a generic way, thus all [u8]'s are copied and returned as owned StorableVecs.

This is a tradeoff cuprate_database takes as:

  • bytemuck::pod_read_unaligned is cheap enough
  • The main API, service, needs to return owned values anyway
  • Having no references removes a lot of lifetime complexity

The alternative is somehow fixing the alignment issues in the backends mentioned previously.

Endianness

cuprate_database's (de)serialization and storage of bytes are native-endian, as in, byte storage order will depend on the machine it is running on.

As Cuprate's build-targets are all little-endian (big-endian by default machines barely exist), this doesn't matter much and the byte ordering can be seen as a constant.

Practically, this means cuprated's database files can be transferred across computers, as can monerod's.
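To make "native-endian" concrete, here is a small standard-library sketch. encode_native and target_is_little_endian are hypothetical helper names; the only claim is the standard behavior of to_ne_bytes, which matches to_le_bytes on little-endian targets.

```rust
/// Encodes a value the way a native-endian store would.
pub fn encode_native(value: u32) -> [u8; 4] {
    value.to_ne_bytes()
}

/// True when the native byte order of this build target is little-endian.
pub fn target_is_little_endian() -> bool {
    cfg!(target_endian = "little")
}
```

On a little-endian target, encode_native(0x11223344) yields the bytes 0x44, 0x33, 0x22, 0x11, so files written on one such machine decode to the same values on another.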

Multimap

cuprate_database does not currently have an abstraction for multimap tables.

All tables are single maps of keys to values.

This matters as this means some of cuprate_blockchain's tables differ from monerod's tables - the primary key is stored for all entries, compared to monerod only needing to store it once:

// `monerod` only stores `amount: 1` once,
// `cuprated` stores it each time it appears.
struct PreRctOutputId { amount: 1, amount_index: 0 }
struct PreRctOutputId { amount: 1, amount_index: 1 }

This means cuprated's database will be slightly larger than monerod's.
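The size difference can be illustrated with two in-memory layouts. This is a toy model using standard collections, not either node's actual table format; FlatTable, MultimapTable, and the counting helpers are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Flat table: the full composite key (amount, amount_index) is stored per entry.
pub type FlatTable = BTreeMap<(u64, u64), u64>;

/// Multimap-style table: each `amount` is stored once, with many sub-entries.
pub type MultimapTable = BTreeMap<u64, BTreeMap<u64, u64>>;

/// Number of times an `amount` is physically stored as part of a key (flat layout).
pub fn stored_amounts_flat(table: &FlatTable) -> usize {
    table.len()
}

/// Number of times an `amount` is physically stored as a key (multimap layout).
pub fn stored_amounts_multimap(table: &MultimapTable) -> usize {
    table.len()
}
```

For the two example entries above, the flat layout stores `amount: 1` twice while the multimap layout stores it once.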

The current method cuprate_blockchain uses will be "good enough" as the multimap keys needed for now are fixed, e.g. pre-RCT outputs are no longer being produced.

This may need to change in the future when multimap is all but required, e.g. for FCMP++.

Until then, multimap tables are not implemented as they are tricky to implement across all backends.

Common behavior

The crates that build on-top of the database abstraction (cuprate_database) share some common behavior including but not limited to:

  • Defining their specific database tables and types
  • Having an ops module
  • Exposing a tower::Service API (backed by a threadpool) for public usage

This section provides more details on these behaviors.

Types

POD types

Since all types in the database are POD types, we must often provide mappings between outside types and the types actually stored in the database.

A common case is mapping infallible types to and from bitflags and/or their raw integer representation. For example, the OutputFlag type or bool types.

As types like enums, bools and chars cannot be cast from an integer infallibly, bytemuck::Pod cannot be implemented on them safely. Thus, we store some infallible version of them inside the database with a custom type and map them when fetching the data.
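A minimal sketch of such a mapping, using only the standard library: Relay is a hypothetical enum invented for this example (it is not one of Cuprate's types), and the raw u8 stands in for the value actually stored.

```rust
/// Hypothetical enum used outside the database.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Relay {
    Local,
    Network,
}

/// What gets stored: a plain integer, which is valid for every bit pattern.
pub type RawRelay = u8;

/// Enum to raw integer: always succeeds.
pub fn relay_to_raw(relay: Relay) -> RawRelay {
    match relay {
        Relay::Local => 0,
        Relay::Network => 1,
    }
}

/// Raw integer to enum: may fail, so the check lives outside the stored type.
pub fn relay_from_raw(raw: RawRelay) -> Option<Relay> {
    match raw {
        0 => Some(Relay::Local),
        1 => Some(Relay::Network),
        _ => None,
    }
}

/// The same idea for `bool`, stored as a `u8`.
pub fn bool_to_raw(value: bool) -> u8 {
    u8::from(value)
}

/// Any non-zero byte is treated as `true` in this sketch.
pub fn bool_from_raw(raw: u8) -> bool {
    raw != 0
}
```

The stored representation never needs validation on read at the byte level; validation happens once, in the mapping function.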

Lean types

Another reason why database crates define their own types is to cut any unneeded data from the type.

Many of the types used in normal operation (e.g. cuprate_types::VerifiedBlockInformation) contain lots of extra pre-processed data for convenience.

This would be a waste to store in the database, so in this example, the much leaner "raw" BlockInfo type is stored.
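As a rough illustration of trimming a convenience type down to what is stored, consider the sketch below. FullBlock and LeanBlockInfo are hypothetical types with invented fields; they do not reflect the actual fields of VerifiedBlockInformation or BlockInfo.

```rust
/// Hypothetical "working" type carrying pre-processed data for convenience.
pub struct FullBlock {
    pub height: u64,
    pub timestamp: u64,
    pub hash: [u8; 32],
    /// Serialized form, kept around for convenience only.
    pub serialized: Vec<u8>,
    /// Pre-computed transaction hashes, also derivable elsewhere.
    pub tx_hashes: Vec<[u8; 32]>,
}

/// Hypothetical lean, fixed-size type that would actually be stored.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
pub struct LeanBlockInfo {
    pub timestamp: u64,
    pub hash: [u8; 32],
}

impl From<&FullBlock> for LeanBlockInfo {
    /// Keeps only the fields the table needs and drops the rest.
    fn from(block: &FullBlock) -> Self {
        LeanBlockInfo {
            timestamp: block.timestamp,
            hash: block.hash,
        }
    }
}
```

The lean type has a fixed size regardless of how much convenience data the full type carries.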

ops

Both cuprate_blockchain and cuprate_txpool expose an ops module containing abstracted Monero-related database operations.

For example, cuprate_blockchain::ops::block::add_block.

These functions build on-top of the database traits and allow for more abstracted database operations.

For example, instead of these signatures:

fn get(_: &Key) -> Value;
fn put(_: &Key, _: &Value);

the ops module provides much higher-level signatures like such:

fn add_block(block: &Block) -> Result<_, _>;

Although these functions are exposed, they are not the main API; that is covered in the next section: the tower::Service (which uses these functions).
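To show the layering in miniature, the sketch below builds one higher-level operation on top of two key/value maps. Block, Tables, and this add_block are simplified, hypothetical stand-ins; the real cuprate_blockchain::ops::block::add_block works on many more tables and uses the database traits rather than HashMap.

```rust
use std::collections::HashMap;

/// Toy block type (hypothetical fields).
pub struct Block {
    pub height: u64,
    pub hash: [u8; 32],
}

/// Two toy "tables", each a plain key/value map.
#[derive(Default)]
pub struct Tables {
    pub hash_by_height: HashMap<u64, [u8; 32]>,
    pub height_by_hash: HashMap<[u8; 32], u64>,
}

/// Higher-level operation: one call keeps every related table consistent.
pub fn add_block(tables: &mut Tables, block: &Block) -> Result<(), String> {
    if tables.hash_by_height.contains_key(&block.height) {
        return Err(format!("height {} already exists", block.height));
    }
    tables.hash_by_height.insert(block.height, block.hash);
    tables.height_by_hash.insert(block.hash, block.height);
    Ok(())
}
```

Callers state their intent ("add this block") and never have to remember which individual tables must be updated together.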

tower::Service

Both cuprate_blockchain and cuprate_txpool provide async tower::Services that define database requests/responses.

These are the main API that other Cuprate crates use.

There are 2 tower::Services:

  1. A read service which is backed by a rayon::ThreadPool
  2. A write service which spawns a single thread to handle write requests

As this behavior is the same across all users of cuprate_database, it is extracted into its own crate: cuprate_database_service.
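The single-writer half of this model can be sketched with standard threads and channels. This is a simplified illustration, not cuprate_database_service's implementation: it uses std::sync::mpsc instead of tower and async channels, omits the rayon read pool entirely, and Db, WriteRequest, spawn_writer, and read are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, RwLock};
use std::thread;

/// Toy shared "database".
pub type Db = Arc<RwLock<HashMap<u64, u64>>>;

/// A write request paired with the channel used to send back its response.
pub struct WriteRequest {
    pub key: u64,
    pub value: u64,
    pub respond_to: Sender<()>,
}

/// Spawns the single writer thread and returns the handle used to queue writes.
pub fn spawn_writer(db: Db) -> Sender<WriteRequest> {
    let (tx, rx): (Sender<WriteRequest>, Receiver<WriteRequest>) = channel();
    thread::spawn(move || {
        // Requests are handled one at a time, in arrival order.
        for request in rx {
            db.write().unwrap().insert(request.key, request.value);
            // The requester may have gone away; ignore a failed send.
            let _ = request.respond_to.send(());
        }
    });
    tx
}

/// Reads can run on any thread; here the caller's thread stands in for a pool worker.
pub fn read(db: &Db, key: u64) -> Option<u64> {
    db.read().unwrap().get(&key).copied()
}
```

Funnelling all writes through one thread serializes them without callers needing to coordinate, while reads remain free to run concurrently.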

Diagram

As a recap, here is how this looks to a user of a higher-level database crate, cuprate_blockchain in this example. Starting from the lowest layer:

  1. cuprate_database is used to abstract the database
  2. cuprate_blockchain builds on-top of that with tables, types, operations
  3. cuprate_blockchain exposes a tower::Service using cuprate_database_service
  4. The user now interfaces with cuprate_blockchain with that tower::Service in a request/response fashion

                         ┌──────────────────┐
                         │ cuprate_database │
                         └────────┬─────────┘
┌─────────────────────────────────┴─────────────────────────────────┐
│ cuprate_blockchain                                                │
│                                                                   │
│ ┌──────────────────────┐  ┌─────────────────────────────────────┐ │
│ │ Tables, types        │  │ ops                                 │ │
│ │ ┌───────────┐┌─────┐ │  │ ┌─────────────┐ ┌──────────┐┌─────┐ │ │
│ │ │ BlockInfo ││ ... │ ├──┤ │ add_block() │ │ add_tx() ││ ... │ │ │
│ │ └───────────┘└─────┘ │  │ └─────────────┘ └──────────┘└─────┘ │ │
│ └──────────────────────┘  └─────┬───────────────────────────────┘ │
│                                 │                                 │
│                       ┌─────────┴───────────────────────────────┐ │
│                       │ tower::Service                          │ │
│                       │ ┌──────────────────────────────┐┌─────┐ │ │
│                       │ │ Blockchain{Read,Write}Handle ││ ... │ │ │
│                       │ └──────────────────────────────┘└─────┘ │ │
│                       └─────────┬───────────────────────────────┘ │
│                                 │                                 │
└─────────────────────────────────┼─────────────────────────────────┘
                                  │
                            ┌─────┴─────┐
       ┌────────────────────┴────┐ ┌────┴──────────────────────────────────┐
       │ Database requests       │ │ Database responses                    │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ FindBlock([u8; 32]) │ │ │ │ FindBlock(Option<(Chain, usize)>) │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ ChainHeight         │ │ │ │ ChainHeight(usize, [u8; 32])      │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ ...                 │ │ │ │ ...                               │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       └─────────────────────────┘ └───────────────────────────────────────┘
                            ▲          │
                            │          ▼
                     ┌─────────────────────────┐
                     │ cuprate_blockchain user │
                     └─────────────────────────┘

Initialization

A database service is started simply by calling: init().

This function initializes the database, spawns threads, and returns a:

  • Read handle to the database
  • Write handle to the database
  • The database itself

These handles implement the tower::Service trait, which allows sending requests and receiving responses asynchronously.

Requests

Along with the 2 handles, there are 2 types of requests: BlockchainReadRequest and BlockchainWriteRequest.

  • Read requests retrieve data from the database
  • Write requests write data to the database

Responses

After sending a request using the read/write handle, the value returned is not the response itself, but an asynchronous channel that will eventually return the response:

// Send a request.
//                                   tower::Service::call()
//                                          V
let response_channel: Channel = read_handle.call(BlockchainReadRequest::ChainHeight)?;

// Await the response.
let response: BlockchainResponse = response_channel.await?;

After awaiting the returned channel, a Response will eventually be returned when the Service threadpool has fetched the value from the database and sent it off.

The variants of both read and write requests match the Response variants by name, i.e.

  • BlockchainReadRequest::ChainHeight leads to BlockchainResponse::ChainHeight
  • BlockchainWriteRequest::WriteBlock leads to BlockchainResponse::WriteBlockOk
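
The request/response flow described in this section can be modeled roughly with only the standard library. This is a sketch, not Cuprate's implementation: `ReadRequest`, `Response`, `ReadHandle`, and `spawn_reader` are hypothetical names, a plain `std::sync::mpsc` channel stands in for the asynchronous channel, and the chain height is a placeholder value instead of a database read.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical request/response enums whose variant names match up.
#[derive(Debug, PartialEq)]
enum ReadRequest {
    ChainHeight,
}

#[derive(Debug, PartialEq)]
enum Response {
    ChainHeight(usize, [u8; 32]),
}

// A request travels together with the sender half of its response channel.
type Envelope = (ReadRequest, Sender<Response>);

// Hypothetical read handle: it only knows how to hand requests to the worker.
struct ReadHandle {
    requests: Sender<Envelope>,
}

impl ReadHandle {
    // Returns a channel immediately; the response arrives on it later.
    fn call(&self, request: ReadRequest) -> Receiver<Response> {
        let (response_tx, response_rx) = channel();
        self.requests
            .send((request, response_tx))
            .expect("worker thread has exited");
        response_rx
    }
}

// Spawns the worker that stands in for the database threadpool.
fn spawn_reader() -> ReadHandle {
    let (requests, inbox) = channel::<Envelope>();
    thread::spawn(move || {
        for (request, response_tx) in inbox {
            let response = match request {
                // Placeholder values instead of a real database read.
                ReadRequest::ChainHeight => Response::ChainHeight(100, [0; 32]),
            };
            // The caller may have dropped its receiver; ignore that case.
            let _ = response_tx.send(response);
        }
    });
    ReadHandle { requests }
}
```

Calling `spawn_reader().call(ReadRequest::ChainHeight)` returns the receiving end right away, and `recv()` on it blocks until the worker has answered, which mirrors the "channel first, response later" shape described above.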

Resizing

As noted in the cuprate_database resizing section, builders on-top of cuprate_database are responsible for resizing the database.

In cuprate_{blockchain,txpool}'s case, that means the tower::Service must know how to resize. This logic is shared between both crates, defined in cuprate_database_service: https://github.com/Cuprate/cuprate/blob/0941f68efcd7dfe66124ad0c1934277f47da9090/storage/service/src/service/write.rs#L107-L171.

By default, this uses an algorithm similar to monerod's.

Other resizing algorithms exist that define how the database's memory map grows, but monerod's behavior is currently followed closely (for no particular reason).
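
As a rough illustration of what a memory map resize algorithm looks like, here is a minimal sketch that grows the map by a fixed amount and rounds up to a page boundary. The 1 GiB increment, the 4096-byte page size, and the name `next_map_size` are assumptions made for this example; they are not taken from this section, and the linked source is the reference for the real logic.

```rust
// Assumed values for this sketch; real code would query the OS page size.
const PAGE_SIZE: usize = 4096;
const ADD_SIZE: usize = 1 << 30; // 1 GiB

// Grows the memory map by a fixed amount, rounded up to a page boundary.
fn next_map_size(current_size_bytes: usize) -> usize {
    let new_size = current_size_bytes + ADD_SIZE;
    let remainder = new_size % PAGE_SIZE;
    if remainder == 0 {
        new_size
    } else {
        // Pad up to the next multiple of the page size.
        new_size + (PAGE_SIZE - remainder)
    }
}
```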


Thread model

The base database abstractions themselves are not concerned with parallelism; they are mostly functions to be called from a single thread.

However, the cuprate_database_service API does have a thread model backing it.

When a Service's init() function is called, threads are spawned and maintained until the user drops (disconnects) the returned handles.

By default, 1 writer thread is spawned, along with as many reader threads as there are system threads.

For example, on a system with 32 threads, cuprate_database_service will spawn:

  • 1 writer thread
  • 32 reader threads

whose sole responsibility is to listen for database requests, access the database (potentially in parallel), and return a response.

Note that the 1 system thread = 1 reader thread model is only the default setting; the reader thread count can be configured by the user to be any number between 1 .. amount_of_system_threads.
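
The reader thread count rule can be sketched as a small function. `reader_thread_count` is a hypothetical helper, and clamping out-of-range values is an assumption of this sketch; the text above only states the allowed range.

```rust
// Hypothetical helper: resolves how many reader threads to spawn.
// `configured` is the user's setting (`None` means "use the default").
fn reader_thread_count(configured: Option<usize>, system_threads: usize) -> usize {
    // Guard against a reported system thread count of 0.
    let system_threads = system_threads.max(1);
    match configured {
        // Default: 1 reader thread per system thread.
        None => system_threads,
        // Assumption: out-of-range values are clamped into 1 ..= system_threads.
        Some(n) => n.clamp(1, system_threads),
    }
}
```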

The reader threads are managed by rayon.

For an example of where multiple reader threads are used: given a request that asks if any key-image within a set already exists, cuprate_blockchain will split that work between the threads with rayon.
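
To make the key image example concrete, here is a minimal sketch of splitting such a check across threads. It uses `std::thread::scope` instead of rayon so that it stays self-contained, a `HashSet` stands in for the database table, and `any_key_image_exists` is a hypothetical name.

```rust
use std::collections::HashSet;
use std::thread;

type KeyImage = [u8; 32];

// Returns true if any of `wanted` is already in `spent`.
// The work is split into roughly `threads` chunks that are checked in parallel.
fn any_key_image_exists(spent: &HashSet<KeyImage>, wanted: &[KeyImage], threads: usize) -> bool {
    if wanted.is_empty() {
        return false;
    }
    let threads = threads.max(1);
    // Ceiling division so that every key image lands in some chunk.
    let chunk_len = (wanted.len() + threads - 1) / threads;
    thread::scope(|scope| {
        let workers: Vec<_> = wanted
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().any(|ki| spent.contains(ki))))
            .collect();
        // Threads that are not joined here are joined when the scope ends.
        workers
            .into_iter()
            .any(|worker| worker.join().expect("reader thread panicked"))
    })
}
```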


Shutdown

Once the read/write handles to the tower::Service are dropped, the backing thread(pool) will gracefully exit, automatically.

Note that the writer thread and the reader threadpool are not connected in any way: dropping the write handle makes the writer thread exit, but the read handle can still be held and read from, and vice versa for the write handle.
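
The "exit when the handle is dropped" behavior can be sketched with a standard library channel: the worker loop ends by itself once every sender has been dropped. `spawn_writer` is a hypothetical name, and the `JoinHandle` is returned only so that the exit can be observed in this example.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Spawns a hypothetical writer thread. The `Sender` plays the role of the
// write handle; the thread returns how many requests it processed.
fn spawn_writer() -> (Sender<u64>, JoinHandle<u64>) {
    let (write_handle, inbox) = channel::<u64>();
    let worker = thread::spawn(move || {
        let mut written = 0;
        // This loop ends once every sender (handle) has been dropped.
        for _block in inbox {
            written += 1;
        }
        written
    });
    (write_handle, worker)
}
```

After two `send` calls followed by `drop(write_handle)`, joining the worker returns without any explicit shutdown message having been sent.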


Blockchain

This section contains storage information specific to cuprate_blockchain, the database built on-top of cuprate_database that stores the blockchain.


Schema

This section contains the schema of cuprate_blockchain's database tables.


Tables

See also: https://doc.cuprate.org/cuprate_blockchain/tables & https://doc.cuprate.org/cuprate_blockchain/types.

The CamelCase names of the table headers documented here (e.g. TxIds) are the actual type name of the table within cuprate_blockchain.

Names written in code formatting are real types defined and usable within cuprate_blockchain. Standard types like u64 and type aliases (TxId) are written normally.

Within cuprate_blockchain::tables, the below table is essentially defined as-is with a macro.

Many of the stored data types are identical in representation but differ semantically, so a map of the aliases used and their real data types is also provided below.

| Alias | Real Type |
|-------|-----------|
| BlockHeight, Amount, AmountIndex, TxId, UnlockTime | u64 |
| BlockHash, KeyImage, TxHash, PrunableHash | [u8; 32] |

| Table | Key | Value | Description |
|-------|-----|-------|-------------|
| BlockHeaderBlobs | BlockHeight | StorableVec<u8> | Maps a block's height to a serialized byte form of its header |
| BlockTxsHashes | BlockHeight | StorableVec<[u8; 32]> | Maps a block's height to the block's transaction hashes |
| BlockHeights | BlockHash | BlockHeight | Maps a block's hash to its height |
| BlockInfos | BlockHeight | BlockInfo | Contains metadata of all blocks |
| KeyImages | KeyImage | () | This table is a set with no value, it stores transaction key images |
| NumOutputs | Amount | u64 | Maps an output's amount to the number of outputs with that amount |
| Outputs | PreRctOutputId | Output | This table contains legacy CryptoNote outputs which have clear amounts. This table will not contain an output with 0 amount. |
| PrunedTxBlobs | TxId | StorableVec<u8> | Contains pruned transaction blobs (even if the database is not pruned) |
| PrunableTxBlobs | TxId | StorableVec<u8> | Contains the prunable part of a transaction |
| PrunableHashes | TxId | PrunableHash | Contains the hash of the prunable part of a transaction |
| RctOutputs | AmountIndex | RctOutput | Contains RingCT outputs mapped from their global RCT index |
| TxBlobs | TxId | StorableVec<u8> | Serialized transaction blobs (bytes) |
| TxIds | TxHash | TxId | Maps a transaction's hash to its index/ID |
| TxHeights | TxId | BlockHeight | Maps a transaction's ID to the height of the block it comes from |
| TxOutputs | TxId | StorableVec<u64> | Gives the amount indices of a transaction's outputs |
| TxUnlockTime | TxId | UnlockTime | Stores the unlock time of a transaction (only if it has a non-zero lock time) |

Multimap tables

Outputs

When referencing outputs, Monero will use the amount and the amount index. This means 2 keys are needed to reach an output.

With LMDB you can set the DUP_SORT flag on a table and then set the key/value to:

Key = KEY_PART_1

Value = {
    KEY_PART_2,
    VALUE // The actual value we are storing.
}

Then you can set a custom value sorting function that only takes KEY_PART_2 into account; this is how monerod does it.

This requires that the underlying database supports:

  • multimap tables
  • custom sort functions on values
  • setting a cursor on a specific key/value

How cuprate_blockchain does it

Another way to implement this is as follows:

Key = { KEY_PART_1, KEY_PART_2 }

Value = VALUE

Then the key type is simply used to look up the value; this is how cuprate_blockchain does it as cuprate_database does not have a multimap abstraction (yet).

For example, the key/value pair for outputs is:

PreRctOutputId => Output

where PreRctOutputId looks like this:

struct PreRctOutputId {
    amount: u64,
    amount_index: u64,
}
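
To show why a composite key removes the need for a multimap, here is a minimal sketch in which a `BTreeMap` stands in for the database table. The `Output` struct is a simplified placeholder, and `get_output` and `count_with_amount` are hypothetical helpers, not cuprate_blockchain functions.

```rust
use std::collections::BTreeMap;

// Composite key: both key parts live in the key itself.
// The derived ordering compares `amount` first, then `amount_index`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct PreRctOutputId {
    amount: u64,
    amount_index: u64,
}

// Hypothetical, simplified stand-in for the stored value.
#[derive(Clone, Debug, PartialEq)]
struct Output {
    height: u64,
}

// An ordered map stands in for the database table.
type Outputs = BTreeMap<PreRctOutputId, Output>;

// A lookup needs only the composite key, no multimap support.
fn get_output(table: &Outputs, amount: u64, amount_index: u64) -> Option<&Output> {
    table.get(&PreRctOutputId { amount, amount_index })
}

// Counts the outputs with a given amount via a range scan over the key order.
fn count_with_amount(table: &Outputs, amount: u64) -> usize {
    let start = PreRctOutputId { amount, amount_index: 0 };
    let end = PreRctOutputId { amount, amount_index: u64::MAX };
    table.range(start..=end).count()
}
```

Because the key ordering groups all entries of one amount together, queries that a multimap would answer per key part can still be answered with a range scan.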

⚪️ Transaction pool

⚪️ Pruning

RPC

monerod's daemon RPC has three kinds of RPC calls:

  1. JSON-RPC 2.0 methods, called at the /json_rpc endpoint
  2. JSON (but not JSON-RPC 2.0) methods called at their own endpoints, e.g. /get_height
  3. Binary (epee) RPC methods called at their own endpoints ending in .bin, e.g. /get_blocks.bin

Cuprate's RPC aims to mirror monerod's as much as it practically can.

This includes, but is not limited to:

  • Using the same endpoints
  • Receiving the same request data
  • Sending the same response data
  • Responding with the same HTTP status codes
  • Following internal behavior (e.g. /pop_blocks)

Not all monerod behavior can be followed, however.

Some behavior is not followed on purpose, some cannot be followed due to technical limitations, and some is monerod-specific, such as the /set_log_categories endpoint, which uses monerod's logging categories.

Both subtle and large differences between Cuprate's RPC and monerod's RPC are documented in the Differences with monerod section.

Main RPC components

The main components that make up Cuprate's RPC are noted below, alongside the equivalent monerod code and other notes.

| Cuprate crate | monerod (rough) equivalent | Purpose | Notes |
|---------------|----------------------------|---------|-------|
| cuprate-json-rpc | jsonrpc_structs.h, http_server_handlers_map2.h | JSON-RPC 2.0 implementation | monerod's JSON-RPC 2.0 handling is spread across a few files. The first defines some data structures, the second contains macros that (essentially) implement JSON-RPC 2.0. |
| cuprate-rpc-types | core_rpc_server_commands_defs.h | RPC request/response type definitions and (de)serialization | |
| cuprate-rpc-interface | core_rpc_server.h | RPC interface, routing, endpoints | |
| cuprate-rpc-handler | core_rpc_server.cpp | RPC request/response handling | These are the "inner handler" functions that turn requests into responses |

JSON-RPC 2.0

Cuprate has a standalone crate that implements the JSON-RPC 2.0 specification, cuprate-json-rpc. The RPC methods at the /json_rpc endpoint use this crate's types, functions, and (de)serialization.

There is nothing too special about Cuprate's implementation. Any small notes and differences are noted in the crate documentation.

As such, there is not much to document here. Instead, consider reading the very brief JSON-RPC 2.0 specification and the cuprate-json-rpc crate documentation.

TODO: document method/params vs flattened base when figured out.


The types

Cuprate has a crate that defines all the types related to RPC: cuprate-rpc-types.

The main purpose of this crate is to port the types used in monerod's RPC and to re-implement (de)serialization for those types, whether that be JSON, epee, or a custom mix.

The vast majority of these types are request & response types, i.e. the inputs Cuprate's RPC expects from users and the outputs it responds with.

Example

To showcase an example of the kinds of types defined in this crate, here is a request type:

#[serde(transparent)]
#[repr(transparent)]
struct OnGetBlockHashRequest {
    block_height: [u64; 1],
}

This is the input (params) expected in the on_get_block_hash method.

As seen above, the type itself encodes some properties, such as being (de)serialized transparently, and the input being an array of length 1 rather than a single u64. This is to match the behavior of monerod.

An example JSON form of this type would be:

{
  "jsonrpc": "2.0",
  "id": "0",
  "method": "on_get_block_hash",
  "params": [912345] // <- This can (de)serialize as a `OnGetBlockHashRequest`
}

Misc types

Other than the main request/response types, this crate is also responsible for any miscellaneous types used within monerod's RPC.

For example, the status field within many RPC responses is defined within cuprate-rpc-types.

Types that aren't requests/responses but exist within request/response types are also defined in this crate, such as the Distribution structure returned from the get_output_distribution method.


Base RPC types

There exist a few "base" types that many types are built on top of in monerod. These are also implemented in cuprate-rpc-types.

For example, many responses include these 2 fields:

{
  "status": "OK",
  "untrusted": false
}

This is rpc_response_base in monerod, and ResponseBase in Cuprate.

These types are flattened into other types, i.e. the fields from these base types are injected into the given type. For example, get_block_count's response type is defined like this in Cuprate:

struct GetBlockCountResponse {
    // The fields of this `base` type are directly
    // injected into `GetBlockCountResponse` during
    // (de)serialization.
    //
    // I.e. it is as if this `base` field were actually these 2 fields:
    // status: Status,
    // untrusted: bool,
    base: ResponseBase,
    count: u64,
}

The JSON output of this type would look something like:

{
  "status": "OK",
  "untrusted": false,
  "count": 993163
}
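
To make the flattening concrete, here is a minimal std-only sketch that writes the JSON by hand. The real types rely on serde for this; `to_json` is a hypothetical method, `status` is simplified to a plain string, and no string escaping is done.

```rust
// Simplified stand-ins for the real types; `status` is a plain string here.
struct ResponseBase {
    status: String,
    untrusted: bool,
}

struct GetBlockCountResponse {
    base: ResponseBase,
    count: u64,
}

impl GetBlockCountResponse {
    // Writes the `base` fields at the top level, next to `count`,
    // instead of nesting them under a "base" key.
    // Note: `status` is not escaped, which is fine for this sketch only.
    fn to_json(&self) -> String {
        format!(
            "{{\"status\":\"{}\",\"untrusted\":{},\"count\":{}}}",
            self.base.status, self.base.untrusted, self.count
        )
    }
}
```

The point of the sketch is the shape of the output: there is no `"base"` key, only the two injected fields followed by `count`.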

RPC payment

monerod also contains RPC base types for the RPC payment system. Although the RPC payment system is pseudo deprecated, monerod still generates these fields in responses, and thus, so does Cuprate.


The type generator macro

Request and response types make up the majority of cuprate-rpc-types.

  • Request types are the inputs expected from users
  • Response types are what will be outputted to users

Regardless of being meant for JSON-RPC, binary, or a standalone JSON endpoint, all request/response types are defined using the "type generator macro".

This macro:

  • Defines a matching pair of request & response types
  • Implements many derivable traits (e.g. Clone) on those types
  • Implements both serde and epee (de)serialization on those types
  • Automates documentation, tests, etc.

See here for example usage of this macro.


Metadata

cuprate-rpc-types also provides some traits to access some metadata surrounding RPC data types.

For example, trait RpcCall allows accessing whether an RPC request is restricted or not.

monerod has a boolean permission system. RPC calls can be restricted or not. If an RPC call is restricted, it will only be allowed on un-restricted RPC servers (18081). If an RPC call is not restricted, it will be allowed on all RPC server types (18081 & 18089).

This metadata is used in crates that build upon cuprate-rpc-types, e.g. to know if an RPC call should be allowed through or not.
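
The boolean permission check described above can be sketched with an associated constant. `RpcCallMeta`, `UnrestrictedCall`, `RestrictedCall`, and `is_allowed` are hypothetical names made up for this example; the actual trait lives in cuprate-rpc-types and may look different.

```rust
// Hypothetical metadata trait, modeled loosely on the description above.
trait RpcCallMeta {
    const IS_RESTRICTED: bool;
}

// Two placeholder call types, one of each kind.
struct UnrestrictedCall;
struct RestrictedCall;

impl RpcCallMeta for UnrestrictedCall {
    const IS_RESTRICTED: bool = false;
}

impl RpcCallMeta for RestrictedCall {
    const IS_RESTRICTED: bool = true;
}

// A restricted call is only refused when the server itself is restricted.
fn is_allowed<C: RpcCallMeta>(server_is_restricted: bool) -> bool {
    !(C::IS_RESTRICTED && server_is_restricted)
}
```

Since the flag is a constant on the type, the check needs no instance of the request and can run before any handler is invoked.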


(De)serialization

A crucial responsibility of cuprate-rpc-types is to provide the correct (de)serialization of types.

The input/output of Cuprate's RPC should match monerod (as much as practically possible).

A simple example of this is that /get_height should respond with the exact same data for both monerod and Cuprate:

{
  "hash": "7e23a28cfa6df925d5b63940baf60b83c0cbb65da95f49b19e7cf0ce7dd709ce",
  "height": 2287217,
  "status": "OK",
  "untrusted": false
}

Behavior would be considered incompatible if any of the following were true:

  • Fields are missing
  • Extra fields exist
  • Field types are incorrect (string instead of number, etc)

JSON

(De)serialization for JSON is implemented using serde and serde_json.

cuprate-rpc-interface (the main crate responsible for the actual output) uses serde_json for JSON formatting. It is mostly the same formatting as monerod, although there are slight differences.

Technically, the formatting of the JSON output is not handled by cuprate-rpc-types; users are free to choose whatever formatting they desire.

Epee

(De)serialization for the epee binary format is handled by Cuprate's in-house cuprate-epee-encoding library.

Bitcasted structs

TODO: https://github.com/monero-project/monero/issues/9422

Compressed data

TODO: https://github.com/monero-project/monero/issues/9422


The interface

This section is short as cuprate-rpc-interface contains detailed documentation.

The RPC interface, which includes:

  • Endpoint routing (/json_rpc, /get_blocks.bin, etc)
  • Route function signatures (async fn json_rpc(...) -> Response)
  • Type (de)serialization
  • Any miscellaneous handling (denying restricted RPC calls)

is handled by the cuprate-rpc-interface crate.

Essentially, this crate provides the API for the RPC.

cuprate-rpc-interface is built on top of axum and tower, which are the crates doing the vast majority of the work.

Request -> Response

The functions that map requests to responses are not implemented by cuprate-rpc-interface itself. They must be provided by the user, which means they can be customized.

In Rust terms, this crate provides you with:

async fn json_rpc(
    state: State,
    request: Request,
) -> Response {
    /* your handler here */
}

and you provide the function body.

The main handler crate is cuprate-rpc-handler. This crate implements the standard RPC behavior, i.e. it mostly mirrors monerod.

It is worth noting that other implementations are possible, such as an RPC handler that caches blocks, an RPC handler that only accepts certain endpoints, or any combination.
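
As an illustration of the "handler is swappable" idea, here is a minimal synchronous sketch of a caching handler that wraps another handler. Everything in it is hypothetical: the `Handler` trait, the `Standard` and `Caching` types, and the use of plain strings for requests and responses are simplifications, not the cuprate-rpc-interface API.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified request/response types.
type Request = String;
type Response = String;

// Anything that can turn a request into a response.
trait Handler {
    fn handle(&mut self, request: &Request) -> Response;
}

// Stand-in for a "standard" handler.
struct Standard;

impl Handler for Standard {
    fn handle(&mut self, request: &Request) -> Response {
        format!("response to {request}")
    }
}

// Wraps another handler and remembers its previous answers.
struct Caching<H> {
    inner: H,
    cache: HashMap<Request, Response>,
    hits: usize,
}

impl<H: Handler> Handler for Caching<H> {
    fn handle(&mut self, request: &Request) -> Response {
        if let Some(cached) = self.cache.get(request) {
            self.hits += 1;
            return cached.clone();
        }
        // Cache miss: ask the wrapped handler and store its answer.
        let response = self.inner.handle(request);
        self.cache.insert(request.clone(), response.clone());
        response
    }
}
```

The interface only needs "something that implements the handler", so the caching layer and the standard behavior compose without either knowing about the other.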


The handler

TODO: fill after cuprate-rpc-handler is created.


🔴 The server

TODO: fill after cuprate-rpc-server or binary impl is created.

Differences with monerod

As noted in the introduction, monerod's RPC behavior cannot always be perfectly followed by Cuprate.

The reasons for the differences include:

  • Technical limitations
  • Behavior being monerod-specific
  • Purposeful decision to not support behavior

This section lays out the details of the differences between monerod's and Cuprate's RPC system.

JSON field ordering

When serializing JSON, monerod orders the keys within each scope alphabetically.

For example:

{
  "id": "0",
  "jsonrpc": "2.0",
  "result": {
    "blockhashing_blob": "...",
    "blocktemplate_blob": "...",
    "difficulty": 283305047039,
    "difficulty_top64": 0,
    "expected_reward": 600000000000,
    "height": 3195018,
    "next_seed_hash": "",
    "prev_hash": "9d648e741d85ca0e7acb4501f051b27e9b107d3cd7a3f03aa7f776089117c81a",
    "reserved_offset": 131,
    "seed_hash": "e2aa0b7b55042cd48b02e395d78fa66a29815ccc1584e38db2d1f0e8485cd44f",
    "seed_height": 3194880,
    "status": "OK",
    "untrusted": false,
    "wide_difficulty": "0x41f64bf3ff"
  }
}

In the main {}, id comes before jsonrpc, which comes before result.

The same alphabetical ordering is applied to the fields within result.

Cuprate uses serde for JSON serialization, which serializes fields based on the definition order, i.e. whatever order the fields are defined in the code, is the order they will appear in JSON.

Some struct fields within Cuprate's RPC types happen to be alphabetical, but this is not a guarantee.

As these are JSON maps, the ordering of fields should not matter, although this is something to note as the output will technically differ.
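
The difference between the two orderings can be shown with a small std-only sketch. `definition_order` and `alphabetical_order` are hypothetical helpers that build a compact JSON object from already-encoded values; they only model the ordering, not real serialization.

```rust
use std::collections::BTreeMap;

// Joins keys in the order they are given, like definition-order output.
// Values are expected to be valid JSON text already.
fn definition_order(fields: &[(&str, &str)]) -> String {
    let body: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("\"{key}\":{value}"))
        .collect();
    format!("{{{}}}", body.join(","))
}

// Sorts keys first, like the alphabetical output described for monerod.
fn alphabetical_order(fields: &[(&str, &str)]) -> String {
    let sorted: BTreeMap<&str, &str> = fields.iter().copied().collect();
    let ordered: Vec<(&str, &str)> = sorted.into_iter().collect();
    definition_order(&ordered)
}
```

Both functions produce the same JSON map for the same input; only the textual order of the keys differs, which is exactly the kind of difference described above.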

Example incompatibility

An example of where this leads to incompatibility is if specific line numbers are depended on to contain specific fields.

For example, this will print the 10th line:

curl http://127.0.0.1:18081/json_rpc -d '{"jsonrpc":"2.0","id":"0","method":"get_block_template","params":{"wallet_address":"44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns","reserve_size":60}' -H 'Content-Type: application/json' | sed -n 10p

It will be "height": 3195018 in monerod's case, but may not necessarily be for Cuprate.

By all means, this should not be relied upon in the first place, although it is shown as an example.


JSON formatting

In general, Cuprate's JSON formatting is very similar to monerod, but there are some differences.

This is a list of those differences.

Pretty vs compact

TODO: decide when handlers are created if we should allow custom formatting.

Cuprate's RPC (really, serde_json) can be configured to use either pretty or compact formatting.

monerod uses something similar to pretty formatting.

As an example, pretty formatting:

{
  "number": 1,
  "array": [
    0,
    1
  ],
  "string": "",
  "array_of_objects": [
    {
      "x": 1.0,
      "y": -1.0
    },
    {
      "x": 2.0,
      "y": -2.0
    }
  ]
}

compact formatting:

{"number":1,"array":[0,1],"string":"","array_of_objects":[{"x":1.0,"y":-1.0},{"x":2.0,"y":-2.0}]}

Array of objects

monerod will format an array of objects like this:

{
  "array_of_objects": [{
    "x": 0.0,
    "y": 0.0
  },{
    "x": 0.0,
    "y": 0.0
  },{
    "x": 0.0,
    "y": 0.0
  }]
}

Cuprate will format the above like this:

{
  "array_of_objects": [
    {
      "x": 0.0,
      "y": 0.0
    },
    {
      "x": 0.0,
      "y": 0.0
    },
    {
      "x": 0.0,
      "y": 0.0
    }
  ]
}

Array of maps containing named objects

An example of a method with output like this is sync_info, in its peers field:

curl \
    http://127.0.0.1:18081/json_rpc \
    -d '{"jsonrpc":"2.0","id":"0","method":"sync_info"}' \
    -H 'Content-Type: application/json'

monerod will format an array of maps that contains named objects like this:

{
  "array": [{
    "named_object": {
      "field": ""
    }
  },{
    "named_object": {
      "field": ""
    }
  }]
}

Cuprate will format the above like this:

{
  "array": [
    {
      "named_object": {
        "field": ""
      }
    },
    {
      "named_object": {
        "field": ""
      }
    }
  ]
}

JSON strictness

This is a list of behavior that monerod's JSON parser allows, that Cuprate's JSON parser (serde_json) does not.

In general, monerod's parser is quite lenient, allowing invalid JSON in many cases. Cuprate's (really, serde_json) JSON parser is quite strict, essentially sticking to the JSON specification.

Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.

Missing closing bracket

monerod will accept JSON missing a final closing }.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count"' \
	-H 'Content-Type: application/json'

Trailing ending comma

monerod will accept JSON containing a final trailing ,.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count",}' \
	-H 'Content-Type: application/json'

Allowing - in fields

monerod allows - as a valid value in certain fields, not a string "-", but the character -.

The fields where this is allowed seem to be any fields monerod does not explicitly look for. Examples include:

  • jsonrpc
  • id
  • params (where parameters are not expected)
  • Any ignored field

The JSON-RPC 2.0 specification does state that the response id should be null upon errors in detecting the request id, although in this case, this is invalid JSON and should not make it this far. The response will contain the default id: 0 in this case.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":-,"id":-,"params":-,"IGNORED_FIELD":-,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

JSON-RPC strictness

This is a list of behavior that monerod's JSON-RPC implementation allows, that Cuprate's JSON-RPC implementation does not.

In general, monerod's JSON-RPC is quite lenient, going against the specification in many cases. Cuprate's JSON-RPC implementation is slightly more strict.

Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.

Allowing an incorrect jsonrpc field

The JSON-RPC 2.0 specification states that the jsonrpc field must be exactly "2.0".

monerod allows jsonrpc to:

  • Be any string
  • Be an empty array
  • Be null
  • Not exist at all

Examples:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"???","method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":[],"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":null,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Allowing id to be any type

JSON-RPC 2.0 responses must contain the same id as the original request.

However, the specification states:

An identifier established by the Client that MUST contain a String, Number, or NULL value if included

monerod does not check this and allows id to be any JSON type, for example, a map:

curl \
    http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":{"THIS":{"IS":"ALLOWED"}},"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

The response:

{
  "id": {
    "THIS": {
      "IS": "ALLOWED"
    }
  },
  "jsonrpc": "2.0",
  "result": {
    "count": 3210225,
    "status": "OK",
    "untrusted": false
  }
}
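
For contrast, a stricter id check that follows the quoted part of the specification could look like the sketch below. `Id` and `parse_id` are hypothetical names, and the parsing is deliberately simplified: only unsigned integers and escape-free strings are recognized, which is narrower than what the specification allows.

```rust
// The three kinds of value the specification allows for `id`.
#[derive(Debug, PartialEq)]
enum Id {
    Null,
    Num(u64),
    Str(String),
}

// Classifies the raw JSON text of an `id` value.
// Simplification: only unsigned integers and escape-free strings are handled.
fn parse_id(raw: &str) -> Option<Id> {
    let raw = raw.trim();
    if raw == "null" {
        return Some(Id::Null);
    }
    if !raw.is_empty() && raw.bytes().all(|byte| byte.is_ascii_digit()) {
        return raw.parse::<u64>().ok().map(Id::Num);
    }
    // Anything else must be a quoted string without quotes or escapes inside.
    let inner = raw.strip_prefix('"')?.strip_suffix('"')?;
    if inner.contains('"') || inner.contains('\\') {
        return None;
    }
    Some(Id::Str(inner.to_string()))
}
```

With a check like this, the map used as an id in the example above would be rejected instead of echoed back.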

Responding with id:0 on error

The JSON-RPC specification states:

If there was an error in detecting the id in the Request object (e.g. Parse error/Invalid Request), it MUST be Null.

However, monerod will respond with id:0 in these cases.

curl \
    http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","id":asdf,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Response:

{
  "error": {
    "code": -32700,
    "message": "Parse error"
  },
  "id": 0,
  "jsonrpc": "2.0"
}

Responding to notifications

TODO: decide on Cuprate behavior https://github.com/Cuprate/cuprate/pull/233#discussion_r1704611186

Requests that have no id field are "notifications".

The JSON-RPC 2.0 specification states that requests without an id field must not be responded to.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsonrpc":"2.0","method":"get_block_count"}' \
	-H 'Content-Type: application/json'

Upper/mixed case fields

monerod will accept upper/mixed case fields on:

  • jsonrpc
  • id

method, however, is checked.

The JSON-RPC 2.0 specification does not state outright which case to support. Cuprate only supports lowercase, as serde is case-sensitive on struct fields by default and supporting upper/mixed case would require additional code.

Example:

curl \
	http://127.0.0.1:18081/json_rpc \
	-d '{"jsONrPc":"2.0","iD":0,"method":"get_block_count"}' \
	-H 'Content-Type: application/json'

HTTP methods

monerod endpoints support multiple HTTP methods that do not necessarily make sense.

For example:

curl \
	http://127.0.0.1:18081/get_limit \
	-H 'Content-Type: application/json' \
	--request DELETE

This is sending an HTTP DELETE request, which should be a GET.

monerod will respond to this the same as GET, POST, PUT, and TRACE.

Cuprate's behavior

TODO: decide allowed HTTP methods for Cuprate https://github.com/Cuprate/cuprate/pull/233#discussion_r1700934928.


RPC payment

The RPC payment system in monerod is a pseudo-deprecated system that allows node operators to be compensated for RPC usage.

Although this system is pseudo-deprecated, monerod still generates related fields in responses. Cuprate follows this behavior.

However, the associated endpoints and actual functionality are not supported by Cuprate. The associated endpoints will return an error upon invocation.

TODO: decide on behavior and document https://github.com/Cuprate/cuprate/pull/233#discussion_r1700870051.


Custom strings

Many JSON response fields contain strings with custom messages.

This may be error messages, status, etc.

Although the field + string type will be followed, Cuprate will not always have the exact same message, particularly when it comes to error messages.


Unsupported RPC calls

TODO: compile unsupported RPC calls after handlers are created.


RPC calls with different behavior

TODO: compile RPC calls with different behavior after handlers are created.


⚪️ ZMQ

TODO

⚪️ Consensus

⚪️ Verifier

⚪️ TODO

⚪️ Networking

⚪️ P2P

⚪️ Dandelion++

⚪️ Proxy

⚪️ Tor

⚪️ i2p

⚪️ IPv4/IPv6

Instrumentation

Cuprate is built with instrumentation in mind.


⚪️ Logging

⚪️ Data collection

⚪️ Binary

⚪️ CLI

⚪️ Config

⚪️ Logging

Resources


⚪️ File system

Index of PATHs

This is an index of all of the filesystem PATHs Cuprate actively uses.

The cuprate_helper::fs module defines the general locations used throughout Cuprate.

dirs is used internally, which follows the PATH standards/conventions on each OS Cuprate supports.

Cache

Cuprate's cache directory.

| OS | PATH |
|----|------|
| Windows | C:\Users\Alice\AppData\Local\Cuprate\ |
| macOS | /Users/Alice/Library/Caches/Cuprate/ |
| Linux | /home/alice/.cache/cuprate/ |

Config

Cuprate's config directory.

| OS | PATH |
|----|------|
| Windows | C:\Users\Alice\AppData\Roaming\Cuprate\ |
| macOS | /Users/Alice/Library/Application Support/Cuprate/ |
| Linux | /home/alice/.config/cuprate/ |

Data

Cuprate's data directory.

| OS | PATH |
|----|------|
| Windows | C:\Users\Alice\AppData\Roaming\Cuprate\ |
| macOS | /Users/Alice/Library/Application Support/Cuprate/ |
| Linux | /home/alice/.local/share/cuprate/ |

Blockchain

Cuprate's blockchain directory.

| OS | PATH |
|----|------|
| Windows | C:\Users\Alice\AppData\Roaming\Cuprate\blockchain\ |
| macOS | /Users/Alice/Library/Application Support/Cuprate/blockchain/ |
| Linux | /home/alice/.local/share/cuprate/blockchain/ |

Transaction pool

Cuprate's transaction pool directory.

| OS | PATH |
|----|------|
| Windows | C:\Users\Alice\AppData\Roaming\Cuprate\txpool\ |
| macOS | /Users/Alice/Library/Application Support/Cuprate/txpool/ |
| Linux | /home/alice/.local/share/cuprate/txpool/ |

Database

Cuprate's database location/filenames depend on:

  • Which database it is
  • Which backend is being used

cuprate_blockchain files are in the above mentioned blockchain folder.

cuprate_txpool files are in the above mentioned txpool folder.

If the heed backend is being used, these files will be created:

| Filename | Purpose |
|----------|---------|
| data.mdb | Main data file |
| lock.mdb | Database lock file |

For example: /home/alice/.local/share/cuprate/blockchain/lock.mdb.

If the redb backend is being used, these files will be created:

| Filename | Purpose |
|----------|---------|
| data.redb | Main data file |

For example: /home/alice/.local/share/cuprate/txpool/data.redb.
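
The way the database and the backend combine into a file location can be sketched as follows. The `Database` and `Backend` enums and the `data_file` function are hypothetical names for this example; the folder and file names are the ones listed in this section.

```rust
use std::path::{Path, PathBuf};

#[derive(Clone, Copy)]
enum Database {
    Blockchain,
    Txpool,
}

#[derive(Clone, Copy)]
enum Backend {
    Heed,
    Redb,
}

// Returns the main data file of a database inside the given data directory.
fn data_file(data_dir: &Path, database: Database, backend: Backend) -> PathBuf {
    let folder = match database {
        Database::Blockchain => "blockchain",
        Database::Txpool => "txpool",
    };
    let file = match backend {
        Backend::Heed => "data.mdb",
        Backend::Redb => "data.redb",
    };
    data_dir.join(folder).join(file)
}
```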


Sockets


Index of ports

This is an index of all of the network sockets Cuprate actively uses.


⚪️ Memory

Concurrency and parallelism

It is incumbent upon software like Cuprate to take advantage of today's highly parallel hardware as much as practically possible.

With that said, programs must set up guardrails when operating in a concurrent and parallel manner, for correctness and safety.

There are "synchronization primitives" that help with this, common ones being:

These tools are relatively easy to use in isolation, but trickier to use well when considering the entire system. It is not uncommon for the bottleneck to be the poor orchestration of these primitives.

Analogy

A common analogy for a parallel system is an intersection.

Like a parallel computer system, an intersection contains:

  1. Parallelism: multiple individual units that want to move around (cars, pedestrians, etc)
  2. Synchronization primitives: traffic lights, car lights, walk signals

In theory, the amount of "work" the units can do is limited only by the speed of the units themselves. In practice, the slow cascading reaction speeds between units, the frequent hiccups that can occur, and the synchronization primitives themselves become bottlenecks long before the maximum speed of any unit is reached.

A car that hogs the middle of the intersection on the wrong light is akin to a system thread holding onto a lock longer than it should - it degrades total system output.

Unlike humans, however, computer systems at least have the potential to move at lightning speeds, but only if the above synchronization primitives are used correctly.
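The "car hogging the intersection" case can be sketched with a `Mutex`. Both functions below compute the same result, but the first keeps the lock for the whole computation while the second copies the data out and releases the lock before doing the work. The function names and the `expensive` stand-in are hypothetical and exist only for this illustration.

```rust
use std::sync::Mutex;

/// Stand-in for slow work (hypothetical): sum of squares.
fn expensive(values: &[u64]) -> u64 {
    values.iter().map(|v| v * v).sum()
}

/// Holds the lock for the entire computation.
/// Every other thread that needs `shared` waits until this returns.
pub fn sum_squares_hogging(shared: &Mutex<Vec<u64>>) -> u64 {
    let guard = shared.lock().unwrap();
    expensive(&guard)
}

/// Holds the lock only long enough to copy the data out,
/// then computes without blocking anyone.
pub fn sum_squares_brief(shared: &Mutex<Vec<u64>>) -> u64 {
    let snapshot = {
        let guard = shared.lock().unwrap();
        guard.clone()
        // `guard` is dropped here, releasing the lock.
    };
    expensive(&snapshot)
}
```

The second form trades an extra copy for a shorter critical section, so it only pays off when the work done outside the lock dominates the cost of the copy.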

Goal

To aid the long-term maintenance of highly concurrent and parallel code, this section documents:

  1. All system threads spawned and maintained
  2. All major sections where synchronization primitives are used
  3. The asynchronous behavior of some components

and how these compose together efficiently in Cuprate.


⚪️ Map

⚪️ The RPC server

⚪️ The database

⚪️ The block downloader

⚪️ The verifier

⚪️ Thread exit

Index of threads

This is an index of all of the system threads Cuprate actively uses.


⚪️ External Monero libraries

⚪️ Cryptonight

RandomX

https://github.com/tari-project/randomx-rs


monero_serai

https://github.com/serai-dex/serai/tree/develop/coins/monero


⚪️ Benchmarking

⚪️ Criterion

⚪️ Harness

⚪️ Testing

⚪️ Monero data

⚪️ RPC client

⚪️ Spawning monerod

⚪️ Known issues and tradeoffs

⚪️ Networking

⚪️ RPC

⚪️ Storage

Appendix


Crates

This is an index of all the in-house crates that Cuprate uses and maintains.

They are categorized into groups.

Documentation for each crate can be found by clicking the crate name or by visiting https://doc.cuprate.org. It can also be built manually by running this at the root of the cuprate repository:

cargo doc --package $CRATE

For example, this will generate and open cuprate-blockchain documentation:

cargo doc --open --package cuprate-blockchain

Consensus

Networking

Crate | In-tree path | Purpose
cuprate-epee-encoding | net/epee-encoding/ | Epee (de)serialization
cuprate-fixed-bytes | net/fixed-bytes/ | Fixed byte containers backed by bytes::Bytes
cuprate-levin | net/levin/ | Levin bucket protocol implementation
cuprate-wire | net/wire/ | TODO

P2P

Crate | In-tree path | Purpose
cuprate-address-book | p2p/address-book/ | TODO
cuprate-async-buffer | p2p/async-buffer/ | A bounded SPSC, FIFO, asynchronous buffer that supports arbitrary weights for values
cuprate-dandelion-tower | p2p/dandelion-tower/ | TODO
cuprate-p2p | p2p/p2p/ | TODO
cuprate-p2p-core | p2p/p2p-core/ | TODO

Storage

Crate | In-tree path | Purpose
cuprate-blockchain | storage/blockchain/ | Blockchain database built on top of cuprate-database & cuprate-database-service
cuprate-database | storage/database/ | Pure database abstraction
cuprate-database-service | storage/database-service/ | tower::Service + thread-pool abstraction built on top of cuprate-database
cuprate-txpool | storage/txpool/ | Transaction pool database built on top of cuprate-database & cuprate-database-service

RPC

Crate | In-tree path | Purpose
cuprate-json-rpc | rpc/json-rpc/ | JSON-RPC 2.0 implementation
cuprate-rpc-types | rpc/types/ | Monero RPC types and traits
cuprate-rpc-interface | rpc/interface/ | RPC interface & routing
cuprate-rpc-handler | rpc/handler/ | RPC inner handlers

1-off crates

Crate | In-tree path | Purpose
cuprate-constants | constants/ | Shared const/static data across Cuprate
cuprate-cryptonight | cryptonight/ | CryptoNight hash functions
cuprate-pruning | pruning/ | Monero pruning logic/types
cuprate-helper | helper/ | Kitchen-sink helper crate for Cuprate
cuprate-test-utils | test-utils/ | Testing utilities for Cuprate
cuprate-types | types/ | Shared types across Cuprate

Contributing

https://github.com/Cuprate/cuprate/blob/main/CONTRIBUTING.md


Build targets

  • x86
  • ARM64
  • Windows
  • Linux
  • macOS
  • FreeBSD(?)

Protocol book

https://monero-book.cuprate.org


⚪️ User book