Cuprate Architecture
WIP
Cuprate's architecture book.
Sections are notated with colors indicating how complete they are:
Color | Meaning |
---|---|
⚪️ | Empty |
🔴 | Severely lacking information |
🟠 | Lacking some information |
🟡 | Almost ready |
🟢 | OK |
Continue to the next chapter by clicking the right > button, or by selecting it on the left side.
All chapters are viewable by clicking the top-left ☰ button.
The entire book can be searched by clicking the top-left 🔍 button.
Foreword
Monero[1] is a large software project, coming in at 329k lines of C++, C, headers, and make files.[2] It is directly responsible for 2.6 billion dollars worth of value.[3] It has had over 400 contributors, more if counting unnamed contributions.[4] It has over 10,000 node operators and a large active userbase.[5]
The project wasn't always this big, but somewhere in the midst of contributors coming and going, various features being added, bugs being fixed, and celebrated cryptography being implemented - there was an aspect that was lost by the project that it could not easily gain again: maintainability.
Within large and complicated software projects, there is an important transfer of knowledge that must occur for long-term survival. Much like an organism that must eventually pass the torch onto the next generation, projects must do the same for future contributors.
However, newcomers often lack experience, past contributors might not be around, and current maintainers may be too busy. For whatever reason, this transfer of knowledge is not always smooth.
There is a solution to this problem: documentation.
The activity of writing the what, where, why, and how of the solutions to technical problems can be done in an author's lonesome.
The activity of reading these ideas can be done by future readers at any time without permission.
These readers may be new prospective contributors, it may be the current maintainers, it may be researchers, it may be users of various scale. Whoever it may be, documentation acts as the link between the past and present; a bottle of wisdom thrown into the river of time for future participants to open.
This book is the manifestation of this will, for Cuprate[6], an alternative Monero node. It documents Cuprate's implementation from head-to-toe such that in the case of a contributor's untimely disappearance, the project can continue.
People come and go, documentation is forever.
- hinto-janai
[2] git ls-files | grep "\.cpp$\|\.h$\|\.c$\|CMake" | xargs cat | wc -l on cc73fe7
[3] 2024-05-24: $143.55 USD * 18,151,608 XMR = $2,605,663,258
[4] git log --all --pretty="%an" | sort -u | wc -l on cc73fe7
Intro
Cuprate is an alternative Monero node implementation.
This book describes Cuprate's architecture, ranging from small things like database pruning to larger meta-components like the networking stack.
A brief overview of some aspects covered within this book:
- Component designs
- Implementation details
- File location and purpose
- Design decisions and tradeoffs
- Things in relation to monerod
- Dependency usage
Source code
The source files for this book can be found at: https://github.com/Cuprate/architecture-book.
Who this book is for
Maintainers
As mentioned in the Foreword, the group of people that benefit from this book's value the most by far are the current and future Cuprate maintainers.
Cuprate's system design is documented in this book such that if you were ever to build it again from scratch, you would have an excellent guide on how to do so, and also where improvements could be made.
Practically, what that means for maintainers is that it acts as the reference. During maintenance, it is quite valuable to have a book that contains condensed knowledge on the behavior of components, or how certain code works, or why it was built a certain way.
Contributors
Contributors also have access to the inner-workings of Cuprate via this book, which helps when making larger contributions.
Design decisions and implementation details notated in this book help answer questions such as:
- Why is it done this way?
- Why can it not be done this way?
- Were other methods attempted?
Cuprate's testing and benchmarking suites, unknown to new contributors, are also documented within this book.
Researchers
This book contains the why, where, and how of the implementation of formal research.
Although it is an informal specification, this book still acts as a more accessible overview of Cuprate compared to examining the codebase itself.
Operators & users
This book is not a practical guide for using Cuprate itself.
For configuration, data collection (also important for researchers), and other practical usage, see Cuprate's user book.
Observers
Anyone curious enough is free to learn the inner-workings of Cuprate via this book, and maybe even contribute someday.
Required knowledge
General
- Rust
- Monero
- System design
Components
Storage
- Embedded databases
- LMDB
- redb
RPC
- axum
- tower
- async
- JSON-RPC 2.0
- Epee
Networking
- tower
- tokio
- async
- Levin
Instrumentation
- tracing
How to use this book
Maintainers
Contributors
Researchers
⚪️ Bird's eye view
⚪️ Map
⚪️ Components
⚪️ Formats, protocols, types
⚪️ monero_serai
⚪️ cuprate_types
⚪️ cuprate_helper
⚪️ Epee
⚪️ Levin
Storage
This section covers all things related to the on-disk storage of data within Cuprate.
Overview
The quick overview is that Cuprate has a database abstraction crate that handles "low-level" database details such as key and value (de)serialization, tables, transactions, etc.
This database abstraction crate is then used by all crates that need on-disk storage, i.e. the blockchain database and the transaction pool database.
Service
The interface provided by all crates building on-top of the database abstraction is a tower::Service, i.e. database requests/responses are sent/received asynchronously.
As the interface details are similar across crates (threadpool, read operations, write operations), the interface itself is abstracted in the cuprate_database_service crate, which is then used by the crates.
Diagram
This is roughly how database crates are set up.
                                                               ┌─────────────────┐
┌──────────────────────────────────┐                           │                 │
│ Some crate that needs a database │    ┌────────────────┐     │                 │
│                                  │    │     Public     │     │                 │
│ ┌──────────────────────────────┐ │ ─► │ tower::Service │ ◄─► │ Rest of Cuprate │
│ │     Database abstraction     │ │    │      API       │     │                 │
│ └──────────────────────────────┘ │    └────────────────┘     │                 │
└──────────────────────────────────┘                           │                 │
                                                               └─────────────────┘
Database abstraction
cuprate_database is Cuprate's database abstraction.
This crate abstracts various database backends with traits.
All backends have the following attributes:
- Embedded
- Multiversion concurrency control
- ACID
- Are (key, value) oriented and have the expected API (get(), insert(), delete())
- Are table oriented ("table_name" -> (key, value))
- Allow concurrent readers
The currently implemented backends are:
- heed (LMDB)
- redb
Said precisely, cuprate_database is the embedded database other Cuprate crates interact with instead of using any particular backend implementation.
This allows the backend to be swapped and/or future backends to be implemented.
This section will go over cuprate_database details.
Abstraction
This next section details how cuprate_database abstracts multiple database backends into 1 API.
Diagram
A simple diagram describing the responsibilities/relationship of cuprate_database.
┌────────────────────────────────────────────────────────────────────┐
│                          cuprate_database                          │
│                                                                    │
│ ┌────────────────────────┐     ┌─────────────────────────────────┐ │
│ │    Database traits     │     │            Backends             │ │
│ │ ┌─────┐┌──────┐┌─────┐ │     │ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Env ││ TxRw ││ ... │ ├─────┤ │ heed (LMDB) │ │    redb     │ │ │
│ │ └─────┘└──────┘└─────┘ │     │ └─────────────┘ └─────────────┘ │ │
│ └──────────┬───────────┬─┘     └──┬──────────────────────────────┘ │
│            │           └─────┬────┘                                │
│            │       ┌─────────┴──────────────┐                      │
│            │       │     Database types     │                      │
│            │       │ ┌─────────────┐┌─────┐ │                      │
│            │       │ │ ConcreteEnv ││ ... │ │                      │
│            │       │ └─────────────┘└─────┘ │                      │
│            │       └─────────┬──────────────┘                      │
│            │                 │                                     │
└────────────┼─────────────────┼─────────────────────────────────────┘
             │                 │
             └─────────────────┤
                               │
                               ▼
                   ┌───────────────────────┐
                   │ cuprate_database user │
                   └───────────────────────┘
Backend
First, we need an actual database implementation.
cuprate-database's traits allow abstracting over the actual database, such that any backend in particular could be used.
This page is an enumeration of all the backends Cuprate has, has tried, and may try in the future.
heed
The default database used is heed (LMDB). The upstream versions from crates.io are used. LMDB should not need to be installed as heed has a build script that pulls it in automatically.
heed's filenames inside Cuprate's data folder are:
Filename | Purpose |
---|---|
data.mdb | Main data file |
lock.mdb | Database lock file |
heed-specific notes:
- There is a maximum reader limit. Other potential processes (e.g. xmrblocks) that are also reading the data.mdb file need to be accounted for
- LMDB does not work on remote filesystems
redb
The 2nd database backend is the 100% Rust redb.
The upstream versions from crates.io are used.
redb's filenames inside Cuprate's data folder are:
Filename | Purpose |
---|---|
data.redb | Main data file |
redb-memory
This backend is 100% the same as redb, although it uses redb::backend::InMemoryBackend, which is a database that resides completely in memory instead of in a file.
All other details about this should be the same as the normal redb backend.
sanakirja
sanakirja was a candidate as a backend, however there were problems with maximum value sizes.
The default maximum value size is 1012 bytes, which was too small for our requirements. Using sanakirja::Slice and sanakirja::UnsizedStorage was attempted, but there were bugs found when inserting a value in-between 512..=4096 bytes.
As such, it is not implemented.
MDBX
MDBX was a candidate as a backend, however MDBX deprecated the custom key/value comparison functions, which makes it a bit trickier to implement multimap tables. It is also quite similar to the main backend, LMDB (of which it was originally a fork).
As such, it is not implemented (yet).
ConcreteEnv
After a backend is selected, the main database environment struct is "abstracted" by putting it in the non-generic, concrete struct ConcreteEnv.
This is the main object used when handling the database directly.
This struct contains all the data necessary to operate the database.
The actual database backend ConcreteEnv will use internally depends on which backend feature is used.
ConcreteEnv itself is not too important, what is important is that:
- It allows callers to not directly reference any particular backend environment
- It implements trait Env which opens the door to all the other database traits
The equivalent "database environment" objects in the backends themselves are:
- heed::Env
- redb::Database
Trait
cuprate_database provides a set of traits that abstract over the various database backends.
This allows the function signatures and behavior to stay the same but allows for swapping out databases in an easier fashion.
All common behavior of the backends is encapsulated here and used instead of using the backend directly.
Examples: Env, EnvInner, DatabaseRo, DatabaseRw, TxRo, TxRw.
For example, instead of calling heed or redb's get() function directly, DatabaseRo::get() is called.
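To make the idea concrete, here is a minimal sketch of one trait fronting two interchangeable "backends". Everything in it is hypothetical: KeyValueTable, HashBackend and TreeBackend are stand-ins built on std collections, not cuprate_database's real traits or backends.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the database traits: the only API callers see.
pub trait KeyValueTable {
    fn get(&self, key: u64) -> Option<u64>;
    fn put(&mut self, key: u64, value: u64);
    fn delete(&mut self, key: u64);
}

/// Stand-in "backend" 1: a hash map.
#[derive(Default)]
pub struct HashBackend(HashMap<u64, u64>);

/// Stand-in "backend" 2: an ordered map.
#[derive(Default)]
pub struct TreeBackend(BTreeMap<u64, u64>);

impl KeyValueTable for HashBackend {
    fn get(&self, key: u64) -> Option<u64> {
        self.0.get(&key).copied()
    }
    fn put(&mut self, key: u64, value: u64) {
        self.0.insert(key, value);
    }
    fn delete(&mut self, key: u64) {
        self.0.remove(&key);
    }
}

impl KeyValueTable for TreeBackend {
    fn get(&self, key: u64) -> Option<u64> {
        self.0.get(&key).copied()
    }
    fn put(&mut self, key: u64, value: u64) {
        self.0.insert(key, value);
    }
    fn delete(&mut self, key: u64) {
        self.0.remove(&key);
    }
}

/// Caller code is written once against the trait, never against a backend.
pub fn store_and_fetch<T: KeyValueTable>(table: &mut T) -> Option<u64> {
    table.put(0, 1);
    table.get(0)
}
```

The caller function compiles unchanged for either backend, which is the property the traits are there to provide.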
Usage
With a ConcreteEnv and a particular backend selected, we can now start using it alongside these traits to start doing database operations in a generic manner.
An example:
use cuprate_database::{
    ConcreteEnv,
    config::ConfigBuilder,
    Env, EnvInner,
    DatabaseRo, DatabaseRw, TxRo, TxRw,
};
// Initialize the database environment.
// (`config` and the `Table` type are assumed to be defined elsewhere.)
let env = ConcreteEnv::open(config)?;
// Open up a transaction + tables for writing.
let env_inner = env.env_inner();
let tx_rw = env_inner.tx_rw()?;
env_inner.create_db::<Table>(&tx_rw)?;
// Write data to the table.
{
    let mut table = env_inner.open_db_rw::<Table>(&tx_rw)?;
    table.put(&0, &1)?;
}
// Commit the transaction.
TxRw::commit(tx_rw)?;
As seen above, there is no direct call to heed or redb.
Their functionality is abstracted behind ConcreteEnv and the traits.
Syncing
cuprate_database's database has 5 disk syncing modes:
- FastThenSafe
- Safe
- Async
- Threshold
- Fast
The default mode is Safe.
This means that upon each transaction commit, all the data that was written will be fully synced to disk. This is the slowest, but safest mode of operation.
Note that upon any database Drop, the current implementation will sync to disk regardless of any configuration.
For more information on the other modes, read the cuprate_database documentation.
Resizing
cuprate_database itself does not handle memory map resizes automatically (for database backends that need resizing, i.e. heed/LMDB).
When a user is directly using cuprate_database, it is up to them how to resize. The database will return RuntimeError::ResizeNeeded when it needs resizing.
However, cuprate_database exposes some resizing algorithms that define how the database's memory map grows.
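As a rough illustration of what such an algorithm can look like, the sketch below grows the map by a fixed step and rounds the result up to a page boundary. The function name, the 1 GiB step and the 4096-byte page size are assumptions for illustration, not cuprate_database's actual values or API.

```rust
/// Assumed page size; a real implementation would query the OS.
pub const PAGE_SIZE: usize = 4096;

/// Assumed fixed growth step: 1 GiB.
pub const ADD_SIZE: usize = 1 << 30;

/// Returns the new memory map size for a database that is currently
/// `current_size_bytes` large: grow by a fixed step, then round up so the
/// result is a multiple of the page size.
pub fn grow_map_size(current_size_bytes: usize) -> usize {
    let new_size = current_size_bytes + ADD_SIZE;
    let remainder = new_size % PAGE_SIZE;
    if remainder == 0 {
        new_size
    } else {
        new_size + (PAGE_SIZE - remainder)
    }
}
```

With these assumed constants, a map size that is already page-aligned grows by exactly 1 GiB, while an unaligned one is additionally rounded up to the next page.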
(De)serialization
All types stored inside the database are either bytes already or are perfectly bitcast-able.
As such, they do not incur heavy (de)serialization costs when storing/fetching them from the database. The main (de)serialization used is bytemuck's traits and casting functions.
Size and layout
The size & layout of types is stable across compiler versions, as they are set and determined with #[repr(C)] and bytemuck's derive macros such as bytemuck::Pod.
Note that the data stored in the tables are still type-safe; we still refer to the key and values within our tables by the type.
How
The main (de)serialization trait for database storage is Storable.
- Before storage, the type is simply cast into bytes
- When fetching, the bytes are simply cast into the type
When a type is cast into bytes, the reference is cast, i.e. this is zero-cost serialization.
However, it is worth noting that when bytes are cast into the type, they are copied. This is due to byte alignment guarantee issues with both backends.
Without this copy, bytemuck will panic with TargetAlignmentGreaterAndInputNotAligned when casting.
Copying the bytes fixes this problem, although it is more costly than necessary. However, in the main use-case for cuprate_database (tower::Service API) the bytes would need to be owned regardless as the Request/Response API uses owned data types (T, Vec<T>, HashMap<K, V>, etc).
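The asymmetry between the two directions can be sketched with std alone. This is not the Storable trait or bytemuck; the function names are hypothetical and u64 stands in for any bitcast-able type.

```rust
/// Store path: view the value's bytes through the reference, no copy is made.
pub fn as_bytes(value: &u64) -> &[u8] {
    // SAFETY: `u64` has no padding, all of its bytes are initialized, and
    // `u8` has an alignment of 1, so viewing its 8 bytes as a slice is sound.
    unsafe {
        std::slice::from_raw_parts(
            (value as *const u64).cast::<u8>(),
            std::mem::size_of::<u64>(),
        )
    }
}

/// Fetch path: copy the bytes into an owned, properly aligned value.
/// Returns `None` if the slice is not exactly 8 bytes long.
pub fn from_bytes(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != std::mem::size_of::<u64>() {
        return None;
    }
    let mut array = [0_u8; 8];
    array.copy_from_slice(bytes);
    Some(u64::from_ne_bytes(array))
}

/// Round-trips a value through a buffer at an odd offset, imitating a
/// backend that hands back bytes without any alignment guarantee.
pub fn roundtrip_unaligned(value: u64) -> Option<u64> {
    let mut buffer = [0_u8; 9];
    buffer[1..].copy_from_slice(as_bytes(&value));
    from_bytes(&buffer[1..])
}
```

Reading by copy works no matter where the bytes sit in memory, which is why the fetch side returns an owned value.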
Practically speaking, this means lower-level database functions that normally look like this:
fn get(key: &Key) -> &Value;
end up looking like this in cuprate_database:
fn get(key: &Key) -> Value;
Since each backend has its own (de)serialization methods, our types are wrapped in compatibility types that map our Storable functions into whatever is required for the backend.
Compatibility structs also exist for any Storable containers, e.g. StorableVec.
Again, it's unfortunate that these must be owned, although in the tower::Service use-case, they would have to be owned anyway.
Known issues and tradeoffs
cuprate_database makes many tradeoffs, whether due to:
- Prioritizing certain values over others
- Not having a better solution
- Being "good enough"
This section is a list of the larger ones, along with issues that don't have answers yet.
Traits abstracting backends
Although all database backends used are very similar, they have some crucial differences in small implementation details that must be worked around when conforming them to cuprate_database's traits.
Put simply: using cuprate_database's traits is less efficient and more awkward than using the backend directly.
For example:
- Data types must be wrapped in compatibility layers when they otherwise wouldn't be
- There are types that only apply to a specific backend, but are visible to all
- There are extra layers of abstraction to smoothen the differences between all backends
- Existing functionality of backends must be taken away, as it isn't supported in the others
This is a tradeoff that cuprate_database takes, as:
- The backend itself is usually not the source of bottlenecks in the greater system, as such, small inefficiencies are OK
- None of the lost functionality is crucial for operation
- The ability to use, test, and swap between multiple database backends is worth it
Hot-swappable backends
Using a different backend is really as simple as re-building cuprate_database with a different feature flag:
# Use LMDB.
cargo build --package cuprate-database --features heed
# Use redb.
cargo build --package cuprate-database --features redb
This is "good enough" for now, however ideally, this hot-swapping of backends would be possible at runtime.
As it is now, cuprate_database cannot compile both backends and swap based on user input at runtime; it must be compiled with a certain backend, which will produce a binary with only that backend.
This also means things like CI testing multiple backends is awkward, as we must re-compile with different feature flags instead.
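Compile-time selection of this kind is usually done with cfg attributes. The sketch below assumes a single redb feature that overrides a default; the module layout and the BACKEND constant are illustrative only and not how cuprate_database is actually organized.

```rust
/// Compiled when the (assumed) `redb` feature is off: the default backend.
#[cfg(not(feature = "redb"))]
mod backend {
    pub struct ConcreteEnv;
    impl ConcreteEnv {
        pub const BACKEND: &'static str = "heed";
    }
}

/// Compiled only when the (assumed) `redb` feature is on.
#[cfg(feature = "redb")]
mod backend {
    pub struct ConcreteEnv;
    impl ConcreteEnv {
        pub const BACKEND: &'static str = "redb";
    }
}

/// The rest of the code only ever names this one concrete type.
pub type ConcreteEnv = backend::ConcreteEnv;

/// Reports which backend was compiled into this binary.
pub fn backend_name() -> &'static str {
    ConcreteEnv::BACKEND
}
```

Only one of the two modules exists in any given binary, which is why swapping requires a rebuild rather than a runtime switch.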
Copying unaligned bytes
As mentioned in (De)serialization, bytes are copied when they are turned into a type T due to unaligned bytes being returned from database backends.
Using a regular reference cast results in an improperly aligned type T; such a type even existing causes undefined behavior. In our case, bytemuck saves us by panicking before this occurs.
Thus, when using cuprate_database's database traits, an owned T is returned.
This is doubly unfortunate for &[u8] as this does not even need deserialization.
For example, StorableVec could have been this:
enum StorableBytes<'a, T: Storable> {
    Owned(T),
    Ref(&'a T),
}
but this would require supporting types that must be copied regardless with the occasional &[u8] that can be returned without casting. This was hard to do in a generic way, thus all [u8]'s are copied and returned as owned StorableVecs.
This is a tradeoff cuprate_database takes as:
- bytemuck::pod_read_unaligned is cheap enough
- The main API, service, needs to return owned values anyway
- Having no references removes a lot of lifetime complexity
The alternative is somehow fixing the alignment issues in the backends mentioned previously.
Endianness
cuprate_database's (de)serialization and storage of bytes are native-endian, as in, byte storage order will depend on the machine it is running on.
As Cuprate's build-targets are all little-endian (big-endian by default machines barely exist), this doesn't matter much and the byte ordering can be seen as a constant.
Practically, this means cuprated's database files can be transferred across computers, as can monerod's.
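A quick way to see why native-endian storage behaves like a constant on the supported targets is to compare the two encodings directly. The helper below is a hypothetical illustration using only std.

```rust
/// Returns true when storing `value` in native byte order produces the same
/// bytes as storing it in little-endian order.
pub fn native_matches_little_endian(value: u64) -> bool {
    value.to_ne_bytes() == value.to_le_bytes()
}
```

On a little-endian build target this holds for every value, so files written on one such machine are read back identically on another.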
Multimap
cuprate_database does not currently have an abstraction for multimap tables.
All tables are single maps of keys to values.
This matters as this means some of cuprate_blockchain's tables differ from monerod's tables - the primary key is stored for all entries, compared to monerod only needing to store it once:
// `monerod` only stores `amount: 1` once,
// `cuprated` stores it each time it appears.
struct PreRctOutputId { amount: 1, amount_index: 0 }
struct PreRctOutputId { amount: 1, amount_index: 1 }
This means cuprated's database will be slightly larger than monerod's.
The current method cuprate_blockchain uses will be "good enough" as the multimap keys needed for now are fixed, e.g. pre-RCT outputs are no longer being produced.
This may need to change in the future when multimap is all but required, e.g. for FCMP++.
Until then, multimap tables are not implemented as they are tricky to implement across all backends.
Common behavior
The crates that build on-top of the database abstraction (cuprate_database) share some common behavior including but not limited to:
- Defining their specific database tables and types
- Having an ops module
- Exposing a tower::Service API (backed by a threadpool) for public usage
This section provides more details on these behaviors.
Types
POD types
Since all types in the database are POD types, we must often provide mappings between outside types and the types actually stored in the database.
A common case is mapping infallible types to and from bitflags and/or their raw integer representation.
For example, the OutputFlag type or bool types.
As types like enums, bools and chars cannot be cast from an integer infallibly, bytemuck::Pod cannot be implemented on them safely. Thus, we store some infallible version of them inside the database with a custom type and map them when fetching the data.
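A small sketch of this mapping, using a made-up two-variant enum: LockState and RawLockState are hypothetical names, and the real database types differ.

```rust
/// Hypothetical in-memory type. Not every byte is a valid `LockState`, so it
/// cannot be cast directly from raw database bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LockState {
    Unlocked,
    Locked,
}

/// What is actually stored: a raw integer for which every bit pattern is valid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct RawLockState(pub u8);

impl LockState {
    /// Mapping into the stored form always succeeds.
    pub fn to_raw(self) -> RawLockState {
        RawLockState(match self {
            LockState::Unlocked => 0,
            LockState::Locked => 1,
        })
    }

    /// Mapping back is checked, since the raw integer may hold any value.
    pub fn from_raw(raw: RawLockState) -> Option<Self> {
        match raw.0 {
            0 => Some(LockState::Unlocked),
            1 => Some(LockState::Locked),
            _ => None,
        }
    }
}
```

The stored type stays trivially castable, and the validity check lives in one place at the boundary.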
Lean types
Another reason why database crates define their own types is to cut any unneeded data from the type.
Many of the types used in normal operation (e.g. cuprate_types::VerifiedBlockInformation) contain lots of extra pre-processed data for convenience.
This would be a waste to store in the database, so in this example, the much leaner "raw" BlockInfo type is stored.
ops
Both cuprate_blockchain and cuprate_txpool expose an ops module containing abstracted Monero-related database operations.
For example, cuprate_blockchain::ops::block::add_block.
These functions build on-top of the database traits and allow for more abstracted database operations.
For example, instead of these signatures:
fn get(_: &Key) -> Value;
fn put(_: &Key, _: &Value);
the ops module provides much higher-level signatures like such:
fn add_block(block: &Block) -> Result<_, _>;
Although these functions are exposed, they are not the main API, that would be the next section: the tower::Service (which uses these functions).
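To show how one high-level operation fans out into several low-level writes, here is a heavily simplified sketch. The Block and Tables types, their fields and the error type are all hypothetical; the real add_block touches many more tables.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified block.
pub struct Block {
    pub hash: [u8; 32],
    pub blob: Vec<u8>,
}

/// Stand-in tables; the real ones are typed database tables.
#[derive(Default)]
pub struct Tables {
    pub block_heights: HashMap<[u8; 32], u64>,
    pub block_blobs: HashMap<u64, Vec<u8>>,
}

/// One high-level operation that performs several low-level `put`s and keeps
/// the tables consistent with each other. Returns the new block's height.
pub fn add_block(tables: &mut Tables, block: &Block) -> Result<u64, String> {
    if tables.block_heights.contains_key(&block.hash) {
        return Err("block already exists".to_string());
    }
    let height = tables.block_blobs.len() as u64;
    tables.block_heights.insert(block.hash, height);
    tables.block_blobs.insert(height, block.blob.clone());
    Ok(height)
}
```

Callers only see "add this block"; which tables are written, and in what order, stays inside the function.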
tower::Service
Both cuprate_blockchain and cuprate_txpool provide async tower::Services that define database requests/responses. This is the main API that other Cuprate crates use.
There are 2 tower::Services:
- A read service which is backed by a rayon::ThreadPool
- A write service which spawns a single thread to handle write requests
As this behavior is the same across all users of cuprate_database, it is extracted into its own crate: cuprate_database_service.
Diagram
As a recap, here is how this looks to a user of a higher-level database crate, cuprate_blockchain in this example. Starting from the lowest layer:
- cuprate_database is used to abstract the database
- cuprate_blockchain builds on-top of that with tables, types, operations
- cuprate_blockchain exposes a tower::Service using cuprate_database_service
- The user now interfaces with cuprate_blockchain with that tower::Service in a request/response fashion
                         ┌──────────────────┐
                         │ cuprate_database │
                         └────────┬─────────┘
┌─────────────────────────────────┴─────────────────────────────────┐
│                        cuprate_blockchain                         │
│                                                                   │
│ ┌──────────────────────┐  ┌─────────────────────────────────────┐ │
│ │    Tables, types     │  │                 ops                 │ │
│ │ ┌───────────┐┌─────┐ │  │ ┌─────────────┐ ┌──────────┐┌─────┐ │ │
│ │ │ BlockInfo ││ ... │ ├──┤ │ add_block() │ │ add_tx() ││ ... │ │ │
│ │ └───────────┘└─────┘ │  │ └─────────────┘ └──────────┘└─────┘ │ │
│ └──────────────────────┘  └─────┬───────────────────────────────┘ │
│                                 │                                 │
│                       ┌─────────┴───────────────────────────────┐ │
│                       │             tower::Service              │ │
│                       │ ┌──────────────────────────────┐┌─────┐ │ │
│                       │ │ Blockchain{Read,Write}Handle ││ ... │ │ │
│                       │ └──────────────────────────────┘└─────┘ │ │
│                       └─────────┬───────────────────────────────┘ │
│                                 │                                 │
└─────────────────────────────────┼─────────────────────────────────┘
                                  │
                            ┌─────┴─────┐
       ┌────────────────────┴────┐ ┌────┴──────────────────────────────────┐
       │    Database requests    │ │          Database responses           │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │ FindBlock([u8; 32]) │ │ │ │ FindBlock(Option<(Chain, usize)>) │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │     ChainHeight     │ │ │ │   ChainHeight(usize, [u8; 32])    │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       │ ┌─────────────────────┐ │ │ ┌───────────────────────────────────┐ │
       │ │         ...         │ │ │ │                ...                │ │
       │ └─────────────────────┘ │ │ └───────────────────────────────────┘ │
       └─────────────────────────┘ └───────────────────────────────────────┘
                           ▲                 │
                           │                 ▼
                       ┌─────────────────────────┐
                       │ cuprate_blockchain user │
                       └─────────────────────────┘
Initialization
A database service is started simply by calling init().
This function initializes the database, spawns threads, and returns a:
- Read handle to the database
- Write handle to the database
- The database itself
These handles implement the tower::Service trait, which allows sending requests and receiving responses asynchronously.
Requests
Along with the 2 handles, there are 2 types of requests:
- Read requests, e.g. BlockchainReadRequest
- Write requests, e.g. BlockchainWriteRequest
Quite obviously:
- Read requests are for retrieving various data from the database
- Write requests are for writing data to the database
Responses
After sending a request using the read/write handle, the value returned is not the response, but an asynchronous channel that will eventually return the response:
// Send a request.
//                                          tower::Service::call()
//                                          V
let response_channel: Channel = read_handle.call(BlockchainReadRequest::ChainHeight)?;
// Await the response.
let response: BlockchainResponse = response_channel.await?;
After awaiting the returned channel, a Response will eventually be returned when the Service threadpool has fetched the value from the database and sent it off.
Both read/write request variants match in name with Response variants, i.e.
- BlockchainReadRequest::ChainHeight leads to BlockchainResponse::ChainHeight
- BlockchainWriteRequest::WriteBlock leads to BlockchainResponse::WriteBlockOk
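The name-matching convention can be sketched as a pair of enums and a handler. This is a simplified, synchronous stand-in: the enum names, the payload types and the slice used as a "chain" are assumptions, and the real types carry more variants and data.

```rust
/// Hypothetical, trimmed-down read requests.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadRequest {
    ChainHeight,
    FindBlock([u8; 32]),
}

/// Responses, with variants named after the request they answer.
#[derive(Debug, PartialEq, Eq)]
pub enum Response {
    /// Chain height and the hash of the top block.
    ChainHeight(usize, [u8; 32]),
    /// Height of the requested block, if it is known.
    FindBlock(Option<usize>),
}

/// Maps each request variant to the response variant of the same name.
/// `chain` is a stand-in for the database: block hashes ordered by height.
pub fn handle(chain: &[[u8; 32]], request: ReadRequest) -> Response {
    match request {
        ReadRequest::ChainHeight => {
            let top_hash = chain.last().copied().unwrap_or([0; 32]);
            Response::ChainHeight(chain.len(), top_hash)
        }
        ReadRequest::FindBlock(hash) => {
            Response::FindBlock(chain.iter().position(|h| *h == hash))
        }
    }
}
```

Because the variant names line up, a caller that sent ChainHeight knows which response variant to match on.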
Resizing
As noted in the cuprate_database resizing section, builders on-top of cuprate_database are responsible for resizing the database.
In cuprate_{blockchain,txpool}'s case, that means the tower::Service must know how to resize. This logic is shared between both crates, defined in cuprate_database_service: https://github.com/Cuprate/cuprate/blob/0941f68efcd7dfe66124ad0c1934277f47da9090/storage/service/src/service/write.rs#L107-L171.
By default, this uses a similar algorithm as monerod's:
- If there's not enough space to fit a write request's data, start a resize
- Each resize adds around 1,073,745,920 bytes to the current map size
- A resize will be attempted 3 times before failing
There are other resizing algorithms that define how the database's memory map grows, although currently the behavior of monerod is closely followed (for no particular reason).
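The retry behavior described above can be sketched against a mock database. MockDb, WriteError and write_with_resize are hypothetical names, and the growth step is passed in as a parameter instead of being computed; the linked source is the authority on the real logic.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum WriteError {
    ResizeNeeded,
}

/// Mock database: a write fails while the data does not fit in the map.
pub struct MockDb {
    pub map_size: usize,
    pub used: usize,
    pub resizes: usize,
}

impl MockDb {
    pub fn write(&mut self, len: usize) -> Result<(), WriteError> {
        if self.used + len > self.map_size {
            return Err(WriteError::ResizeNeeded);
        }
        self.used += len;
        Ok(())
    }

    pub fn resize(&mut self, add_bytes: usize) {
        self.map_size += add_bytes;
        self.resizes += 1;
    }
}

/// Maximum number of resizes attempted for a single write.
pub const MAX_RESIZES: usize = 3;

/// Attempts a write, growing the map after each failure, and gives up once
/// `MAX_RESIZES` resizes were not enough.
pub fn write_with_resize(
    db: &mut MockDb,
    len: usize,
    add_bytes: usize,
) -> Result<(), WriteError> {
    for _ in 0..MAX_RESIZES {
        match db.write(len) {
            Ok(()) => return Ok(()),
            Err(WriteError::ResizeNeeded) => db.resize(add_bytes),
        }
    }
    // One final attempt after the last resize.
    db.write(len)
}
```

Bounding the number of resizes keeps a single oversized request from growing the map indefinitely.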
Thread model
The base database abstractions themselves are not concerned with parallelism, they are mostly functions to be called from a single thread.
However, the cuprate_database_service API does have a thread model backing it.
When a Service's init() function is called, threads will be spawned and maintained until the user drops (disconnects) the returned handles.
The current behavior for thread count is:
- 1 writer thread
- As many reader threads as there are system threads
For example, on a system with 32 threads, cuprate_database_service will spawn:
- 1 writer thread
- 32 reader threads
whose sole responsibility is to listen for database requests, access the database (potentially in parallel), and return a response.
Note that the 1 system thread = 1 reader thread model is only the default setting, the reader thread count can be configured by the user to be any number between 1 .. amount_of_system_threads.
The reader threads are managed by rayon.
For an example of where multiple reader threads are used: given a request that asks if any key-image within a set already exists, cuprate_blockchain will split that work between the threads with rayon.
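The key-image example can be approximated with scoped std threads. Note the swap: this sketch uses std::thread::scope and manual chunking in place of rayon, and a HashSet in place of the database table; the function name is hypothetical.

```rust
use std::collections::HashSet;
use std::thread;

pub type KeyImage = [u8; 32];

/// Returns true if any of the `requested` key images is already in `db`.
/// The work is split into chunks that are checked on separate threads.
pub fn any_key_image_exists(
    db: &HashSet<KeyImage>,
    requested: &[KeyImage],
    threads: usize,
) -> bool {
    let threads = threads.max(1);
    // Ceiling division, so every key image lands in exactly one chunk.
    let chunk_size = ((requested.len() + threads - 1) / threads).max(1);

    thread::scope(|scope| {
        let handles: Vec<_> = requested
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().any(|ki| db.contains(ki))))
            .collect();

        // Combine the per-chunk answers; any remaining threads are joined
        // automatically when the scope ends.
        handles
            .into_iter()
            .any(|handle| handle.join().expect("reader thread panicked"))
    })
}
```

Each chunk only needs shared, read-only access to the data, which is what makes this kind of request a good fit for several reader threads.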
Shutdown
Once the read/write handles to the tower::Service are Dropped, the backing thread(pool) will gracefully exit, automatically.
Note the writer thread and reader threadpool aren't connected whatsoever; dropping the write handle will make the writer thread exit, however, the reader handle is free to be held onto and can continue to be read from - and vice-versa for the write handle.
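Drop-driven shutdown falls out naturally from channel semantics. The sketch below uses a std mpsc channel and a single thread as a stand-in for the writer; WriteHandle, WriteRequest and this init() are hypothetical and much simpler than the real service.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical write request: add a number to a running total.
pub enum WriteRequest {
    Add(u64),
}

/// Stand-in for the write handle; owning it keeps the writer thread alive.
pub struct WriteHandle {
    sender: mpsc::Sender<WriteRequest>,
}

impl WriteHandle {
    pub fn send(&self, request: WriteRequest) {
        self.sender
            .send(request)
            .expect("writer thread is alive while a handle exists");
    }
}

/// Spawns the writer thread and returns the handle plus the thread's
/// `JoinHandle`, which yields the final total once the thread has exited.
pub fn init() -> (WriteHandle, thread::JoinHandle<u64>) {
    let (sender, receiver) = mpsc::channel();
    let writer = thread::spawn(move || {
        let mut total = 0;
        // `recv()` returns an error once every sender has been dropped,
        // which ends the loop and lets the thread exit on its own.
        while let Ok(WriteRequest::Add(n)) = receiver.recv() {
            total += n;
        }
        total
    });
    (WriteHandle { sender }, writer)
}
```

No explicit shutdown message is needed: dropping the handle closes the channel, and the thread finishes whatever was already queued before exiting.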
Blockchain
This section contains storage information specific to cuprate_blockchain, the database built on-top of cuprate_database that stores the blockchain.
Schema
This section contains the schema of cuprate_blockchain's database tables.
Tables
See also: https://doc.cuprate.org/cuprate_blockchain/tables & https://doc.cuprate.org/cuprate_blockchain/types.
The CamelCase names of the table headers documented here (e.g. TxIds) are the actual type name of the table within cuprate_blockchain.
Note that words written within code blocks mean that it is a real type defined and usable within cuprate_blockchain. Other standard types like u64 and type aliases (TxId) are written normally.
Within cuprate_blockchain::tables, the below table is essentially defined as-is with a macro.
Many of the data types stored are the same underlying type, although they differ semantically. As such, a map of the aliases used and their real data types is also provided below.
Alias | Real Type |
---|---|
BlockHeight, Amount, AmountIndex, TxId, UnlockTime | u64 |
BlockHash, KeyImage, TxHash, PrunableHash | [u8; 32] |
Table | Key | Value | Description |
---|---|---|---|
BlockHeaderBlobs | BlockHeight | StorableVec<u8> | Maps a block's height to a serialized byte form of its header |
BlockTxsHashes | BlockHeight | StorableVec<[u8; 32]> | Maps a block's height to the block's transaction hashes |
BlockHeights | BlockHash | BlockHeight | Maps a block's hash to its height |
BlockInfos | BlockHeight | BlockInfo | Contains metadata of all blocks |
KeyImages | KeyImage | () | This table is a set with no value, it stores transaction key images |
NumOutputs | Amount | u64 | Maps an output's amount to the number of outputs with that amount |
Outputs | PreRctOutputId | Output | This table contains legacy CryptoNote outputs which have clear amounts. This table will not contain an output with 0 amount. |
PrunedTxBlobs | TxId | StorableVec<u8> | Contains pruned transaction blobs (even if the database is not pruned) |
PrunableTxBlobs | TxId | StorableVec<u8> | Contains the prunable part of a transaction |
PrunableHashes | TxId | PrunableHash | Contains the hash of the prunable part of a transaction |
RctOutputs | AmountIndex | RctOutput | Contains RingCT outputs mapped from their global RCT index |
TxBlobs | TxId | StorableVec<u8> | Serialized transaction blobs (bytes) |
TxIds | TxHash | TxId | Maps a transaction's hash to its index/ID |
TxHeights | TxId | BlockHeight | Maps a transaction's ID to the height of the block it comes from |
TxOutputs | TxId | StorableVec<u64> | Gives the amount indices of a transaction's outputs |
TxUnlockTime | TxId | UnlockTime | Stores the unlock time of a transaction (only if it has a non-zero lock time) |
Multimap tables
Outputs
When referencing outputs, Monero will use the amount and the amount index. This means 2 keys are needed to reach an output.
With LMDB you can set the DUP_SORT
flag on a table and then set the key/value to:
#![allow(unused)] fn main() { Key = KEY_PART_1 }
#![allow(unused)] fn main() { Value = { KEY_PART_2, VALUE // The actual value we are storing. } }
Then you can set a custom value sorting function that only takes KEY_PART_2
into account; this is how monerod
does it.
This requires that the underlying database supports:
- multimap tables
- custom sort functions on values
- setting a cursor on a specific key/value
How cuprate_blockchain
does it
Another way to implement this is as follows:
#![allow(unused)] fn main() { Key = { KEY_PART_1, KEY_PART_2 } }
#![allow(unused)] fn main() { Value = VALUE }
Then the key type is simply used to look up the value; this is how cuprate_blockchain
does it
as cuprate_database
does not have a multimap abstraction (yet).
For example, the key/value pair for outputs is:
PreRctOutputId => Output
where PreRctOutputId looks like this:
struct PreRctOutputId {
    amount: u64,
    amount_index: u64,
}
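The composite-key approach above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the cuprate_blockchain implementation: a BTreeMap stands in for the database table, the Output type is a simplified placeholder, and example_table and get_output are hypothetical helpers.

```rust
use std::collections::BTreeMap;

/// Composite key: both parts are needed to reach one pre-RingCT output.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct PreRctOutputId {
    amount: u64,
    amount_index: u64,
}

/// Hypothetical, simplified stand-in for the real `Output` value type.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Output {
    height: u64,
}

/// Build a small in-memory "table" (assumption: a `BTreeMap` stands in for the database).
fn example_table() -> BTreeMap<PreRctOutputId, Output> {
    let mut table = BTreeMap::new();
    table.insert(PreRctOutputId { amount: 5, amount_index: 0 }, Output { height: 10 });
    table.insert(PreRctOutputId { amount: 5, amount_index: 1 }, Output { height: 12 });
    table.insert(PreRctOutputId { amount: 7, amount_index: 0 }, Output { height: 11 });
    table
}

/// Look up a single output using both key parts at once.
fn get_output(
    table: &BTreeMap<PreRctOutputId, Output>,
    amount: u64,
    amount_index: u64,
) -> Option<&Output> {
    table.get(&PreRctOutputId { amount, amount_index })
}

fn main() {
    let table = example_table();
    println!("{:?}", get_output(&table, 5, 1));
}
```

Because the whole key is a single value, no multimap support or custom value sorting is needed from the backend; a plain key/value lookup is enough.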
⚪️ Transaction pool
⚪️ Pruning
RPC
monerod's daemon RPC has three kinds of RPC calls:
- JSON-RPC 2.0 methods, called at the /json_rpc endpoint
- JSON (but not JSON-RPC 2.0) methods called at their own endpoints, e.g. /get_height
- Binary (epee) RPC methods called at their own endpoints ending in .bin, e.g. /get_blocks.bin
Cuprate's RPC aims to mirror monerod's as much as it practically can.
This includes, but is not limited to:
- Using the same endpoints
- Receiving the same request data
- Sending the same response data
- Responding with the same HTTP status codes
- Following internal behavior (e.g. /pop_blocks)
Not all monerod behavior can be followed, however.
Some behavior is deliberately not followed, some cannot be followed due to technical limitations, and some is monerod-specific, such as the /set_log_categories endpoint, which uses monerod's logging categories.
Both subtle and large differences between Cuprate's RPC and monerod's RPC are documented in the Differences with monerod section.
Main RPC components
The main components that make up Cuprate's RPC are noted below, alongside the equivalent monerod code and other notes.
Cuprate crate | monerod (rough) equivalent | Purpose | Notes |
---|---|---|---|
cuprate-json-rpc | jsonrpc_structs.h , http_server_handlers_map2.h | JSON-RPC 2.0 implementation | monerod 's JSON-RPC 2.0 handling is spread across a few files. The first defines some data structures, the second contains macros that (essentially) implement JSON-RPC 2.0. |
cuprate-rpc-types | core_rpc_server_commands_defs.h | RPC request/response type definitions and (de)serialization | |
cuprate-rpc-interface | core_rpc_server.h | RPC interface, routing, endpoints | |
cuprate-rpc-handler | core_rpc_server.cpp | RPC request/response handling | These are the "inner handler" functions that turn requests into responses |
JSON-RPC 2.0
Cuprate has a standalone crate that implements the JSON-RPC 2.0 specification, cuprate-json-rpc. The RPC methods at the /json_rpc endpoint use this crate's types, functions, and (de)serialization.
There is nothing too special about Cuprate's implementation. Any small notes and differences are noted in the crate documentation.
As such, there is not much to document here. Instead, consider reading the very brief JSON-RPC 2.0 specification and the cuprate-json-rpc crate documentation.
TODO: document method/params vs flattened base when figured out.
The types
Cuprate has a crate that defines all the types related to RPC: cuprate-rpc-types.
The main purpose of this crate is to port the types used in monerod's RPC and to re-implement (de)serialization for those types, whether that be JSON, epee, or a custom mix.
The bulk of these types are request and response types, i.e. the inputs Cuprate's RPC expects from users and the outputs it responds with.
Example
To showcase an example of the kinds of types defined in this crate, here is a request type:
#[serde(transparent)]
#[repr(transparent)]
struct OnGetBlockHashRequest {
    block_height: [u64; 1],
}
This is the input (params) expected in the on_get_block_hash method.
As seen above, the type itself encodes some properties, such as being (de)serialized transparently, and the input being an array of length 1 rather than a single u64. This is to match the behavior of monerod.
An example JSON form of this type would be:
{
"jsonrpc": "2.0",
"id": "0",
"method": "on_get_block_hash",
"params": [912345] // <- This can (de)serialize as a `OnGetBlockHashRequest`
}
Misc types
Other than the main request/response types, this crate is also responsible for any miscellaneous types used within monerod's RPC.
For example, the status field within many RPC responses is defined within cuprate-rpc-types.
Types that aren't requests/responses but exist within request/response types are also defined in this crate, such as the Distribution structure returned from the get_output_distribution method.
Base RPC types
There exist a few "base" types that many types are built on top of in monerod.
These are also implemented in cuprate-rpc-types.
For example, many responses include these 2 fields:
{
  "status": "OK",
  "untrusted": false
}
This is rpc_response_base in monerod, and ResponseBase in Cuprate.
These types are flattened into other types, i.e. the fields from these base types are injected into the given type. For example, get_block_count's response type is defined like this in Cuprate:
struct GetBlockCountResponse {
    // The fields of this `base` type are directly
    // injected into `GetBlockCountResponse` during
    // (de)serialization.
    //
    // I.e. it is as if this `base` field were actually these 2 fields:
    // status: Status,
    // untrusted: bool,
    base: ResponseBase,
    count: u64,
}
The JSON output of this type would look something like:
{
"status": "OK",
"untrusted": false,
"count": 993163
}
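To make the flattening idea concrete without pulling in any serialization crate, here is a hand-written sketch. It is only an illustration: the to_json method is hypothetical, status is simplified to a String, and the real types rely on derived (de)serialization rather than manual formatting.

```rust
/// Simplified stand-in for the shared base type (assumption: `status` as a plain string).
struct ResponseBase {
    status: String,
    untrusted: bool,
}

/// A response type that embeds the base type.
struct GetBlockCountResponse {
    base: ResponseBase,
    count: u64,
}

impl GetBlockCountResponse {
    /// Hand-written stand-in for flattened serialization:
    /// the `base` fields are emitted at the same level as `count`,
    /// rather than nested under a "base" key.
    fn to_json(&self) -> String {
        format!(
            "{{\"status\":\"{}\",\"untrusted\":{},\"count\":{}}}",
            self.base.status, self.base.untrusted, self.count
        )
    }
}

fn main() {
    let response = GetBlockCountResponse {
        base: ResponseBase { status: "OK".to_string(), untrusted: false },
        count: 993163,
    };
    println!("{}", response.to_json());
}
```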
RPC payment
monerod also contains RPC base types for the RPC payment system. Although the RPC payment system is pseudo-deprecated, monerod still generates these fields in responses, and thus, so does Cuprate.
The type generator macro
Request and response types make up the majority of cuprate-rpc-types.
- Request types are the inputs expected from users
- Response types are what will be outputted to users
Regardless of being meant for JSON-RPC, binary, or a standalone JSON endpoint, all request/response types are defined using the "type generator macro".
This macro:
- Defines a matching pair of request & response types
- Implements many derive traits, e.g. Clone, on those types
- Implements both serde and epee on those types
- Automates documentation, tests, etc.
See here for example usage of this macro.
Metadata
cuprate-rpc-types also provides some traits to access metadata surrounding RPC data types.
For example, trait RpcCall allows accessing whether an RPC request is restricted or not.
monerod has a boolean permission system. RPC calls can be restricted or not.
If an RPC call is restricted, it will only be allowed on un-restricted RPC servers (18081).
If an RPC call is not restricted, it will be allowed on all RPC server types (18081 & 18089).
This metadata is used in crates that build upon cuprate-rpc-types, e.g. to know if an RPC call should be allowed through or not.
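The permission rule described above can be sketched as follows. The trait shape, the IS_RESTRICTED constant, the ServerKind enum, and the two request types are all assumptions made for illustration; the actual RpcCall trait in cuprate-rpc-types may look different.

```rust
/// Hypothetical metadata trait; the real trait in `cuprate-rpc-types` may differ.
trait RpcCall {
    /// `true` if this call is only allowed on un-restricted servers.
    const IS_RESTRICTED: bool;
}

/// Illustrative request types (restriction flags are assumed for the example).
struct GetHeightRequest;
struct StopDaemonRequest;

impl RpcCall for GetHeightRequest {
    const IS_RESTRICTED: bool = false;
}
impl RpcCall for StopDaemonRequest {
    const IS_RESTRICTED: bool = true;
}

/// The two kinds of RPC server described in the text.
#[derive(Clone, Copy, PartialEq, Eq)]
enum ServerKind {
    Unrestricted,
    Restricted,
}

/// Decide whether a call of type `T` may pass on the given server kind.
fn is_allowed<T: RpcCall>(server: ServerKind) -> bool {
    match server {
        // Un-restricted servers allow every call.
        ServerKind::Unrestricted => true,
        // Restricted servers only allow calls that are not restricted.
        ServerKind::Restricted => !T::IS_RESTRICTED,
    }
}

fn main() {
    println!("{}", is_allowed::<GetHeightRequest>(ServerKind::Restricted));
    println!("{}", is_allowed::<StopDaemonRequest>(ServerKind::Restricted));
}
```

Keeping the flag as an associated constant means the check needs no request instance and can be resolved at compile time for each call type.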
(De)serialization
A crucial responsibility of cuprate-rpc-types is to provide the correct (de)serialization of types.
The input/output of Cuprate's RPC should match monerod (as much as practically possible).
A simple example of this is that /get_height should respond with the exact same data for both monerod and Cuprate:
{
"hash": "7e23a28cfa6df925d5b63940baf60b83c0cbb65da95f49b19e7cf0ce7dd709ce",
"height": 2287217,
"status": "OK",
"untrusted": false
}
Behavior would be considered incompatible if any of the following were true:
- Fields are missing
- Extra fields exist
- Field types are incorrect (string instead of number, etc)
JSON
(De)serialization for JSON is implemented using serde and serde_json.
cuprate-rpc-interface (the main crate responsible for the actual output) uses serde_json for JSON formatting. It is mostly the same formatting as monerod, although there are slight differences.
Technically, the formatting of the JSON output is not handled by cuprate-rpc-types; users are free to choose whatever formatting they desire.
Epee
(De)serialization for the epee binary format is handled by Cuprate's in-house cuprate-epee-encoding library.
Bitcasted structs
Compressed data
The interface
This section is short as cuprate-rpc-interface contains detailed documentation.
The RPC interface, which includes:
- Endpoint routing (/json_rpc, /get_blocks.bin, etc)
- Route function signatures (async fn json_rpc(...) -> Response)
- Type (de)serialization
- Any miscellaneous handling (denying restricted RPC calls)
is handled by the cuprate-rpc-interface crate.
Essentially, this crate provides the API for the RPC.
cuprate-rpc-interface is built on top of axum and tower, which are the crates doing the bulk of the work.
Request -> Response
The functions that map requests to responses are not implemented by cuprate-rpc-interface itself; they must be provided by the user, i.e. they can be customized.
In Rust terms, this crate provides you with:
async fn json_rpc(
    state: State,
    request: Request,
) -> Response {
    /* your handler here */
}
and you provide the function body.
The main handler crate is cuprate-rpc-handler.
This crate implements the standard RPC behavior, i.e. it mostly mirrors monerod.
It is worth noting that other implementations are possible, such as an RPC handler that caches blocks, an RPC handler that only accepts certain endpoints, or any combination.
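As a rough sketch of how a custom handler might wrap another one, the example below layers a response cache over a "standard" handler. It is deliberately synchronous and uses only the standard library; the Handler trait, the Request and Response types, and the Inner and Cached structs are all hypothetical and do not reflect the actual cuprate-rpc-interface API, which is asynchronous.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified request/response types.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Request {
    method: String,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Response {
    body: String,
}

/// Anything that maps a request to a response.
trait Handler {
    fn handle(&mut self, request: Request) -> Response;
}

/// Stand-in for a "standard" handler; it counts how often it is called.
struct Inner {
    calls: usize,
}

impl Handler for Inner {
    fn handle(&mut self, request: Request) -> Response {
        self.calls += 1;
        Response { body: format!("result of {}", request.method) }
    }
}

/// A handler that caches the responses of another handler.
struct Cached<H> {
    inner: H,
    cache: HashMap<Request, Response>,
}

impl<H: Handler> Handler for Cached<H> {
    fn handle(&mut self, request: Request) -> Response {
        // Serve from the cache when the same request was seen before.
        if let Some(hit) = self.cache.get(&request) {
            return hit.clone();
        }
        let response = self.inner.handle(request.clone());
        self.cache.insert(request, response.clone());
        response
    }
}

fn main() {
    let mut handler = Cached { inner: Inner { calls: 0 }, cache: HashMap::new() };
    let request = Request { method: "get_height".to_string() };
    handler.handle(request.clone());
    handler.handle(request);
    println!("inner handler was called {} time(s)", handler.inner.calls);
}
```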
The handler
TODO: fill after cuprate-rpc-handler is created.
🔴 The server
TODO: fill after cuprate-rpc-server or binary impl is created.
Differences with monerod
As noted in the introduction, monerod's RPC behavior cannot always be perfectly followed by Cuprate.
The reasons for the differences include:
- Technical limitations
- Behavior being monerod-specific
- Purposeful decision to not support behavior
This section lays out the details of the differences between monerod's and Cuprate's RPC system.
JSON field ordering
When serializing JSON, monerod orders the keys within each object alphabetically.
For example:
{
"id": "0",
"jsonrpc": "2.0",
"result": {
"blockhashing_blob": "...",
"blocktemplate_blob": "...",
"difficulty": 283305047039,
"difficulty_top64": 0,
"expected_reward": 600000000000,
"height": 3195018,
"next_seed_hash": "",
"prev_hash": "9d648e741d85ca0e7acb4501f051b27e9b107d3cd7a3f03aa7f776089117c81a",
"reserved_offset": 131,
"seed_hash": "e2aa0b7b55042cd48b02e395d78fa66a29815ccc1584e38db2d1f0e8485cd44f",
"seed_height": 3194880,
"status": "OK",
"untrusted": false,
"wide_difficulty": "0x41f64bf3ff"
}
}
In the main {}, id comes before jsonrpc, which comes before result.
The same alphabetical ordering is applied to the fields within result.
Cuprate uses serde for JSON serialization, which serializes fields in definition order, i.e. whatever order the fields are defined in the code is the order they will appear in JSON.
Some struct fields within Cuprate's RPC types happen to be alphabetical, but this is not a guarantee.
As these are JSON maps, the ordering of fields should not matter, although this is something to note as the output will technically differ.
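The difference between the two orderings can be shown with a small standard-library sketch. Both functions are hypothetical helpers written for this illustration; they are not how serde or monerod serialize, and values are assumed to be pre-rendered JSON fragments.

```rust
/// Serialize `(key, value)` pairs as a compact JSON object, in the order given.
/// Values are assumed to already be valid JSON fragments.
fn to_json_in_order(fields: &[(&str, &str)]) -> String {
    let body: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("\"{key}\":{value}"))
        .collect();
    format!("{{{}}}", body.join(","))
}

/// Same as above, but sort the keys alphabetically first.
fn to_json_sorted(fields: &[(&str, &str)]) -> String {
    let mut sorted = fields.to_vec();
    sorted.sort_by(|a, b| a.0.cmp(b.0));
    to_json_in_order(&sorted)
}

fn main() {
    // Definition order: jsonrpc, id, result.
    let fields = [("jsonrpc", "\"2.0\""), ("id", "\"0\""), ("result", "{}")];
    println!("{}", to_json_in_order(&fields));
    println!("{}", to_json_sorted(&fields));
}
```

Both outputs describe the same JSON object; only the byte-for-byte text differs.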
Example incompatibility
An example of where this leads to incompatibility is if specific line numbers are depended on to contain specific fields.
For example, this will print the 10th line:
curl http://127.0.0.1:18081/json_rpc -d '{"jsonrpc":"2.0","id":"0","method":"get_block_template","params":{"wallet_address":"44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns","reserve_size":60}' -H 'Content-Type: application/json' | sed -n 10p
It will be "height": 3195018 in monerod's case, but may not necessarily be for Cuprate.
This should not be relied upon in the first place; it is shown only as an example.
JSON formatting
In general, Cuprate's JSON formatting is very similar to monerod, but there are some differences.
This is a list of those differences.
Pretty vs compact
TODO: decide when handlers are created if we should allow custom formatting.
Cuprate's RPC (really, serde_json) can be configured to use either:
- Pretty formatting
- Compact formatting
monerod uses something similar to pretty formatting.
As an example, pretty formatting:
{
  "number": 1,
  "array": [
    0,
    1
  ],
  "string": "",
  "array_of_objects": [
    {
      "x": 1.0,
      "y": -1.0
    },
    {
      "x": 2.0,
      "y": -2.0
    }
  ]
}
compact formatting:
{"number":1,"array":[0,1],"string":"","array_of_objects":[{"x":1.0,"y":-1.0},{"x":2.0,"y":-2.0}]}
Array of objects
monerod will format an array of objects like such:
{
  "array_of_objects": [{
    "x": 0.0,
    "y": 0.0
  },{
    "x": 0.0,
    "y": 0.0
  },{
    "x": 0.0,
    "y": 0.0
  }]
}
Cuprate will format the above like such:
{
  "array_of_objects": [
    {
      "x": 0.0,
      "y": 0.0
    },
    {
      "x": 0.0,
      "y": 0.0
    },
    {
      "x": 0.0,
      "y": 0.0
    }
  ]
}
Array of maps containing named objects
An example of output like this is the peers field in the response of the sync_info method:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","id":"0","method":"sync_info"}' \
-H 'Content-Type: application/json'
monerod will format an array of maps that contains named objects like such:
{
  "array": [{
    "named_object": {
      "field": ""
    }
  },{
    "named_object": {
      "field": ""
    }
  }]
}
Cuprate will format the above like such:
{
  "array": [
    {
      "named_object": {
        "field": ""
      }
    },
    {
      "named_object": {
        "field": ""
      }
    }
  ]
}
JSON strictness
This is a list of behavior that monerod's JSON parser allows, that Cuprate's JSON parser (serde_json) does not.
In general, monerod's parser is quite lenient, allowing invalid JSON in many cases.
Cuprate's (really, serde_json) JSON parser is quite strict, essentially sticking to the JSON specification.
Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.
Missing closing bracket
monerod will accept JSON missing a final closing }.
Example:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count"' \
-H 'Content-Type: application/json'
Trailing ending comma
monerod will accept JSON containing a final trailing comma (,).
Example:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","id":"0","method":"get_block_count",}' \
-H 'Content-Type: application/json'
Allowing - in fields
monerod allows - as a valid value in certain fields: not the string "-", but the bare character -.
The fields where this is allowed seem to be any field monerod does not explicitly look for, examples include:
- jsonrpc
- id
- params (where parameters are not expected)
- Any ignored field
The JSON-RPC 2.0 specification does state that the response id should be null upon errors in detecting the request id, although in this case, this is invalid JSON and should not make it this far. The response will contain the default id: 0 in this case.
Example:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":-,"id":-,"params":-,"IGNORED_FIELD":-,"method":"get_block_count"}' \
-H 'Content-Type: application/json'
JSON-RPC strictness
This is a list of behavior that monerod's JSON-RPC implementation allows, that Cuprate's JSON-RPC implementation does not.
In general, monerod's JSON-RPC is quite lenient, going against the specification in many cases.
Cuprate's JSON-RPC implementation is slightly more strict.
Cuprate also makes some decisions that are different than monerod, but are not necessarily more or less strict.
Allowing an incorrect jsonrpc field
The JSON-RPC 2.0 specification states that the jsonrpc field must be exactly "2.0".
monerod allows jsonrpc to:
- Be any string
- Be an empty array
- Be null
- Not exist at all
Examples:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"???","method":"get_block_count"}' \
-H 'Content-Type: application/json'
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":[],"method":"get_block_count"}' \
-H 'Content-Type: application/json'
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":null,"method":"get_block_count"}' \
-H 'Content-Type: application/json'
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"method":"get_block_count"}' \
-H 'Content-Type: application/json'
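For contrast with the lenient behavior shown in the requests above, a strict check of the jsonrpc field might look like the sketch below. The function name, the error type, and the use of Option to model a missing field are assumptions of this example, not the cuprate-json-rpc API.

```rust
/// Error for a request whose `jsonrpc` field is not exactly "2.0".
#[derive(Debug, PartialEq, Eq)]
struct InvalidVersion;

/// Strict check: the field must be present and must be the string "2.0".
/// `None` models a missing field (an assumption of this sketch).
fn check_version(jsonrpc: Option<&str>) -> Result<(), InvalidVersion> {
    match jsonrpc {
        Some("2.0") => Ok(()),
        _ => Err(InvalidVersion),
    }
}

fn main() {
    println!("{:?}", check_version(Some("2.0")));
    println!("{:?}", check_version(Some("???")));
    println!("{:?}", check_version(None));
}
```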
Allowing id to be any type
JSON-RPC 2.0 responses must contain the same id as the original request.
However, the specification states:
An identifier established by the Client that MUST contain a String, Number, or NULL value if included
monerod does not check this and allows id to be any JSON type, for example, a map:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","id":{"THIS":{"IS":"ALLOWED"}},"method":"get_block_count"}' \
-H 'Content-Type: application/json'
The response:
{
"id": {
"THIS": {
"IS": "ALLOWED"
}
},
"jsonrpc": "2.0",
"result": {
"count": 3210225,
"status": "OK",
"untrusted": false
}
}
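One way to enforce the three id shapes the specification permits is to model them as an enum, so that any other JSON type simply has no representation. The sketch below is a toy: the Id enum is a simplification (numbers are limited to u64), and parse_id is a hypothetical helper that only understands bare tokens without escape sequences, not a real JSON parser.

```rust
/// The three `id` shapes the JSON-RPC 2.0 specification permits.
/// Simplification: numbers are restricted to `u64` here.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Id {
    Null,
    Num(u64),
    Str(String),
}

/// Toy classifier for a raw `id` token; anything else (maps, arrays, ...) is rejected.
fn parse_id(raw: &str) -> Option<Id> {
    let raw = raw.trim();
    if raw == "null" {
        return Some(Id::Null);
    }
    if !raw.is_empty() && raw.bytes().all(|b| b.is_ascii_digit()) {
        // Overflowing numbers are rejected rather than truncated.
        return raw.parse().ok().map(Id::Num);
    }
    // Simple quoted strings only; escape sequences are out of scope for this sketch.
    let inner = raw.strip_prefix('"')?.strip_suffix('"')?;
    if inner.contains('"') || inner.contains('\\') {
        return None;
    }
    Some(Id::Str(inner.to_string()))
}

fn main() {
    println!("{:?}", parse_id("null"));
    println!("{:?}", parse_id("0"));
    println!("{:?}", parse_id("\"abc\""));
    println!("{:?}", parse_id("{\"THIS\":{\"IS\":\"ALLOWED\"}}"));
}
```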
Responding with id:0 on error
The JSON-RPC specification states:
If there was an error in detecting the id in the Request object (e.g. Parse error/Invalid Request), it MUST be Null.
However, monerod will respond with id:0 in these cases.
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","id":asdf,"method":"get_block_count"}' \
-H 'Content-Type: application/json'
Response:
{
"error": {
"code": -32700,
"message": "Parse error"
},
"id": 0,
"jsonrpc": "2.0"
}
Responding to notifications
TODO: decide on Cuprate behavior https://github.com/Cuprate/cuprate/pull/233#discussion_r1704611186
Requests that have no id field are "notifications".
The JSON-RPC 2.0 specification states that requests without an id field must not be responded to.
Example:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsonrpc":"2.0","method":"get_block_count"}' \
-H 'Content-Type: application/json'
Upper/mixed case fields
monerod will accept upper/mixed case fields on:
- jsonrpc
- id
The method field, however, is checked.
The JSON-RPC 2.0 specification does not outright state what case to support. Cuprate only supports lowercase, as serde is case-sensitive on struct fields by default and supporting upper/mixed case would require additional code.
Example:
curl \
http://127.0.0.1:18081/json_rpc \
-d '{"jsONrPc":"2.0","iD":0,"method":"get_block_count"}' \
-H 'Content-Type: application/json'
HTTP methods
monerod endpoints support multiple HTTP methods that do not necessarily make sense.
For example:
curl \
http://127.0.0.1:18081/get_limit \
-H 'Content-Type: application/json' \
--request DELETE
This sends an HTTP DELETE request where a GET would be expected.
monerod will respond to this the same as GET, POST, PUT, and TRACE.
Cuprate's behavior
TODO: decide allowed HTTP methods for Cuprate https://github.com/Cuprate/cuprate/pull/233#discussion_r1700934928.
RPC payment
The RPC payment system in monerod is a pseudo-deprecated system that allows node operators to be compensated for RPC usage.
Although this system is pseudo-deprecated, monerod still generates related fields in responses. Cuprate follows this behavior.
However, the associated endpoints and actual functionality are not supported by Cuprate. The associated endpoints will return an error upon invocation.
TODO: decide on behavior and document https://github.com/Cuprate/cuprate/pull/233#discussion_r1700870051.
Custom strings
Many JSON response fields contain strings with custom messages.
This may be error messages, status, etc.
Although the field + string type will be followed, Cuprate will not always have the exact same message, particularly when it comes to error messages.
Unsupported RPC calls
TODO: compile unsupported RPC calls after handlers are created.
RPC calls with different behavior
TODO: compile RPC calls with different behavior after handlers are created.
⚪️ ZMQ
TODO
⚪️ Consensus
⚪️ Verifier
⚪️ TODO
⚪️ Networking
⚪️ P2P
⚪️ Dandelion++
⚪️ Proxy
⚪️ Tor
⚪️ i2p
⚪️ IPv4/IPv6
Instrumentation
Cuprate is built with instrumentation in mind.
⚪️ Logging
⚪️ Data collection
⚪️ Binary
⚪️ CLI
⚪️ Config
⚪️ Logging
Resources
⚪️ File system
Index of PATHs
This is an index of all of the filesystem PATHs Cuprate actively uses.
The cuprate_helper::fs module defines the general locations used throughout Cuprate.
dirs is used internally, which follows the PATH standards/conventions on each OS Cuprate supports, i.e.:
- the XDG base directory and the XDG user directory specifications on Linux
- the Known Folder system on Windows
- the Standard Directories on macOS
Cache
Cuprate's cache directory.
OS | PATH |
---|---|
Windows | C:\Users\Alice\AppData\Local\Cuprate\ |
macOS | /Users/Alice/Library/Caches/Cuprate/ |
Linux | /home/alice/.cache/cuprate/ |
Config
Cuprate's config directory.
OS | PATH |
---|---|
Windows | C:\Users\Alice\AppData\Roaming\Cuprate\ |
macOS | /Users/Alice/Library/Application Support/Cuprate/ |
Linux | /home/alice/.config/cuprate/ |
Data
Cuprate's data directory.
OS | PATH |
---|---|
Windows | C:\Users\Alice\AppData\Roaming\Cuprate\ |
macOS | /Users/Alice/Library/Application Support/Cuprate/ |
Linux | /home/alice/.local/share/cuprate/ |
Blockchain
Cuprate's blockchain directory.
OS | PATH |
---|---|
Windows | C:\Users\Alice\AppData\Roaming\Cuprate\blockchain\ |
macOS | /Users/Alice/Library/Application Support/Cuprate/blockchain/ |
Linux | /home/alice/.local/share/cuprate/blockchain/ |
Transaction pool
Cuprate's transaction pool directory.
OS | PATH |
---|---|
Windows | C:\Users\Alice\AppData\Roaming\Cuprate\txpool\ |
macOS | /Users/Alice/Library/Application Support/Cuprate/txpool/ |
Linux | /home/alice/.local/share/cuprate/txpool/ |
Database
Cuprate's database location/filenames depend on:
- Which database it is
- Which backend is being used
cuprate_blockchain files are in the above mentioned blockchain folder.
cuprate_txpool files are in the above mentioned txpool folder.
If the heed backend is being used, these files will be created:
Filename | Purpose |
---|---|
data.mdb | Main data file |
lock.mdb | Database lock file |
For example: /home/alice/.local/share/cuprate/blockchain/lock.mdb.
If the redb backend is being used, these files will be created:
Filename | Purpose |
---|---|
data.redb | Main data file |
For example: /home/alice/.local/share/cuprate/txpool/data.redb.
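The path composition described in this section can be sketched as below. The Backend enum and both functions are hypothetical helpers for illustration only; they are not part of cuprate_helper::fs, and the data directory is passed in rather than discovered.

```rust
use std::path::{Path, PathBuf};

/// The two database backends named in this section.
#[derive(Clone, Copy)]
enum Backend {
    Heed,
    Redb,
}

/// File names each backend creates, per the tables above.
fn database_files(backend: Backend) -> &'static [&'static str] {
    match backend {
        Backend::Heed => &["data.mdb", "lock.mdb"],
        Backend::Redb => &["data.redb"],
    }
}

/// Join the data directory, the database folder, and each file name.
fn database_paths(data_dir: &Path, folder: &str, backend: Backend) -> Vec<PathBuf> {
    database_files(backend)
        .iter()
        .map(|file| data_dir.join(folder).join(file))
        .collect()
}

fn main() {
    let data_dir = Path::new("/home/alice/.local/share/cuprate");
    for path in database_paths(data_dir, "blockchain", Backend::Heed) {
        println!("{}", path.display());
    }
}
```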
Sockets
Index of ports
This is an index of all of the network sockets Cuprate actively uses.
⚪️ Memory
Concurrency and parallelism
It is incumbent upon software like Cuprate to take advantage of today's highly parallel hardware as much as practically possible.
With that said, programs must set up guardrails when operating in a concurrent and parallel manner, for correctness and safety.
There are "synchronization primitives" that help with this, common ones being:
- Locks
- Channels
- Atomics
These tools are relatively easy to use in isolation, but trickier to use well when considering the entire system. It is not uncommon for the bottleneck to be the poor orchestration of these primitives.
Analogy
A common analogy for a parallel system is an intersection.
Like a parallel computer system, an intersection contains:
- Parallelism: multiple individual units that want to move around (cars, pedestrians, etc)
- Synchronization primitives: traffic lights, car lights, walk signals
In theory, the amount of "work" the units can do is only limited by the speed of the units themselves, but in practice, the slow cascading reaction speeds between all units, the frequent hiccups that can occur, and the synchronization primitives themselves become bottlenecks far before the maximum speed of any unit is reached.
A car that hogs the middle of the intersection on the wrong light is akin to a system thread holding onto a lock longer than it should - it degrades total system output.
Unlike humans however, computer systems at least have the potential to move at lightning speeds, but only if the above synchronization primitives are used correctly.
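As a minimal sketch of the point about lock hold times (all names here are illustrative, not Cuprate code), each thread below does its work before taking the lock and holds the Mutex only long enough to publish its result:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for work that does not need the shared state.
fn expensive_work(x: u64) -> u64 {
    (0..1_000).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i))
}

/// Each thread computes outside the lock and only locks to publish its result.
fn sum_in_parallel(threads: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // No lock is held during the expensive part.
                let result = expensive_work(i) % 10;
                // The lock is held only for this short update.
                let mut guard = total.lock().unwrap();
                *guard += result;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let value = *total.lock().unwrap();
    value
}

fn main() {
    println!("{}", sum_in_parallel(4));
}
```

Had the lock been taken before calling expensive_work, the threads would have run one after another, which is the software equivalent of the car blocking the intersection.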
Goal
To aid the long-term maintenance of highly concurrent and parallel code, this section documents:
- All system threads spawned and maintained
- All major sections where synchronization primitives are used
- The asynchronous behavior of some components
and how these compose together efficiently in Cuprate.
⚪️ Map
⚪️ The RPC server
⚪️ The database
⚪️ The block downloader
⚪️ The verifier
⚪️ Thread exit
Index of threads
This is an index of all of the system threads Cuprate actively uses.
⚪️ External Monero libraries
⚪️ Cryptonight
RandomX
https://github.com/tari-project/randomx-rs
monero_serai
https://github.com/serai-dex/serai/tree/develop/coins/monero
Benchmarking
Cuprate has 2 types of benchmarks:
- Criterion benchmarks
- cuprate-benchmark benchmarks
Criterion is used for micro benchmarks; they time single functions, groups of functions, and generally are small in scope.
cuprate-benchmark and cuprate-benchmark-lib are custom in-house crates Cuprate uses for macro benchmarks; these test sub-systems, sections of a sub-system, or otherwise larger or more complicated code that isn't well-suited for micro benchmarks.
File layout and purpose
All benchmarking related files are in the benches/ folder.
This directory is organized like such:
Directory | Purpose |
---|---|
benches/criterion/ | Criterion (micro) benchmarks |
benches/criterion/cuprate-* | Criterion benchmarks for the crate with the same name |
benches/benchmark/ | Cuprate's custom benchmarking files |
benches/benchmark/bin | The cuprate-benchmark crate; the actual binary run that links all benchmarks |
benches/benchmark/lib | The cuprate-benchmark-lib crate; the benchmarking framework all benchmarks plug into |
benches/benchmark/cuprate-* | cuprate-benchmark benchmarks for the crate with the same name |
Criterion
Each sub-directory in benches/criterion/ is a crate that uses Criterion for timing single functions and/or groups of functions.
They are generally small in scope.
Creating
Creating a new Criterion-based benchmarking crate for one of Cuprate's crates is relatively simple, although it requires knowledge of how to use Criterion first:
- Read the Getting Started section of https://bheisler.github.io/criterion.rs/book
- Copy benches/criterion/example as a base
- Get started
Naming
New benchmark crates using Criterion should:
- Be in benches/criterion/
- Be in the cuprate-criterion-$CRATE_NAME format
For a real example, see: cuprate-criterion-json-rpc.
Workspace
Finally, make sure to add the benchmark crate to the workspace Cargo.toml file.
Your benchmark is now ready to be run.
Running
To run all Criterion benchmarks, run this from the repository root:
cargo bench
To run specific package(s), use:
cargo bench --package $CRITERION_BENCHMARK_CRATE_NAME
For example:
cargo bench --package cuprate-criterion-json-rpc
cuprate-benchmark
Cuprate has 2 custom crates for general benchmarking:
- cuprate-benchmark; the actual binary crate that is run
- cuprate-benchmark-lib; the library that other crates hook into
The abstract purpose of cuprate-benchmark is very simple:
- Set-up the benchmark
- Start timer
- Run benchmark
- Output data
cuprate-benchmark runs the benchmarks found in benches/benchmark/cuprate-*.
cuprate-benchmark-lib defines the Benchmark trait that all benchmark crates implement to "plug-in" to the benchmarking harness.
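The set-up, start timer, run, output sequence can be sketched with a small trait-based harness. This is a simplified, hypothetical trait written for illustration; the real cuprate_benchmark_lib::Benchmark trait is documented in its crate documentation and may be shaped differently.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified benchmark trait (not the real `cuprate_benchmark_lib` trait).
trait Benchmark {
    /// Input produced by the set-up step.
    type Input;
    /// Name used when reporting.
    const NAME: &'static str;
    /// Set-up work, excluded from the timing.
    fn setup() -> Self::Input;
    /// The code being measured.
    fn run(input: Self::Input);
}

/// Set up, start the timer, run, and record the elapsed time.
fn run_benchmark<B: Benchmark>(timings: &mut Vec<(&'static str, Duration)>) {
    let input = B::setup();
    let start = Instant::now();
    B::run(input);
    timings.push((B::NAME, start.elapsed()));
}

/// Example benchmark: summing a vector of integers.
struct SumBenchmark;

impl Benchmark for SumBenchmark {
    type Input = Vec<u64>;
    const NAME: &'static str = "sum";

    fn setup() -> Vec<u64> {
        (0..1_000_000).collect()
    }

    fn run(input: Vec<u64>) {
        // `black_box` keeps the optimizer from discarding the work.
        std::hint::black_box(input.iter().sum::<u64>());
    }
}

fn main() {
    let mut timings = Vec::new();
    run_benchmark::<SumBenchmark>(&mut timings);
    // Output step: print each recorded timing.
    for (name, elapsed) in &timings {
        println!("{name}: {elapsed:?}");
    }
}
```

Separating setup from run keeps allocation and data preparation out of the measured region, so only the code of interest is timed.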
Diagram
A diagram displaying the relation between cuprate-benchmark and related crates.
                    ┌─────────────────────┐
                    │  cuprate_benchmark  │
                    │ (actual binary ran) │
                    └──────────┬──────────┘
            ┌──────────────────┴───────────────────┐
            │        cuprate_benchmark_lib         │
            │ ┌───────────────────────────────────┐│
            │ │          trait Benchmark          ││
            │ └───────────────────────────────────┘│
            └──────────────────┬───────────────────┘
┌───────────────────────────┐  │  ┌───────────────────────────┐
│ cuprate_benchmark_example ├──┼──┤    cuprate_benchmark_*    │
└───────────────────────────┘  │  └───────────────────────────┘
┌───────────────────────────┐  │  ┌───────────────────────────┐
│    cuprate_benchmark_*    ├──┴──┤    cuprate_benchmark_*    │
└───────────────────────────┘     └───────────────────────────┘
Creating
New benchmarks are plugged into cuprate-benchmark by:
- Implementing cuprate_benchmark_lib::Benchmark
- Registering the benchmark in the cuprate_benchmark binary
See benches/benchmark/example for an example.
Creating the benchmark crate
Before plugging into cuprate-benchmark, your actual benchmark crate must be created:
- Create a new crate inside benches/benchmark (consider copying benches/benchmark/example as a base)
- Pull in cuprate_benchmark_lib as a dependency
- Create a benchmark
- Implement cuprate_benchmark_lib::Benchmark
New benchmark crates using cuprate-benchmark should:
- Be in benches/benchmark/
- Be in the cuprate-benchmark-$CRATE_NAME format
For a real example, see: cuprate-benchmark-database.
cuprate_benchmark_lib::Benchmark
This is the trait that standardizes all benchmarks run under cuprate-benchmark.
It must be implemented by your benchmarking crate.
See cuprate-benchmark-lib crate documentation for a user-guide: https://doc.cuprate.org/cuprate_benchmark_lib.
Adding a feature to cuprate-benchmark
After your benchmark's behavior is defined, it must be registered in the binary that is actually run: cuprate-benchmark.
If your benchmark is new, add a new crate feature to cuprate-benchmark's Cargo.toml file with an optional dependency to your benchmarking crate.
Please remember to edit the feature table in the README.md as well!
Adding to cuprate-benchmark's main()
After adding your crate's feature, add a conditional line to the main() function that runs the benchmark if the feature is enabled.
For example, if your crate's name is egg:
cfg_if! {
    if #[cfg(feature = "egg")] {
        run::run_benchmark::<cuprate_benchmark_egg::Benchmark>(&mut timings);
    }
}
Workspace
Finally, make sure to add the benchmark crate to the workspace Cargo.toml file.
Your benchmark is now ready to be run.
Running
cuprate-benchmark benchmarks are run with this command:
cargo run --release --package cuprate-benchmark --features $BENCHMARK_CRATE_FEATURE
For example, to run the example benchmark:
cargo run --release --package cuprate-benchmark --features example
Use the all feature to run all benchmarks:
# Run all benchmarks
cargo run --release --package cuprate-benchmark --features all
⚪️ Testing
⚪️ Monero data
⚪️ RPC client
⚪️ Spawning monerod
⚪️ Known issues and tradeoffs
⚪️ Networking
⚪️ RPC
⚪️ Storage
Monero oddities
This section is a list of any peculiar, interesting, or non-standard behavior that Monero has that is not planned on being changed or deprecated.
This section exists to hold all the small yet noteworthy knowledge in one place, instead of in any single contributor's mind.
These are usually behaviors stemming from implementation rather than protocol/cryptography.
Formatting
This is the markdown formatting for each entry in this section.
If applicable, consider using this formatting when adding to this section.
# <concise_title_of_the_behavior>
## What
A detailed description of the behavior.
## Expected
The norm or standard behavior that is usually expected.
## Why
The reasoning behind why this behavior exists and/or
any links to more detailed discussion on the behavior.
## Affects
A (potentially non-exhaustive) list of places that this behavior can/does affect.
## Example
An example link or section of code where the behavior occurs.
## Source
A link to original `monerod` code that defines the behavior.
Little-endian IPv4 addresses
What
Monero encodes IPv4 addresses in little-endian byte order.
Expected
In general, networking-related protocols/code use network byte order (big-endian).
Why
TODO
- https://github.com/monero-project/monero/issues/3826
- https://github.com/monero-project/monero/pull/5544
Affects
Any representation and (de)serialization of IPv4 addresses must keep little-endian byte order in mind, e.g. the P2P wire format or int-encoded IPv4 addresses in RPC.
For example, the ip field in set_bans.
For Cuprate, this means Rust's Ipv4Addr::from_bits/from cannot be used in these cases as they assume big-endian encoding.
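A small sketch of handling the little-endian encoding with the standard library is shown below. The two function names are hypothetical helpers for this example; the only standard-library calls used are u32::to_le_bytes, u32::from_le_bytes, Ipv4Addr::octets, and the From conversions on Ipv4Addr.

```rust
use std::net::Ipv4Addr;

/// Decode an integer-encoded IPv4 address stored in little-endian byte order.
fn ipv4_from_le_u32(ip: u32) -> Ipv4Addr {
    // `to_le_bytes` yields the octets in the order they appear in the address.
    Ipv4Addr::from(ip.to_le_bytes())
}

/// Encode an IPv4 address as a little-endian integer.
fn ipv4_to_le_u32(ip: Ipv4Addr) -> u32 {
    u32::from_le_bytes(ip.octets())
}

fn main() {
    // 0x0100_007F: the octets 127, 0, 0, 1 stored least-significant byte first.
    let encoded: u32 = 16_777_343;

    // Little-endian aware decoding.
    println!("{}", ipv4_from_le_u32(encoded));

    // `Ipv4Addr::from(u32)` assumes big-endian and yields a reversed address.
    println!("{}", Ipv4Addr::from(encoded));
}
```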
Source
Appendix
Crates
This is an index of all of Cuprate's in-house crates it uses and maintains.
They are categorized into groups.
Crate documentation for each crate can be found by clicking the crate name or by visiting https://doc.cuprate.org. Documentation can also be built manually by running this at the root of the cuprate repository:
cargo doc --package $CRATE
For example, this will generate and open cuprate-blockchain documentation:
cargo doc --open --package cuprate-blockchain
Consensus
Crate | In-tree path | Purpose |
---|---|---|
cuprate-consensus | consensus/ | TODO |
cuprate-consensus-context | consensus/context/ | TODO |
cuprate-consensus-rules | consensus/rules/ | TODO |
cuprate-fast-sync | consensus/fast-sync/ | Fast block synchronization |
Networking
Crate | In-tree path | Purpose |
---|---|---|
cuprate-epee-encoding | net/epee-encoding/ | Epee (de)serialization |
cuprate-fixed-bytes | net/fixed-bytes/ | Fixed byte containers backed by bytes::Bytes |
cuprate-levin | net/levin/ | Levin bucket protocol implementation |
cuprate-wire | net/wire/ | TODO |
P2P
Crate | In-tree path | Purpose |
---|---|---|
cuprate-address-book | p2p/address-book/ | TODO |
cuprate-async-buffer | p2p/async-buffer/ | A bounded SPSC, FIFO, asynchronous buffer that supports arbitrary weights for values |
cuprate-dandelion-tower | p2p/dandelion-tower/ | TODO |
cuprate-p2p | p2p/p2p/ | TODO |
cuprate-p2p-bucket | p2p/bucket/ | A collection data structure discriminating its items into "buckets" of limited size |
cuprate-p2p-core | p2p/p2p-core/ | TODO |
Storage
Crate | In-tree path | Purpose |
---|---|---|
cuprate-blockchain | storage/blockchain/ | Blockchain database built on-top of cuprate-database & cuprate-database-service |
cuprate-database | storage/database/ | Pure database abstraction |
cuprate-database-service | storage/database-service/ | tower::Service + thread-pool abstraction built on-top of cuprate-database |
cuprate-txpool | storage/txpool/ | Transaction pool database built on-top of cuprate-database & cuprate-database-service |
RPC
Crate | In-tree path | Purpose |
---|---|---|
cuprate-json-rpc | rpc/json-rpc/ | JSON-RPC 2.0 implementation |
cuprate-rpc-types | rpc/types/ | Monero RPC types and traits |
cuprate-rpc-interface | rpc/interface/ | RPC interface & routing |
cuprate-rpc-handler | rpc/handler/ | RPC inner handlers |
ZMQ
Crate | In-tree path | Purpose |
---|---|---|
cuprate-zmq-types | zmq/types/ | Message types for ZMQ Pub/Sub interface |
1-off crates
Crate | In-tree path | Purpose |
---|---|---|
cuprate-constants | constants/ | Shared const/static data across Cuprate |
cuprate-cryptonight | cryptonight/ | CryptoNight hash functions |
cuprate-pruning | pruning/ | Monero pruning logic/types |
cuprate-helper | helper/ | Kitchen-sink helper crate for Cuprate |
cuprate-test-utils | test-utils/ | Testing utilities for Cuprate |
cuprate-types | types/ | Shared types across Cuprate |
Benchmarks
Crate | In-tree path | Purpose |
---|---|---|
cuprate-benchmark | benches/benchmark/bin/ | Cuprate benchmarking binary |
cuprate-benchmark-lib | benches/benchmark/lib/ | Cuprate benchmarking library |
cuprate-benchmark-* | benches/benchmark/cuprate-* | Benchmark for a Cuprate crate that uses cuprate-benchmark |
cuprate-criterion-* | benches/criterion/cuprate-* | Benchmark for a Cuprate crate that uses Criterion |
Contributing
https://github.com/Cuprate/cuprate/blob/main/CONTRIBUTING.md
Build targets
- x86
- ARM64
- Windows
- Linux
- macOS
- FreeBSD(?)
Protocol book
https://monero-book.cuprate.org