amaru uses the tracing crate and the OpenTelemetry infrastructure for observability: logs, traces and metrics.
These concepts sit at the frontier between devs, users and DevOps, and require specific care so that their full potential can be realized.
While the current observability implementation in amaru is a great foundation, we identified some shortcomings. This is an attempt to improve the situation so that everyone feels more confident in their day-to-day use of these tools.
- For amaru devs: how to figure out when and how to add traces?
- For amaru users (DevOps/pool operators/devs): consistency, easy discoverability and confidence (i.e. API stability)
- For DevOps specifically: global coherency (i.e. are there holes in the global picture), is the granularity well-defined?
This means that Tracing is considered a product interface.
Internally amaru uses the tracing crate as a means to express both traces and logs.
It also uses the opentelemetry crates to:
- Define the semantics of traces and spans: they are logically connected via `span_id` and can be nested via `parent_id`.
- Configure the collection of observability data.
- Configure the export of observability data.
The semantics provided by the tracing and opentelemetry libraries are a bit different and can appear confusing at
first.
The tracing crate does not directly represent traces and logs. It generates two kinds of diagnostic information:
spans
From the documentation:
Unlike a log line that represents a moment in time, a span represents a period of time with a beginning and an end.
spans are created via the span! macro and its derivatives (e.g. info_span!).
spans are logically connected via span_id. They can be nested via parent_id.
events
From the documentation:
Events are comparable to the log records emitted by unstructured logging code, but unlike a typical log line,
an Event may occur within the context of a span.
events are created via the event! macro and its derivatives (e.g. info!).
Both spans and events are then funneled to subscribers that act as global collection points.
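As a minimal illustration (not actual amaru code), a span wraps a period of work while an event records a point in time within it:

```rust
use tracing::{info, info_span};

fn validate_block(height: u64) {
    // A span: covers the whole validation and carries structured fields.
    let span = info_span!("validate_block", block.height = height);
    let _guard = span.enter();

    // An event: a point-in-time record, attached to the surrounding span.
    info!(tx_count = 42, "block decoded");
}
```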
OpenTelemetry (OTEL for short) defines a number of concepts, including:
- log record.
- event is a specific type of log record.
- span
A trace is a logical grouping of spans based on a trace_id (see SpanContext).
Events can be attached to spans. In Rust, this is done via add_event_with_timestamp.
A log record can optionally be associated with a span (via the trace_id and span_id fields) if it is emitted while a span is active.
A tracing event emitted with the event! macro or its derivatives is collected as an OTEL log record.
A tracing span emitted with the span! macro or its derivatives is collected as an OTEL span.
NOTE: tracing events are not equivalent to OpenTelemetry span events. amaru currently doesn't create any span event.
Nonetheless, a tracing event (e.g. info!) is linked to the surrounding span at runtime and collected by the OTEL libraries as a log record connected to that span, but not as an OTEL span event.
In amaru the observability data is collected via a Registry,
which stacks several Layers offering different behaviours for
collecting and exporting observability data.
- A JSON "formatting layer" (created with
tracing::fmt::layer()) to collect spans and events when they are emitted, and format them as JSON to be written tostderr. - An OTEL layer, provided by the
tracing-opentelemetrywhich also captures spans and events but to export them as OTEL spans and log records to an OpenTelemetry Collector.
In each case the configuration of the layer decides which parts of a span's lifecycle trigger the export / printing of data, and whether specific filters need to be added.
Also note that OpenTelemetry exporters offer additional configuration options for batching and exporting data.
In general all log records are exported but spans are sampled based on custom heuristics.
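A minimal sketch of this layering, assuming `tracing-subscriber` and `tracing-opentelemetry` (illustrative only, not amaru's actual initialization code; building the OTLP tracer is elided):

```rust
use tracing_subscriber::prelude::*;

fn init_observability() {
    // Formatting layer: renders spans and events as JSON on stderr.
    let fmt_layer = tracing_subscriber::fmt::layer()
        .json()
        .with_writer(std::io::stderr);

    // OTEL layer: forwards the same spans to an OpenTelemetry Collector.
    // let otel_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    tracing_subscriber::registry()
        .with(fmt_layer)
        // .with(otel_layer)
        .init();
}
```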
spans are what third parties depend on: they cannot break, and must be discoverable and documented.
events are for humans, what is typically considered logs. They are used as structured logs anyway.
Next action items would then be:
- introduce and use an alternative to the `instrument` macro, tailored to amaru's needs and strongly typed
- update the existing EDR to reflect those changes
- develop some tooling to help prevent regressions and generate schemas/documentation (i.e. a source-code parsing crate)
- integrate that tooling in amaru CI
Have a dedicated amaru-observability crate. Only this crate can depend on the tracing crate; direct use of the tracing crate is forbidden via clippy rules.
This enforces that only traces/events with the right details and leveraging Rust types can be used.
amaru-observability will define a number of macros/fns wrapping tracing. Mostly it involves reducing the API surface and introducing more typing.
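As a purely hypothetical sketch of that direction (the `Slot` type and `validate_header_span` helper are invented for illustration), the wrapper could expose typed span constructors so call sites never touch tracing directly:

```rust
// Hypothetical content of `amaru-observability`: fixed targets, typed fields.
use std::fmt;
use tracing::Span;

pub struct Slot(pub u64);

impl fmt::Display for Slot {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// The only way to create the `validate_header` span: canonical name,
/// canonical target, Display-only fields.
pub fn validate_header_span(point_slot: &Slot) -> Span {
    tracing::trace_span!(
        target: "amaru::consensus::chain_sync",
        "validate_header",
        point_slot = %point_slot,
    )
}
```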
- events end up as logs (stderr only)
- spans end up as OTEL spans (OTLP, machine-readable JSON via stdout)
- `level` is always `TRACE`
- spans have a `name`, but no `message`; a name is a simple string without any `.`
- `target` must be set; targets follow a `::`-separated tree hierarchy (e.g. `amaru::ledger::store`)
- favor fewer but bigger (i.e. more fields) spans
- `field`s are common across spans as much as possible; they have canonical types
- span events are not used
- `target` and `name` are defined using Rust types
- only use runtime `field`s if they cannot be avoided
- always prefer using the `trace` macro, including if it requires the introduction of an extra fn
- no default fields; returned value shown by default; `Result` hand-rolled
- only `Display` fields, no `Debug`
- `set_span_kind` (from `OpenTelemetrySpanExt`) might be relevant when interacting with the network (e.g. in `pure-stage`)
- when too cumbersome to trace all fields at once, use our typed version of `Span::current().record` (see the sketch below)
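The last point could build on tracing's `field::Empty` and `Span::record`; a rough sketch of the underlying mechanism (the typed amaru wrapper around it is still to be designed):

```rust
use tracing::{field, Span};

fn validate_header(point_slot: u64) {
    // Optional fields are declared up-front as empty so they can be recorded later.
    let span = tracing::trace_span!(
        target: "amaru::consensus::chain_sync",
        "validate_header",
        point_slot,
        peer_id = field::Empty,
    );
    let _guard = span.enter();

    // Once the value is known, record it on the current span.
    let peer_id = "peer-1";
    Span::current().record("peer_id", field::display(peer_id));
}
```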
Being explicit and not relying on defaults improves robustness and makes spans resistant to refactoring.
For traces, a radical idea would be to just use the trace level always. We only ever use the info level for traces that we want to show up in human logs: a better way to do that would be to just create extra logs here (e.g. using info!).
The trace hierarchy will come from the associated module hierarchy (see more details in the syntax section). Therefore trace names must not contain any `.`, just simple names.
The event! macros will make sure parent_span_id is correctly propagated in most cases, except when new threads are created. This might happen in amaru (e.g. in pure-stage) so specific care is mandatory: the parent span must be manually restored.
Alternatively, newly created threads or tasks (regular or tokio) must use in_current_span or instrument(span).
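For example (an illustrative sketch, not existing amaru code), a tokio task can be tied to its parent span explicitly:

```rust
use tracing::Instrument;

async fn start_chain_sync() {
    let span = tracing::trace_span!(target: "amaru::consensus", "chain_sync");

    // Spawned tasks do not inherit the active span automatically:
    // attach the parent span explicitly so the hierarchy is preserved.
    tokio::spawn(async { /* sync work */ }.instrument(span));

    // Or reuse whatever span is active at the spawn point.
    tokio::spawn(async { /* sync work */ }.in_current_span());
}
```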
Also consider, when implementing a subscriber, that ENTER and EXIT events might be called multiple times in an async context. CLOSE will only ever be called once, when the async block is dropped.
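For reference, this is how those lifecycle hooks surface when writing a custom tracing-subscriber Layer (a skeleton, not amaru code):

```rust
use tracing::span::Id;
use tracing_subscriber::layer::{Context, Layer};

struct LifecycleLayer;

impl<S: tracing::Subscriber> Layer<S> for LifecycleLayer {
    // In async code, a span is entered and exited on every poll of the
    // instrumented future, so these two hooks may fire many times.
    fn on_enter(&self, _id: &Id, _ctx: Context<'_, S>) {}
    fn on_exit(&self, _id: &Id, _ctx: Context<'_, S>) {}

    // Close fires exactly once, when the span is finally dropped.
    fn on_close(&self, _id: Id, _ctx: Context<'_, S>) {}
}
```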
A potential syntax could be:
```rust
// In amaru-observability/src/traces.rs
define_schemas! {
    consensus {
        chain_sync {
            VALIDATE_HEADER {
                required {
                    point_slot: Slot,
                    point_hash: Hash,
                }
                optional {
                    peer_id: PeerId
                }
            }
        }
    }
}

// In some amaru module
#[trace(consensus::chain_sync::VALIDATE_HEADER)]
fn import_headers(point_slot: Slot, point_hash: Hash) {
    do_extra_stuff(peer_id);
}

#[augment_trace(consensus::chain_sync::VALIDATE_HEADER)]
fn do_extra_stuff(peer_id: PeerId) {
}

// The `trace` macro ensures that all function arguments match the schema.
// `target` is inferred from the nested structure of the name symbol: `amaru::consensus::chain_sync`.
// All function arguments are used as `fields` (except `self`). All of them must be `Display`able.
// This might force the introduction of extra dedicated functions, but that sounds like a reasonable
// tradeoff and might guide us toward better function definitions.
// `augment_trace` adds extra fields to the current span. Annotated function arguments must match the
// associated `optional` fields.
```
Only instrument architectural boundaries.
Examples:
- public APIs
- async task entry points
- IO boundaries
- state machine transitions
Avoid:
- inner loops
- helpers
- pure functions
This keeps traces meaningful and stable.
All spans must be documented using JSON schemas. While documentation is desirable, it's pretty useless if users can't easily find it. Some mechanism must allow users to easily discover and understand which traces are emitted when.
Some JSON schema standards for documenting telemetry exist, but they focus on changes. They might be amended to also document the current state.
- https://github.com/open-telemetry/weaver
- https://github.com/open-telemetry/semantic-conventions/blob/main/model/http/spans.yaml
- https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/schemas
- https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/0152-telemetry-schemas.md
Follow semantic conventions as much as possible. OpenTelemetry has dedicated significant energy to defining generic elements that match a lot of scenarios. Using them ensures a smooth integration with the various OTEL tooling.
Some specific conventions worth considering:
CI helps us check whether commits break existing spans. This is done by relying on a yet-to-be-created tool that will parse the whole codebase to extract span definitions.
Ideally some logical links between those spans will also be extracted by this tool.
Both documentation and regression alerts are generated from those.
In an effort to prevent higher-level regressions (full traces), some testing is mandatory. Insta snapshot tests are used to ensure well-known scenarios stay consistent.
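A sketch of what such a snapshot test could look like (the scenario body is a stand-in and the buffer-capturing writer is our own assumption; only the tracing-subscriber and insta APIs are real):

```rust
use std::sync::{Arc, Mutex};
use tracing_subscriber::fmt::MakeWriter;

// Collects everything the formatting layer writes, so it can be snapshotted.
#[derive(Clone, Default)]
struct BufWriter(Arc<Mutex<Vec<u8>>>);

impl std::io::Write for BufWriter {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

impl<'a> MakeWriter<'a> for BufWriter {
    type Writer = BufWriter;
    fn make_writer(&'a self) -> Self::Writer {
        self.clone()
    }
}

#[test]
fn chain_sync_traces_are_stable() {
    let buf = BufWriter::default();
    let subscriber = tracing_subscriber::fmt()
        .json()
        .without_time() // keep the snapshot deterministic
        .with_writer(buf.clone())
        .finish();

    tracing::subscriber::with_default(subscriber, || {
        // Stand-in for a well-known scenario exercising real amaru code.
        tracing::info!(target: "amaru::consensus::chain_sync", point_slot = 42, "header validated");
    });

    let output = String::from_utf8(buf.0.lock().unwrap().clone()).unwrap();
    insta::assert_snapshot!(output);
}
```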
Seems like the whole "Event can be attached to spans. In Rust, this is done via add_event_with_timestamp." is extra? Especially because the exact same sentence can be read above 😅 ?
I disagree about the second part.
`event` should also be treated as an API and deemed both stable and consistent. The human side of observability is mostly a matter of presentation IMO. Humans don't necessarily want to read raw JSON logs (although some do!); but would rather process them through tools that allow for formatting, filtering and coloring them. Or more generally, a "pretty" output is what humans expect. But the source data is the same.

Seems like a reasonable idea to keep the log API under control. Provided that we can create a sufficiently decent and friendly API (the proc attribute macros like `instrument` from tracing are definitely handy!).

My preference would be that what ends up on stderr is a matter of severity. At least currently, this is how it's set up: only `DEBUG` and above ends up on `stderr`, unless JSON export is activated. Otherwise it has proven confusing to get two streams on both stderr and stdout.

Can we be more specific here, especially with regard to OpenTelemetry conventions? By default, targets are set to the module they're from, which makes them quite unstable. Also, if traces end up being all defined in a separate crate, it may become tricky.
So I believe it's much more practical to define namespaces based on the domain they apply to (ledger, consensus, plutus, ...) when relevant, and to otherwise follow OTEL semantic conventions when applicable (network, db, http, ...).
In particular, one thing we haven't been very careful about is leveraging span hierarchy and contexts. For example, if a top-level span already captures slot and epoch as a field, it shouldn't be necessary to add "epoch" or "slot" as a field on any of the child events or spans. Often enough, we've been duplicating stuff like this.
Although this is a bit tricky because we can't reliably see parent's span fields when defining child spans. So I am not sure how we can propagate context, and ensure that the hierarchy contains the relevant information.
This ended up being annoying back then, because some spans don't have child events (and perhaps that's the root cause of the issue), which made them unavailable to "humans" through the exported pretty logs. So I ended up adding an event on span's end iirc, so that even "empty spans" would be visible from the outside.
Not sure I follow?
`info` or `debug` have roughly the same semantics in the codebase today. I think my preference would be:

- `trace` as first choice, unless too noisy to be shown on a stderr feed
- `info` for events relevant to humans
- `warning` for unexpected behaviors that aren't immediately causing problems, but might
- `error` for things that should definitely not happen

Then remains `debug`, which we can use only for ... debugging! So, there should not be any debug on `main`, but we may add some when troubleshooting issues to easily filter on them and distinguish them from "normal" logs.