Observability and Telemetry
Observability is anything that helps answer the question “what is this program doing?” This includes logs, timing metrics, endpoint call counts, and so on.
The Caliper SDK has several built-in observability layers.
Note
We do not currently have a way for external developers to pull their own telemetry stats.
Logging
Logging is covered in detail in several other places throughout the documentation. When running locally, logs stream directly to your terminal. In the datacenter, external developers view them through Q2Developer.com Self Service, and internal Q2 developers through Caliper Deployment Manager (CDM).
- Access for local dev:
Streams directly to your terminal
Can be configured via log levels (ex. q2 run -l DEBUG)
Can be redirected to a file (ex. q2 run | tee out.log)
- Access in datacenter:
- Q2 developers
Caliper Deployment Manager (CDM) (internal devs)
Alexandria (internal devs)
- External developers
Q2Developer.com Self Service
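To make the effect of a log level concrete, here is a minimal sketch using only Python's standard logging module. It is not the SDK's logger setup, and the logger name "demo" and the messages are hypothetical; it only illustrates that a DEBUG threshold lets through messages that an INFO threshold drops.

```python
import io
import logging


def emit_at_level(level_name):
    """Log one message per severity and return the lines that pass the threshold."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    # "demo" is a hypothetical logger name, not one the SDK defines.
    logger = logging.getLogger(f"demo.{level_name}")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(getattr(logging, level_name))

    logger.debug("cache lookup details")
    logger.info("request handled")
    logger.warning("slow upstream call")
    return stream.getvalue().splitlines()


debug_lines = emit_at_level("DEBUG")
info_lines = emit_at_level("INFO")
print(debug_lines)
print(info_lines)
```

With the threshold at DEBUG all three messages appear; at INFO the debug message is filtered out.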
OpenTelemetry
OpenTelemetry was added in SDK version 2.151.0 as a one-size-fits-most observability solution. It captures granular tracing information at the function level while also providing aggregated metrics to drill down through. OpenTelemetry is a specification rather than a product, so its data is usually visualized through another interface such as Jaeger or Splunk. Assuming you have Docker installed on your machine, here’s a convenient command to boot up a local collector:
$ docker run -d --rm --name jaeger \
    -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
    -e COLLECTOR_OTLP_ENABLED=true \
    -p 6831:6831/udp \
    -p 6832:6832/udp \
    -p 5778:5778 \
    -p 16686:16686 \
    -p 4317:4317 \
    -p 4318:4318 \
    -p 14250:14250 \
    -p 14268:14268 \
    -p 14269:14269 \
    -p 9411:9411 \
    jaegertracing/all-in-one:1.36 > /dev/null
With that done, you can enable OpenTelemetry in your local SDK service by setting the Q2SDK_ENABLE_OPEN_TELEMETRY=True environment variable:
$ Q2SDK_ENABLE_OPEN_TELEMETRY=True q2 run
The logs will then include this line:
OpenTelemetry Enabled: True
By default, the SDK keeps track of:
- Handler Metrics:
Duration
HTTP method
Extension name
Response StatusCode
Request ID
- Q2 Requests Network Calls
Duration
Headers
HTTP Method
Response StatusCode
URL called
- HQ Calls
Duration
Endpoint
Module (Q2Api/WedgeOnlineBanking)
HQ Url
Args
- Vault
Duration
Errors
Path requested
- SDK Metadata
Version
Adding your own spans is possible:
from q2_sdk.core.opentelemetry.span import Q2Span

@Q2Span.instrument()
def function_name(param_name):
    # do work
    pass
This adds a span called function_name that attaches itself to the appropriate location in the request hierarchy. Timing and tracking of param_name are handled automatically.
To customize further, such as overriding the span name, hiding a parameter from the tags, or changing the SpanKind in case OpenTelemetry requires it for metric dashboarding:
from q2_sdk.core.opentelemetry.span import Q2Span, SpanKind

@Q2Span.instrument(name="custom_span_name", skip=['secret_param'], kind=SpanKind.SERVER)
def function_name(param_name, secret_param):
    # do work
    pass
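As a rough illustration of what hiding a parameter from tags means, the sketch below maps a call's arguments to tag names and leaves out anything listed in skip. It uses only the standard library; tags_for_call is a hypothetical helper and is not how Q2Span is implemented.

```python
import inspect


def tags_for_call(func, args, kwargs, skip=()):
    """Map a call's arguments to tag names, leaving out any listed in `skip`."""
    # Bind positional and keyword arguments to the function's parameter names.
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return {
        name: value
        for name, value in bound.arguments.items()
        if name not in skip
    }


def function_name(param_name, secret_param):
    return None


# Hypothetical call: the secret value never reaches the tag dictionary.
tags = tags_for_call(function_name, ("abc", "hunter2"), {}, skip=["secret_param"])
print(tags)
```

Filtering by parameter name rather than by value means a sensitive argument is excluded no matter what it contains.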
Want to add attributes (fields) to your spans? No problem! There’s a helper for that:
from q2_sdk.core.opentelemetry.span import Q2Span

@Q2Span.instrument()
def function_name():
    Q2Span.set_attribute('spam', 'eggs')  # One attribute
    Q2Span.set_attributes(
        {
            'spam': 'eggs',
            'foo': 'bar',
        }
    )  # Multiple attributes
Or log an event on your span:
from q2_sdk.core.opentelemetry.span import Q2Span

@Q2Span.instrument()
def function_name():
    Q2Span.add_event("EventName", {"value": "EventValue"})
In all these cases, Q2Span will work in the context of the existing span it was called from. No need to do fancy passing along of state!
If we implement all these in an extension, we get an output that looks like the following:

Prometheus
Prometheus provides aggregated data such as how many times each endpoint was called and its average response time. If you’ve noticed the built-in /metrics endpoint in your running SDK extension, that is the exposed data that the Prometheus tooling inside Q2 polls on a regular basis.
By default, the SDK keeps track of:
- Handler Metrics:
Histogram of call Durations
HTTP method
Extension name
Response StatusCode
- Q2 Requests Network Calls
Histogram of call Durations
HTTP Method
URL called
- HQ Calls
Histogram of call Durations
Endpoint Name
Module (Q2Api/WedgeOnlineBanking)
HQ Url
- Cache
Request type (get/set/delete)
Histogram of call Durations
- SDK Metadata
Failed forked process count
Requests in progress
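For readers unfamiliar with the term, a "histogram of call durations" in the Prometheus sense counts observations into cumulative "less than or equal" buckets alongside a running sum and count. The sketch below shows that idea with the standard library only. DurationHistogram is a hypothetical class, and the bucket bounds are illustrative assumptions, not the buckets the SDK uses.

```python
import bisect


class DurationHistogram:
    """Counts observations into cumulative 'less than or equal' buckets."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One counter per finite bound plus a final +Inf bucket.
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)
        self.total = 0.0
        self.count = 0

    def observe(self, seconds):
        # bisect_left finds the first bound that is >= seconds.
        index = bisect.bisect_left(self.upper_bounds, seconds)
        self.bucket_counts[index] += 1
        self.total += seconds
        self.count += 1

    def cumulative(self):
        """Return (bound, number of observations <= bound) pairs."""
        running = 0
        rows = []
        bounds = self.upper_bounds + [float("inf")]
        for bound, n in zip(bounds, self.bucket_counts):
            running += n
            rows.append((bound, running))
        return rows


# Hypothetical bounds (seconds) and durations, chosen only for illustration.
hist = DurationHistogram([0.1, 0.5, 1.0])
for duration in (0.05, 0.2, 0.3, 0.7, 2.0):
    hist.observe(duration)

rows = hist.cumulative()
mean = hist.total / hist.count
print(rows, mean)
```

Dividing the running sum by the count gives the average response time, while the buckets show how the durations are distributed.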
We generally prefer OpenTelemetry for this type of data these days, but it is still possible to add your own Prometheus metrics:
from q2_sdk.core.prometheus import MetricType, get_metric

# Call this from inside a handler method, where self.extension_name is available
get_metric(
    MetricType.Counter,
    'caliper_endpoints',
    'Number of times called',
    labels={'endpoint': self.extension_name},
).inc()
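The labels argument matters because each distinct label set is tracked as its own series. The sketch below imitates that behavior with the standard library only. LabeledCounter and the extension names are hypothetical and do not reflect how q2_sdk.core.prometheus stores metrics.

```python
from collections import defaultdict


class LabeledCounter:
    """Hypothetical counter that keeps one running total per label set."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self._values = defaultdict(float)

    @staticmethod
    def _key(labels):
        # Sort so that label order does not create separate series.
        return tuple(sorted(labels.items()))

    def inc(self, labels, amount=1.0):
        self._values[self._key(labels)] += amount

    def value(self, labels):
        return self._values.get(self._key(labels), 0.0)


counter = LabeledCounter("caliper_endpoints", "Number of times called")
counter.inc({"endpoint": "ExtensionA"})
counter.inc({"endpoint": "ExtensionA"})
counter.inc({"endpoint": "ExtensionB"})

print(counter.value({"endpoint": "ExtensionA"}))
print(counter.value({"endpoint": "ExtensionB"}))
```

Labeling by extension name, as the snippet above does, keeps call counts for different extensions separate under a single metric name.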