@manifest-cyber/observability-ts
Unified observability library for Manifest Cyber's TypeScript services - Prometheus metrics and OpenTelemetry tracing.
Installation
npm install @manifest-cyber/observability-ts
# Optional: for tracing features
npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
Quick Start
Metrics
import { createCounter, startMetricsServer } from '@manifest-cyber/observability-ts';
const requestsTotal = createCounter({
name: 'http_requests_total',
help: 'Total HTTP requests',
labelNames: ['method', 'status'],
});
await startMetricsServer({ serviceName: 'my-service' });
// Service name is automatically added as a label
requestsTotal.inc({ method: 'GET', status: '200' });
// Results in: http_requests_total{service_name="my-service",method="GET",status="200"}
Tracing
import { initTracing, withSpan } from '@manifest-cyber/observability-ts';
await initTracing({ serviceName: 'my-service' });
await withSpan('process.request', async (span) => {
span.setAttribute('user.id', userId);
return await processRequest();
});
Tree-Shakeable Imports
import { createCounter } from '@manifest-cyber/observability-ts/metrics';
import { initTracing } from '@manifest-cyber/observability-ts/tracing';
Features
Metrics (Prometheus)
- Counter, Gauge, Histogram, Summary
- Automatic service_name label injection
- HTTP metrics server (/metrics on port 9090)
- Timer utilities and operation tracking
- Type-safe with TypeScript generics
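To illustrate what a Histogram from the list above records, here is a minimal sketch of cumulative bucket counting as Prometheus histograms expose it. The helper name `countIntoBuckets` and the numeric bounds are assumptions made for illustration (the bounds express a 10ms to 5s range in seconds); this is not the library's implementation.

```typescript
// Hypothetical helper: counts observations into cumulative "le" buckets,
// the way Prometheus histograms expose them. Not part of the library API.
function countIntoBuckets(bounds: number[], observations: number[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bound of bounds) {
    // A bucket counts every observation less than or equal to its upper bound.
    counts.set(String(bound), observations.filter((v) => v <= bound).length);
  }
  // The +Inf bucket always equals the total number of observations.
  counts.set('+Inf', observations.length);
  return counts;
}

// Assumed bounds in seconds: 10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s.
const exampleBounds = [0.01, 0.05, 0.1, 0.5, 1, 2, 5];
const exampleCounts = countIntoBuckets(exampleBounds, [0.004, 0.42, 0.42, 3.1]);
```

Because the buckets are cumulative, bounds that are too coarse for the workload put most observations into a single bucket, which is why choosing bounds that match the expected latency range matters.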
Tracing (OpenTelemetry)
- W3C Trace Context propagation
- OTLP export (VictoriaTraces, Jaeger, Tempo)
- Automatic HTTP/gRPC/database instrumentation
- Manual span creation with createSpan() and withSpan()
- SQS and RabbitMQ trace propagation
- Logger integration for trace correlation
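W3C Trace Context propagation, listed above, carries trace identity between services in a `traceparent` header of the form `version-traceId-spanId-flags`. The following is a minimal sketch of parsing and formatting that header. The names `TraceParent`, `parseTraceParent`, and `formatTraceParent` are hypothetical and are not exports of this library, which handles the header internally.

```typescript
// Shape of a W3C `traceparent` header: version-traceId-spanId-flags.
interface TraceParent {
  version: string; // 2 hex chars, "00" in the current spec
  traceId: string; // 32 lowercase hex chars, not all zeros
  spanId: string; // 16 lowercase hex chars, not all zeros
  sampled: boolean; // lowest bit of the flags byte
}

const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper, not a library export: returns null for malformed headers.
function parseTraceParent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const [, version, traceId, spanId, flags] = match;
  // All-zero identifiers are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { version, traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Inverse of parseTraceParent; only the sampled bit of the flags is preserved.
function formatTraceParent(tp: TraceParent): string {
  return `${tp.version}-${tp.traceId}-${tp.spanId}-${tp.sampled ? '01' : '00'}`;
}
```

A downstream service that receives this header continues the same trace by reusing the `traceId` and recording the incoming `spanId` as the parent of its own spans.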
API Overview
Metrics
import { createCounter, createHistogram, createGauge, BUCKETS } from '@manifest-cyber/observability-ts/metrics';
// Counter - automatically suffixed with _total
const counter = createCounter({
name: 'operations', // becomes: operations_total
help: 'Total operations',
labelNames: ['type', 'status'],
});
// service_name label is automatically added
counter.inc({ type: 'api', status: 'success' });
// Results in: operations_total{service_name="my-service",type="api",status="success"}
// Or provide full name (won't duplicate suffix)
const counter2 = createCounter({
name: 'http_requests_total', // stays: http_requests_total
help: 'Total HTTP requests',
});
// Histogram - automatically suffixed with _seconds (or custom unit)
const histogram = createHistogram({
name: 'request_duration', // becomes: request_duration_seconds
help: 'Request duration',
labelNames: ['route', 'method'],
buckets: BUCKETS.DURATION.FAST, // Pre-configured buckets for fast operations
});
histogram.observe({ route: '/api/users', method: 'GET' }, 0.42);
// Custom unit
const fileSize = createHistogram({
name: 'file_size', // becomes: file_size_bytes
help: 'File size distribution',
unit: 'bytes',
buckets: BUCKETS.SIZE.SMALL, // Pre-configured buckets for file sizes
});
// Or provide full name (won't duplicate suffix)
const duration = createHistogram({
name: 'processing_duration_milliseconds', // stays: processing_duration_milliseconds
help: 'Processing duration',
});
// Unit detection: name wins over parameter
const conflicted = createHistogram({
name: 'response_time_milliseconds',
unit: 'seconds', // ⚠️ Logs warning, uses milliseconds from name
buckets: BUCKETS.DURATION.FAST,
});
// Result: response_time_milliseconds (name's unit wins)
// Gauge - descriptive name
const gauge = createGauge({
name: 'active_connections',
help: 'Active connections',
labelNames: ['type'],
});
gauge.set({ type: 'http' }, 42);
// Metric names are automatically normalized:
// - Converted to lowercase
// - Spaces and hyphens become underscores
// - Invalid characters replaced with underscores (only alphanumeric, underscore, colon allowed)
// - Must start with letter or underscore
const normalized = createCounter({
name: 'HTTP Requests', // → http_requests_total
help: 'Example of normalization',
});
// Special character handling examples:
createCounter({ name: 'http.requests' }) // → http_requests_total
createCounter({ name: 'http:requests' }) // → http:requests_total (colon allowed)
createCounter({ name: 'http_requests!' }) // → http_requests__total (! → _)
createCounter({ name: 'http@requests#count' }) // → http_requests_count_total
createCounter({ name: '123invalid' }) // → _123invalid_total (prefixed with _)
// Unit detection for histograms (works after normalization):
// - Names are normalized first (hyphens→underscores, lowercase, etc.)
// - If normalized name has unit suffix (e.g., _seconds, _bytes), it's used
// - If name lacks unit, the 'unit' parameter is appended (default: 'seconds')
// - If both present and differ, name wins with a warning
const autoUnit = createHistogram({
name: 'latency', // → latency_seconds (default)
buckets: BUCKETS.DURATION.HTTP,
});
// Unit detection works with hyphens (normalized first):
createHistogram({ name: 'request-duration-seconds', buckets: BUCKETS.DURATION.FAST })
// → request_duration_seconds (not request_duration_seconds_seconds)
createHistogram({ name: 'file-size-bytes', buckets: BUCKETS.SIZE.MEDIUM })
// → file_size_bytes (detects bytes after normalization)
Pre-configured Histogram Buckets
The library provides BUCKETS constants for common histogram use cases:
import { BUCKETS } from '@manifest-cyber/observability-ts/metrics';
// Duration buckets (in seconds)
BUCKETS.DURATION.FAST // [10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s]
BUCKETS.DURATION.MEDIUM // [100ms, 500ms, 1s, 2s, 5s, 10s, 30s, 1m]
BUCKETS.DURATION.LONG // [1s, 5s, 10s, 30s, 1m, 2m, 5m, 10m, 20m]
BUCKETS.DURATION.DB // [10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s, 10s]
BUCKETS.DURATION.HTTP // [1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s]
// Size buckets (in bytes)
BUCKETS.SIZE.SMALL // [1KB, 10KB, 100KB, 1MB, 10MB, 100MB]
BUCKETS.SIZE.MEDIUM // [100KB, 1MB, 10MB, 100MB, 1GB]
BUCKETS.SIZE.LARGE // [1MB, 10MB, 100MB, 1GB, 10GB, 100GB]
BUCKETS.SIZE.RESPONSE // [1KB, 10KB, 100KB, 1MB, 10MB]
Examples:
// HTTP request duration
const httpDuration = createHistogram({
name: 'http_request_duration',
help: 'HTTP request duration',
buckets: BUCKETS.DURATION.HTTP,
});
// Database query duration
const dbDuration = createHistogram({
name: 'db_query_duration',
help: 'Database query duration',
buckets: BUCKETS.DURATION.DB,
});
// File upload size
const uploadSize = createHistogram({
name: 'upload_size',
help: 'Upload file size',
unit: 'bytes',
buckets: BUCKETS.SIZE.MEDIUM,
});
// Long-running job duration
const jobDuration = createHistogram({
name: 'job_duration',
help: 'Background job duration',
buckets: BUCKETS.DURATION.LONG,
});
Tracing
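import { initTracing, withSpan, createSpan } from '@manifest-cyber/observability-ts/tracing';
import { SpanStatusCode } from '@opentelemetry/api'; // used by span.setStatus() in the manual span example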
// Initialize
await initTracing({
serviceName: 'my-service',
exporter: {
type: 'otlp-grpc',
endpoint: 'http://localhost:4317',
},
sampling: {
type: 'parentBased',
parentBased: {
root: { type: 'traceIdRatio', ratio: 0.1 }, // 10% sampling
},
},
});
// Automatic span lifecycle
await withSpan('database.query', async (span) => {
span.setAttribute('db.statement', 'SELECT * FROM users');
return await db.query('SELECT * FROM users');
});
// Manual span management
const span = createSpan('manual.operation');
try {
await doWork();
span.setStatus({ code: SpanStatusCode.OK });
} finally {
span.end();
}
Trace Propagation
import {
injectTraceContext,
extractTraceContext,
createMessageTraceContext,
extractMessageTraceContext
} from '@manifest-cyber/observability-ts/tracing';
// HTTP Client
const headers = {};
injectTraceContext(headers);
await axios.get('https://api.example.com/users', { headers });
// HTTP Server
app.use((req, res, next) => {
extractTraceContext(req.headers);
next();
});
// SQS Producer
await sqs.sendMessage({
QueueUrl: queueUrl,
MessageBody: JSON.stringify(data),
MessageAttributes: createMessageTraceContext(),
});
// SQS Consumer
extractMessageTraceContext(message.MessageAttributes);
Environment Variables
| Variable | Description | Default |
|---|---|---|
| SERVICE_NAME | Service name (added as service_name label to all metrics; also used by initObservability when serviceName is omitted) | 'unknown-service' |
| MFST_METRICS_PORT | Metrics server port (also used by initObservability when metricsPort is omitted) | 9090 |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint | 'http://localhost:4317' |
| OTEL_TRACING_ENABLED | Enable/disable tracing | true |
| OTEL_METRICS_EXPORTER | Metrics exporter used by initObservability: prometheus (scrape), otlp-grpc/otlp-http (push), or none | prometheus |
| ENV | Environment (dev/staging/prod) | 'development' |
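As a rough illustration of how the defaults in the table combine, the sketch below resolves a configuration object from an environment record. The `ResolvedEnv` shape, the `resolveEnv` name, and the parsing rules (for example, treating only the string "false" as disabling tracing) are assumptions for this example and may differ from what the library does internally.

```typescript
// Hypothetical shape; the field names are illustrative, not the library's types.
interface ResolvedEnv {
  serviceName: string;
  metricsPort: number;
  otlpEndpoint: string;
  tracingEnabled: boolean;
  metricsExporter: string;
  environment: string;
}

// Takes the environment as a plain record so it can be exercised without process.env.
function resolveEnv(env: Record<string, string | undefined>): ResolvedEnv {
  const port = Number(env['MFST_METRICS_PORT']);
  return {
    serviceName: env['SERVICE_NAME'] || 'unknown-service',
    // Fall back to 9090 when the variable is unset or not a positive integer.
    metricsPort: Number.isInteger(port) && port > 0 ? port : 9090,
    otlpEndpoint: env['OTEL_EXPORTER_OTLP_ENDPOINT'] || 'http://localhost:4317',
    // Assumption: only the literal string "false" disables tracing.
    tracingEnabled: env['OTEL_TRACING_ENABLED'] !== 'false',
    metricsExporter: env['OTEL_METRICS_EXPORTER'] || 'prometheus',
    environment: env['ENV'] || 'development',
  };
}
```

In a Node.js service this would typically be called as `resolveEnv(process.env)`; passing a record explicitly keeps the function easy to test.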
Push-Based Metrics (short-lived processes)
By default, initObservability exposes metrics on a Prometheus scrape endpoint. That model
assumes a long-lived process; short-lived workloads (e.g. single-task jobs running as
KEDA-scaled Kubernetes Jobs) terminate before they are ever scraped. For those, set
OTEL_METRICS_EXPORTER=otlp-grpc (or otlp-http):
- Metrics are pushed to OTEL_EXPORTER_OTLP_ENDPOINT instead of being scraped, and the Prometheus scrape server is not started.
- The process exit interceptor performs a final metrics export on process.exit(), so buffered metrics are not lost when the process ends before the periodic export interval (60s) fires.
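The two modes described above can be restated as a small decision table. The sketch below is illustrative only: `ExportPlan` and `planMetricsExport` are hypothetical names, and the library may structure this logic differently.

```typescript
type MetricsExporter = 'prometheus' | 'otlp-grpc' | 'otlp-http' | 'none';

interface ExportPlan {
  startScrapeServer: boolean; // expose /metrics for Prometheus to pull
  pushToOtlp: boolean; // periodically push to the OTLP endpoint
  finalMetricsExportOnExit: boolean; // export buffered metrics before the process ends
}

// Hypothetical helper that restates the behavior described above as a decision table.
function planMetricsExport(exporter: MetricsExporter): ExportPlan {
  switch (exporter) {
    case 'prometheus':
      return { startScrapeServer: true, pushToOtlp: false, finalMetricsExportOnExit: false };
    case 'otlp-grpc':
    case 'otlp-http':
      return { startScrapeServer: false, pushToOtlp: true, finalMetricsExportOnExit: true };
    case 'none':
      return { startScrapeServer: false, pushToOtlp: false, finalMetricsExportOnExit: false };
  }
}
```

For a KEDA-scaled Job that runs for a few seconds, the push modes are the relevant rows: no scrape server is started, and the final export on exit is what delivers the metrics.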
Process Exit Interceptor
When tracing is enabled (or metrics use a push exporter), process.exit() is automatically intercepted to flush telemetry before termination (default: enabled, 2s timeout). This prevents telemetry loss on abrupt exits.
import { initObservability } from '@manifest-cyber/observability-ts';
await initObservability({
serviceName: 'my-service',
interceptProcessExit: true, // default: true
exitFlushTimeoutMs: 2000, // default: 2000ms
});
// Tracing automatically flushed on exit
if (failed) process.exit(1);
Configuration:
- interceptProcessExit: boolean - Enable/disable (default: true)
- exitFlushTimeoutMs: number - Flush timeout in ms (default: 2000)
- logger: Logger - Optional logger for diagnostics
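The role of exitFlushTimeoutMs can be pictured as a race between the flush and a timer. The sketch below shows that pattern in isolation, assuming a hypothetical `flushWithTimeout` helper; it does not intercept process.exit() and is not the library's implementation.

```typescript
type FlushOutcome = 'flushed' | 'timeout';

// Hypothetical helper: waits for `flush`, but never longer than `timeoutMs`.
async function flushWithTimeout(
  flush: () => Promise<void>,
  timeoutMs = 2000,
): Promise<FlushOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<FlushOutcome>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    // Whichever settles first decides the outcome.
    return await Promise.race([flush().then((): FlushOutcome => 'flushed'), timeout]);
  } finally {
    // Clear the timer so it cannot keep the event loop alive after a fast flush.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

The upper bound matters because an unreachable collector would otherwise delay termination indefinitely. In this sketch a rejected flush propagates to the caller, which would need to catch it before exiting.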
To disable:
await initObservability({
serviceName: 'my-service',
interceptProcessExit: false,
});
Example: Express API
import express from 'express';
import {
createCounter,
createHistogram,
BUCKETS,
startMetricsServer,
initTracing,
withSpan,
extractTraceContext,
} from '@manifest-cyber/observability-ts';
await initTracing({ serviceName: 'api-service' });
await startMetricsServer({ serviceName: 'api-service' });
const httpRequests = createCounter({
name: 'http_requests_total',
help: 'Total HTTP requests',
labelNames: ['method', 'route', 'status'],
});
const httpDuration = createHistogram({
name: 'http_request_duration_seconds',
help: 'HTTP request duration',
labelNames: ['method', 'route', 'status'],
buckets: BUCKETS.DURATION.HTTP,
});
const app = express();
app.use((req, res, next) => {
extractTraceContext(req.headers);
const start = Date.now();
res.on('finish', () => {
const duration = (Date.now() - start) / 1000;
const labels = {
method: req.method,
route: req.route?.path || req.path,
status: res.statusCode.toString(),
};
httpRequests.inc(labels);
httpDuration.observe(labels, duration);
});
next();
});
app.get('/api/users/:id', async (req, res) => {
await withSpan('http.GET /api/users/:id', async (span) => {
span.setAttribute('user.id', req.params.id);
const user = await fetchUser(req.params.id);
res.json(user);
});
});
app.listen(3000);
Migration from prom-client (v0.2.x → v0.3.x)
v0.3.0 migrates metrics from prom-client to the OpenTelemetry SDK.
Breaking Changes
| Change | Before | After |
|---|---|---|
| startMetricsServer() | Sync | Async (await) |
| getRegistry() / resetRegistry() | Exported | Removed |
| reset() / remove() on metrics | Functional | No-op (API compat) |
| prom-client type re-exports | Available | Removed |
New: Auto-instrumentation
await initObservability({
serviceName: 'my-api',
autoInstrument: ['http', 'express', 'mongo'],
});
Optional Peer Dependencies
# Runtime metrics
npm install @opentelemetry/host-metrics @opentelemetry/instrumentation-runtime-node
# OTLP export (choose gRPC or HTTP)
npm install @opentelemetry/exporter-metrics-otlp-grpc
# or
npm install @opentelemetry/exporter-metrics-otlp-http
# Auto-instrumentation (as needed)
npm install @opentelemetry/instrumentation-http @opentelemetry/instrumentation-express
Migration from @manifest-cyber/metrics
npm uninstall @manifest-cyber/metrics
npm install @manifest-cyber/observability-ts
Key Features
Service name as a label:
// Service name is automatically added as a label
const counter = createCounter({
name: 'http_requests', // becomes: http_requests_total (auto-suffixed)
help: 'HTTP requests',
});
// Or with full name:
const counter2 = createCounter({
name: 'http_requests_total', // stays: http_requests_total (no duplication)
help: 'HTTP requests',
});
// Result: http_requests_total{service_name="my-service"}
Automatic metric name normalization:
Metric names are automatically normalized to follow Prometheus conventions ([a-zA-Z_:][a-zA-Z0-9_:]*):
- Converted to lowercase
- Spaces and hyphens converted to underscores
- Invalid characters replaced with underscores (only alphanumeric, underscore, and colon are allowed)
- Names starting with numbers are prefixed with underscore
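The rules above can be sketched as a small pure function. This is a hypothetical re-implementation for illustration, not the library's code; in particular, it assumes each invalid character is replaced one-for-one rather than runs of them being collapsed, and the names `normalizeMetricName` and `counterName` are invented here.

```typescript
// Hypothetical re-implementation of the rules listed above; not the library's code.
function normalizeMetricName(name: string): string {
  let result = name
    .toLowerCase()
    .replace(/[\s-]/g, '_') // spaces and hyphens become underscores
    .replace(/[^a-z0-9_:]/g, '_'); // any other invalid character becomes an underscore
  // Prometheus metric names may not start with a digit.
  if (/^[0-9]/.test(result)) result = `_${result}`;
  return result;
}

// Counters additionally get a `_total` suffix unless it is already present.
function counterName(name: string): string {
  const normalized = normalizeMetricName(name);
  return normalized.endsWith('_total') ? normalized : `${normalized}_total`;
}
```

Under these assumptions, `counterName('HTTP Requests')` yields `http_requests_total` and `counterName('404_errors')` yields `_404_errors_total`, matching the documented examples.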
// All of these work and are normalized:
createCounter({ name: 'HTTP Requests' }) // → http_requests_total
createCounter({ name: 'http-requests' }) // → http_requests_total
createCounter({ name: 'http_requests_total' }) // → http_requests_total
// Special character handling (invalid chars → underscore):
createCounter({ name: 'http.requests' }) // → http_requests_total (. → _)
createCounter({ name: 'http@count#total' }) // → http_count_total (@ # → _)
createCounter({ name: 'request/response' }) // → request_response_total (/ → _)
createCounter({ name: 'http_requests!' }) // → http_requests__total (! → _)
createCounter({ name: '404_errors' }) // → _404_errors_total (prefixed with _)
createCounter({ name: 'cache:hits' }) // → cache:hits_total (: is allowed)
// Case variants are handled correctly:
createCounter({ name: 'http_requests_TOTAL' }) // → http_requests_total (not duplicated)
createHistogram({ name: 'file_size_BYTES', buckets: BUCKETS.SIZE.SMALL })
// → file_size_bytes (unit detected case-insensitively)
Auto-suffixing behavior:
Counter names are auto-suffixed with _total (you can provide it or not):
// Both work:
name: 'http_requests' // becomes: http_requests_total
name: 'http_requests_total' // stays: http_requests_total
Histogram names are auto-suffixed with unit (default: _seconds):
// Both work:
name: 'request_duration' // becomes: request_duration_seconds
name: 'request_duration_seconds' // stays: request_duration_seconds
// Custom unit:
name: 'file_size' // with unit: 'bytes' → file_size_bytes
// Unit detection - name wins:
name: 'latency_milliseconds' // with unit: 'seconds' → latency_milliseconds (warns)
Unit detection priority:
- Unit suffix in name (if present) - e.g., _seconds, _bytes
- unit parameter (if provided)
- Default: _seconds
If name has a unit AND unit parameter differs, the name's unit is used and a warning is logged.
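The priority order above could be expressed roughly as follows. This is a simplified sketch under stated assumptions: `KNOWN_UNITS` is an illustrative subset rather than the library's actual list, `resolveHistogramName` is a hypothetical name, and the normalization step omits the leading-digit rule.

```typescript
// Assumption: an illustrative subset of recognized unit suffixes.
const KNOWN_UNITS = ['milliseconds', 'seconds', 'bytes'];

interface HistogramName {
  name: string;
  warning?: string; // set when the name's unit and the `unit` parameter disagree
}

// Hypothetical sketch of the priority order above; not the library's code.
function resolveHistogramName(rawName: string, unit?: string): HistogramName {
  // Normalize first so that hyphenated or upper-case suffixes are detected.
  const name = rawName.toLowerCase().replace(/[^a-z0-9_:]/g, '_');
  const unitInName = KNOWN_UNITS.find((u) => name.endsWith(`_${u}`));
  if (unitInName === undefined) {
    // No unit in the name: append the parameter, or the default.
    return { name: `${name}_${unit ?? 'seconds'}` };
  }
  if (unit !== undefined && unit !== unitInName) {
    // Both present and different: the name wins and a warning is produced.
    return {
      name,
      warning: `unit "${unit}" ignored; name already ends with "_${unitInName}"`,
    };
  }
  return { name };
}
```

Matching on the underscore-prefixed suffix is what keeps a name ending in `_milliseconds` from being mistaken for one ending in `_seconds`.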
Use the service_name label in Prometheus queries to filter by service:
http_requests_total{service_name="my-service"}
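The selector above matches series that carry the injected label. As a rough illustration of how such a series looks in scrape output, the sketch below renders one sample in the Prometheus text exposition format with service_name placed ahead of the caller's labels. `formatSample` and `escapeLabelValue` are hypothetical helpers, and the label ordering is an assumption taken from the examples in this README.

```typescript
// Backslash, double quote, and newline must be escaped in label values.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
}

// Hypothetical helper: renders one sample in the Prometheus text exposition format,
// with service_name injected ahead of the caller's labels.
function formatSample(
  name: string,
  labels: Record<string, string>,
  value: number,
  serviceName: string,
): string {
  const merged: Record<string, string> = { service_name: serviceName, ...labels };
  const rendered = Object.keys(merged)
    .map((key) => `${key}="${escapeLabelValue(merged[key])}"`)
    .join(',');
  return `${name}{${rendered}} ${value}`;
}
```

For example, `formatSample('http_requests_total', { method: 'GET', status: '200' }, 1, 'my-service')` produces `http_requests_total{service_name="my-service",method="GET",status="200"} 1`.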