@manifest-cyber/observability-ts

Version: 0.2.22 • Published 4d ago • Licence: UNLICENSED • Deps: 9 • Size: 239 kB • Vulns: 0 • Weekly downloads: 934

Unified observability library for Manifest Cyber's TypeScript services - Prometheus metrics and OpenTelemetry tracing.

Installation

npm install @manifest-cyber/observability-ts

# Optional: for tracing features
npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc

Quick Start

Metrics
import { createCounter, startMetricsServer } from '@manifest-cyber/observability-ts';

const requestsTotal = createCounter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'status'],
});

await startMetricsServer({ serviceName: 'my-service' });

// Service name is automatically added as a label
requestsTotal.inc({ method: 'GET', status: '200' });
// Results in: http_requests_total{service_name="my-service",method="GET",status="200"}

Tracing
import { initTracing, withSpan } from '@manifest-cyber/observability-ts';

await initTracing({ serviceName: 'my-service' });

await withSpan('process.request', async (span) => {
  span.setAttribute('user.id', userId);
  return await processRequest();
});

Tree-Shakeable Imports
import { createCounter } from '@manifest-cyber/observability-ts/metrics';
import { initTracing } from '@manifest-cyber/observability-ts/tracing';

Features

Metrics (Prometheus)

  • Counter, Gauge, Histogram, Summary
  • Automatic service_name label injection
  • HTTP metrics server (/metrics on port 9090)
  • Timer utilities and operation tracking
  • Type-safe with TypeScript generics

Tracing (OpenTelemetry)

  • W3C Trace Context propagation
  • OTLP export (VictoriaTraces, Jaeger, Tempo)
  • Automatic HTTP/gRPC/database instrumentation
  • Manual span creation with createSpan() and withSpan()
  • SQS and RabbitMQ trace propagation
  • Logger integration for trace correlation

API Overview

Metrics
import { createCounter, createHistogram, createGauge, BUCKETS } from '@manifest-cyber/observability-ts/metrics';

// Counter - automatically suffixed with _total
const counter = createCounter({
  name: 'operations',  // becomes: operations_total
  help: 'Total operations',
  labelNames: ['type', 'status'],
});
// service_name label is automatically added
counter.inc({ type: 'api', status: 'success' });
// Results in: operations_total{service_name="my-service",type="api",status="success"}

// Or provide full name (won't duplicate suffix)
const counter2 = createCounter({
  name: 'http_requests_total',  // stays: http_requests_total
  help: 'Total HTTP requests',
});

// Histogram - automatically suffixed with _seconds (or custom unit)
const histogram = createHistogram({
  name: 'request_duration',  // becomes: request_duration_seconds
  help: 'Request duration',
  labelNames: ['route', 'method'],
  buckets: BUCKETS.DURATION.FAST,  // Pre-configured buckets for fast operations
});
histogram.observe({ route: '/api/users', method: 'GET' }, 0.42);

// Custom unit
const fileSize = createHistogram({
  name: 'file_size',  // becomes: file_size_bytes
  help: 'File size distribution',
  unit: 'bytes',
  buckets: BUCKETS.SIZE.SMALL,  // Pre-configured buckets for file sizes
});

// Or provide full name (won't duplicate suffix)
const duration = createHistogram({
  name: 'processing_duration_milliseconds',  // stays: processing_duration_milliseconds
  help: 'Processing duration',
});

// Unit detection: name wins over parameter
const conflicted = createHistogram({
  name: 'response_time_milliseconds',
  unit: 'seconds',  // ⚠️ Logs warning, uses milliseconds from name
  buckets: BUCKETS.DURATION.FAST,
});
// Result: response_time_milliseconds (name's unit wins)

// Gauge - descriptive name
const gauge = createGauge({
  name: 'active_connections',
  help: 'Active connections',
  labelNames: ['type'],
});
gauge.set({ type: 'http' }, 42);

// Metric names are automatically normalized:
// - Converted to lowercase
// - Spaces and hyphens become underscores
// - Invalid characters replaced with underscores (only alphanumeric, underscore, colon allowed)
// - Must start with letter or underscore
const normalized = createCounter({
  name: 'HTTP Requests',     // → http_requests_total
  help: 'Example of normalization',
});

// Special character handling examples:
createCounter({ name: 'http.requests' })       // → http_requests_total
createCounter({ name: 'http:requests' })       // → http:requests_total (colon allowed)
createCounter({ name: 'http_requests!' })      // → http_requests__total (! → _)
createCounter({ name: 'http@requests#count' }) // → http_requests_count_total
createCounter({ name: '123invalid' })          // → _123invalid_total (prefixed with _)

// Unit detection for histograms (works after normalization):
// - Names are normalized first (hyphens→underscores, lowercase, etc.)
// - If normalized name has unit suffix (e.g., _seconds, _bytes), it's used
// - If name lacks unit, the 'unit' parameter is appended (default: 'seconds')
// - If both present and differ, name wins with a warning
const autoUnit = createHistogram({
  name: 'latency',  // → latency_seconds (default)
  buckets: BUCKETS.DURATION.HTTP,
});

// Unit detection works with hyphens (normalized first):
createHistogram({ name: 'request-duration-seconds', buckets: BUCKETS.DURATION.FAST })
// → request_duration_seconds (not request_duration_seconds_seconds)

createHistogram({ name: 'file-size-bytes', buckets: BUCKETS.SIZE.MEDIUM })
// → file_size_bytes (detects bytes after normalization)

Pre-configured Histogram Buckets

The library provides BUCKETS constants for common histogram use cases:

import { BUCKETS } from '@manifest-cyber/observability-ts/metrics';

// Duration buckets (in seconds)
BUCKETS.DURATION.FAST      // [10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s]
BUCKETS.DURATION.MEDIUM    // [100ms, 500ms, 1s, 2s, 5s, 10s, 30s, 1m]
BUCKETS.DURATION.LONG      // [1s, 5s, 10s, 30s, 1m, 2m, 5m, 10m, 20m]
BUCKETS.DURATION.DB        // [10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s, 10s]
BUCKETS.DURATION.HTTP      // [1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s]

// Size buckets (in bytes)
BUCKETS.SIZE.SMALL         // [1KB, 10KB, 100KB, 1MB, 10MB, 100MB]
BUCKETS.SIZE.MEDIUM        // [100KB, 1MB, 10MB, 100MB, 1GB]
BUCKETS.SIZE.LARGE         // [1MB, 10MB, 100MB, 1GB, 10GB, 100GB]
BUCKETS.SIZE.RESPONSE      // [1KB, 10KB, 100KB, 1MB, 10MB]
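
The comments above give the bounds in human-readable units. Since the duration buckets are stated to be in seconds, the FAST row would presumably correspond to the numeric array below. This is a minimal sketch under that assumption; `FAST_SECONDS`, `smallestBucketFor`, and `cumulativeCounts` are illustrative names, not library exports.

```typescript
// Assumed numeric form of BUCKETS.DURATION.FAST, converted from the comment
// "[10ms, 50ms, 100ms, 500ms, 1s, 2s, 5s]" into seconds.
const FAST_SECONDS: readonly number[] = [0.01, 0.05, 0.1, 0.5, 1, 2, 5];

// Returns the smallest upper bound that contains the observation, or
// Infinity when the value exceeds every configured bound (the +Inf bucket).
function smallestBucketFor(value: number, bounds: readonly number[]): number {
  for (const bound of bounds) {
    if (value <= bound) return bound;
  }
  return Infinity;
}

// Prometheus histogram buckets are cumulative: an observation counts toward
// every bucket whose upper bound is greater than or equal to the value.
function cumulativeCounts(
  values: readonly number[],
  bounds: readonly number[],
): number[] {
  return bounds.map((bound) => values.filter((v) => v <= bound).length);
}
```

Under these assumed bounds, an observation of 0.42 seconds first lands in the 0.5 bucket, which is one way to check that a bucket set matches the latency range a service actually produces.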

Examples:

// HTTP request duration
const httpDuration = createHistogram({
  name: 'http_request_duration',
  help: 'HTTP request duration',
  buckets: BUCKETS.DURATION.HTTP,
});

// Database query duration
const dbDuration = createHistogram({
  name: 'db_query_duration',
  help: 'Database query duration',
  buckets: BUCKETS.DURATION.DB,
});

// File upload size
const uploadSize = createHistogram({
  name: 'upload_size',
  help: 'Upload file size',
  unit: 'bytes',
  buckets: BUCKETS.SIZE.MEDIUM,
});

// Long-running job duration
const jobDuration = createHistogram({
  name: 'job_duration',
  help: 'Background job duration',
  buckets: BUCKETS.DURATION.LONG,
});

Tracing
import { initTracing, withSpan, createSpan } from '@manifest-cyber/observability-ts/tracing';
import { SpanStatusCode } from '@opentelemetry/api'; // used by span.setStatus() below

// Initialize
await initTracing({
  serviceName: 'my-service',
  exporter: {
    type: 'otlp-grpc',
    endpoint: 'http://localhost:4317',
  },
  sampling: {
    type: 'parentBased',
    parentBased: {
      root: { type: 'traceIdRatio', ratio: 0.1 }, // 10% sampling
    },
  },
});

// Automatic span lifecycle
await withSpan('database.query', async (span) => {
  span.setAttribute('db.statement', 'SELECT * FROM users');
  return await db.query('SELECT * FROM users');
});

// Manual span management
const span = createSpan('manual.operation');
try {
  await doWork();
  span.setStatus({ code: SpanStatusCode.OK });
} finally {
  span.end();
}

Trace Propagation
import { 
  injectTraceContext, 
  extractTraceContext,
  createMessageTraceContext,
  extractMessageTraceContext 
} from '@manifest-cyber/observability-ts/tracing';

// HTTP Client
const headers = {};
injectTraceContext(headers);
await axios.get('https://api.example.com/users', { headers });

// HTTP Server
app.use((req, res, next) => {
  extractTraceContext(req.headers);
  next();
});

// SQS Producer
await sqs.sendMessage({
  QueueUrl: queueUrl,
  MessageBody: JSON.stringify(data),
  MessageAttributes: createMessageTraceContext(),
});

// SQS Consumer
extractMessageTraceContext(message.MessageAttributes);
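
The Features list names W3C Trace Context as the propagation format. For readers unfamiliar with it, the sketch below formats and parses the `traceparent` header (version 00 only) and wraps it in an SQS-style string attribute. The helper names and the attribute layout are assumptions for illustration; the source does not document what the library's inject/extract helpers produce internally.

```typescript
// W3C Trace Context "traceparent": version-traceId-parentId-flags,
// i.e. 00-<32 lowercase hex>-<16 lowercase hex>-<2 lowercase hex>.
interface TraceParent {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function formatTraceParent(ctx: TraceParent): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

function parseTraceParent(header: string | undefined): TraceParent | undefined {
  const match = header ? TRACEPARENT_RE.exec(header.trim()) : null;
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero trace or span IDs are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

// Hypothetical SQS carrier: a single String attribute holding the header.
function toSqsAttributes(ctx: TraceParent) {
  return {
    traceparent: { DataType: 'String', StringValue: formatTraceParent(ctx) },
  };
}
```

A consumer would read `traceparent.StringValue` back out of the message attributes and pass it through `parseTraceParent` before continuing the trace.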

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| SERVICE_NAME | Service name (added as service_name label to all metrics; also used by initObservability when serviceName is omitted) | 'unknown-service' |
| MFST_METRICS_PORT | Metrics server port (also used by initObservability when metricsPort is omitted) | 9090 |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint | 'http://localhost:4317' |
| OTEL_TRACING_ENABLED | Enable/disable tracing | true |
| OTEL_METRICS_EXPORTER | Metrics exporter used by initObservability: prometheus (scrape), otlp-grpc/otlp-http (push), or none | prometheus |
| ENV | Environment (dev/staging/prod) | 'development' |
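
To make the defaults concrete, the sketch below resolves these variables from a plain environment map. It is an illustration only: `resolveConfig` and `ResolvedConfig` are hypothetical names, and the parsing rules (which strings disable tracing, how an invalid port is handled) are assumptions rather than the library's documented behavior.

```typescript
interface ResolvedConfig {
  serviceName: string;
  metricsPort: number;
  otlpEndpoint: string;
  tracingEnabled: boolean;
  metricsExporter: string;
  environment: string;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): ResolvedConfig {
  const port = Number.parseInt(env.MFST_METRICS_PORT ?? '', 10);
  return {
    serviceName: env.SERVICE_NAME ?? 'unknown-service',
    // Assumption: an unset or non-numeric port falls back to the default.
    metricsPort: Number.isInteger(port) && port > 0 ? port : 9090,
    otlpEndpoint: env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4317',
    // Assumption: only the literal string "false" disables tracing.
    tracingEnabled: (env.OTEL_TRACING_ENABLED ?? 'true').toLowerCase() !== 'false',
    metricsExporter: env.OTEL_METRICS_EXPORTER ?? 'prometheus',
    environment: env.ENV ?? 'development',
  };
}
```

In a Node.js service this would be called as `resolveConfig(process.env)`.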

Push-Based Metrics (short-lived processes)

By default, initObservability exposes metrics on a Prometheus scrape endpoint. That model assumes a long-lived process; short-lived workloads (e.g. single-task jobs running as KEDA-scaled Kubernetes Jobs) terminate before they are ever scraped. For those, set OTEL_METRICS_EXPORTER=otlp-grpc (or otlp-http):

  • Metrics are pushed to OTEL_EXPORTER_OTLP_ENDPOINT instead of being scraped, and the Prometheus scrape server is not started.
  • The process exit interceptor performs a final metrics export on process.exit(), so buffered metrics are not lost when the process ends before the periodic export interval (60s) fires.

Process Exit Interceptor

When tracing is enabled (or metrics use a push exporter), process.exit() is automatically intercepted to flush telemetry before termination (default: enabled, 2s timeout). This prevents telemetry loss on abrupt exits.

import { initObservability } from '@manifest-cyber/observability-ts';

await initObservability({
  serviceName: 'my-service',
  interceptProcessExit: true,  // default: true
  exitFlushTimeoutMs: 2000,    // default: 2000ms
});

// Tracing automatically flushed on exit
if (failed) process.exit(1);

Configuration:

  • interceptProcessExit: boolean - Enable/disable (default: true)
  • exitFlushTimeoutMs: number - Flush timeout in ms (default: 2000)
  • logger: Logger - Optional logger for diagnostics

To disable:

await initObservability({
  serviceName: 'my-service',
  interceptProcessExit: false,
});
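
The core of a flush-before-exit step is a race between the export and a timer, so that a slow or unreachable collector cannot hold the process open. The sketch below shows only that race; it does not patch process.exit(), and `flushWithTimeout` and `FlushOutcome` are hypothetical names rather than part of the library.

```typescript
type FlushOutcome = 'flushed' | 'failed' | 'timeout';

// Resolves with whichever happens first: the flush settling or the timer
// firing. The timer is cleared once the flush settles so it cannot keep
// the event loop alive.
function flushWithTimeout(
  flush: () => Promise<void>,
  timeoutMs: number,
): Promise<FlushOutcome> {
  return new Promise<FlushOutcome>((resolve) => {
    const timer = setTimeout(() => resolve('timeout'), timeoutMs);
    flush().then(
      () => {
        clearTimeout(timer);
        resolve('flushed');
      },
      () => {
        // A failed export should not prevent the process from exiting.
        clearTimeout(timer);
        resolve('failed');
      },
    );
  });
}
```

With the documented default, the call would look like `await flushWithTimeout(exportPending, 2000)`, where `exportPending` stands in for whatever function pushes buffered telemetry.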

Example: Express API

import express from 'express';
import {
  createCounter,
  createHistogram,
  BUCKETS,
  startMetricsServer,
  initTracing,
  withSpan,
  extractTraceContext,
} from '@manifest-cyber/observability-ts';

await initTracing({ serviceName: 'api-service' });
await startMetricsServer({ serviceName: 'api-service' });

const httpRequests = createCounter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'route', 'status'],
});

const httpDuration = createHistogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration',
  labelNames: ['method', 'route', 'status'],
  buckets: BUCKETS.DURATION.HTTP,
});

const app = express();

app.use((req, res, next) => {
  extractTraceContext(req.headers);
  const start = Date.now();

  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    const labels = {
      method: req.method,
      route: req.route?.path || req.path,
      status: res.statusCode.toString(),
    };
    httpRequests.inc(labels);
    httpDuration.observe(labels, duration);
  });

  next();
});

app.get('/api/users/:id', async (req, res) => {
  await withSpan('http.GET /api/users/:id', async (span) => {
    span.setAttribute('user.id', req.params.id);
    const user = await fetchUser(req.params.id);
    res.json(user);
  });
});

app.listen(3000);
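
The middleware above computes the duration inline with Date.now(). If the same measurement is needed in several places, it can be factored into a small helper such as the one sketched here. `startTimer` is a hypothetical name introduced for illustration; the library's own timer utilities are mentioned in the Features list but their API is not shown in this document.

```typescript
// Hypothetical helper: captures a start time and returns a function that
// reports the elapsed time in seconds. The clock is injectable so the
// helper can be exercised deterministically in tests.
function startTimer(nowMs: () => number = Date.now): () => number {
  const start = nowMs();
  return () => (nowMs() - start) / 1000;
}
```

Inside the middleware this would read `const stop = startTimer();` before `res.on('finish', ...)` and `httpDuration.observe(labels, stop());` inside the callback.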

Migration from prom-client (v0.2.x → v0.3.x)

v0.3.0 migrates from prom-client to OpenTelemetry SDK for metrics.

Breaking Changes
| Change | Before | After |
| --- | --- | --- |
| startMetricsServer() | Sync | Async (await) |
| getRegistry() / resetRegistry() | Exported | Removed |
| reset() / remove() on metrics | Functional | No-op (API compat) |
| prom-client type re-exports | Available | Removed |

New: Auto-instrumentation
await initObservability({
  serviceName: 'my-api',
  autoInstrument: ['http', 'express', 'mongo'],
});

Optional Peer Dependencies
# Runtime metrics
npm install @opentelemetry/host-metrics @opentelemetry/instrumentation-runtime-node

# OTLP export (choose gRPC or HTTP)
npm install @opentelemetry/exporter-metrics-otlp-grpc
# or
npm install @opentelemetry/exporter-metrics-otlp-http

# Auto-instrumentation (as needed)
npm install @opentelemetry/instrumentation-http @opentelemetry/instrumentation-express

Migration from @manifest-cyber/metrics

npm uninstall @manifest-cyber/metrics
npm install @manifest-cyber/observability-ts

Key Features

Service name as a label:

// Service name is automatically added as a label
const counter = createCounter({
  name: 'http_requests',  // becomes: http_requests_total (auto-suffixed)
  help: 'HTTP requests',
});
// Or with full name:
const counter2 = createCounter({
  name: 'http_requests_total',  // stays: http_requests_total (no duplication)
  help: 'HTTP requests',
});
// Result: http_requests_total{service_name="my-service"}
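
To show what the injected label looks like on the wire, the sketch below merges a service name into a label set and renders one sample line in the Prometheus text exposition format. It is an approximation for illustration: `renderSample` and `escapeLabelValue` are hypothetical helpers, and how the library handles a caller-supplied service_name label is an assumption.

```typescript
type Labels = Record<string, string>;

// Escapes a label value for the Prometheus text exposition format:
// backslash, double quote, and line feed must be escaped.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Puts service_name first, then the caller's labels, and renders one line.
// Assumption: a caller-supplied service_name overrides the injected one.
function renderSample(
  name: string,
  serviceName: string,
  labels: Labels,
  value: number,
): string {
  const merged: Labels = { service_name: serviceName, ...labels };
  const body = Object.entries(merged)
    .map(([key, val]) => `${key}="${escapeLabelValue(val)}"`)
    .join(',');
  return `${name}{${body}} ${value}`;
}
```

Calling `renderSample('http_requests_total', 'my-service', { method: 'GET', status: '200' }, 1)` yields the same series shape shown in the Quick Start comment, followed by the sample value.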

Automatic metric name normalization:

Metric names are automatically normalized to follow Prometheus conventions ([a-zA-Z_:][a-zA-Z0-9_:]*):

  • Converted to lowercase
  • Spaces and hyphens converted to underscores
  • Invalid characters replaced with underscores (only alphanumeric, underscore, and colon are allowed)
  • Names starting with numbers are prefixed with underscore

// All of these work and are normalized:
createCounter({ name: 'HTTP Requests' })        // → http_requests_total
createCounter({ name: 'http-requests' })        // → http_requests_total
createCounter({ name: 'http_requests_total' })  // → http_requests_total

// Special character handling (invalid chars → underscore):
createCounter({ name: 'http.requests' })        // → http_requests_total (. → _)
createCounter({ name: 'http@count#total' })     // → http_count_total (@ # → _)
createCounter({ name: 'request/response' })     // → request_response_total (/ → _)
createCounter({ name: 'http_requests!' })       // → http_requests__total (! → _)
createCounter({ name: '404_errors' })           // → _404_errors_total (prefixed with _)
createCounter({ name: 'cache:hits' })           // → cache:hits_total (: is allowed)

// Case variants are handled correctly:
createCounter({ name: 'http_requests_TOTAL' })  // → http_requests_total (not duplicated)
createHistogram({ name: 'file_size_BYTES', buckets: BUCKETS.SIZE.SMALL })
// → file_size_bytes (unit detected case-insensitively)
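
The normalization rules listed above can be sketched in a few lines. This is a re-implementation written to match the documented examples, not the library's source; `normalizeMetricName` and `counterName` are hypothetical names, and edge cases the document does not cover (such as empty names) are left unhandled.

```typescript
// Lowercases the name, replaces every character outside [a-z0-9_:] with an
// underscore (this covers spaces, hyphens, dots, and other symbols), and
// prefixes an underscore when the result starts with a digit.
function normalizeMetricName(raw: string): string {
  const cleaned = raw.toLowerCase().replace(/[^a-z0-9_:]/g, '_');
  return /^[0-9]/.test(cleaned) ? `_${cleaned}` : cleaned;
}

// Appends _total unless the normalized name already ends with it.
function counterName(raw: string): string {
  const name = normalizeMetricName(raw);
  return name.endsWith('_total') ? name : `${name}_total`;
}
```

Because the suffix check runs after lowercasing, `counterName('http_requests_TOTAL')` does not gain a second suffix, which matches the case-variant example above.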

Auto-suffixing behavior:

  1. Counter names are auto-suffixed with _total (you can provide it or not):

    // Both work:
    name: 'http_requests'        // becomes: http_requests_total
    name: 'http_requests_total'  // stays: http_requests_total

  2. Histogram names are auto-suffixed with unit (default: _seconds):

    // Both work:
    name: 'request_duration'          // becomes: request_duration_seconds
    name: 'request_duration_seconds'  // stays: request_duration_seconds
    
    // Custom unit:
    name: 'file_size'                 // with unit: 'bytes' → file_size_bytes
    
    // Unit detection - name wins:
    name: 'latency_milliseconds'      // with unit: 'seconds' → latency_milliseconds (warns)

    Unit detection priority:

    1. Unit suffix in name (if present) - e.g., _seconds, _bytes
    2. unit parameter (if provided)
    3. Default: _seconds

    If name has a unit AND unit parameter differs, the name's unit is used and a warning is logged.

  3. Use the service_name label in Prometheus queries to filter by service:

    http_requests_total{service_name="my-service"}
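
The unit detection priority described in item 2 can be expressed as a small function. The sketch below is written to reproduce the documented examples and is not the library's implementation; `histogramName` is a hypothetical name, and the list of recognized units is an assumption limited to the three units this document mentions.

```typescript
// Assumption: only the units that appear in this document are recognized.
const KNOWN_UNITS = ['seconds', 'milliseconds', 'bytes'] as const;

interface ResolvedName {
  name: string;
  warning?: string;
}

function histogramName(raw: string, unit?: string): ResolvedName {
  // Normalize first so that hyphenated or upper-case suffixes are detected.
  const cleaned = raw.toLowerCase().replace(/[^a-z0-9_:]/g, '_');
  const name = /^[0-9]/.test(cleaned) ? `_${cleaned}` : cleaned;

  // Priority 1: a unit suffix already present in the name.
  const fromName = KNOWN_UNITS.find((u) => name.endsWith(`_${u}`));
  if (fromName !== undefined) {
    const requested = unit?.toLowerCase();
    if (requested !== undefined && requested !== fromName) {
      // The name wins; the conflicting parameter is reported, not applied.
      return {
        name,
        warning: `unit "${requested}" ignored; name already ends in "${fromName}"`,
      };
    }
    return { name };
  }

  // Priority 2 and 3: the unit parameter, then the default of seconds.
  return { name: `${name}_${(unit ?? 'seconds').toLowerCase()}` };
}
```

For instance, `histogramName('latency_milliseconds', 'seconds')` keeps the name unchanged and returns a warning, mirroring the "name wins" rule.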
