pond-ts

Highly optimised, fully typed time series library for TypeScript

Schema-driven events, composable batch transforms, push-based streaming ingest, multi-entity partitioning, and an optional React integration — all strict TypeScript end to end, all immutable.

pond-ts is the TypeScript-first successor to pondjs, rewritten from scratch with a focus on type safety, composability, and the live-streaming patterns that pondjs never supported.

npm install pond-ts                 # core
npm install @pond-ts/react          # React hooks (optional)
  • Typed schemas: declare the schema once and every downstream transform narrows from it. event.get('cpu') returns number | undefined straight from the schema, with no "as" casts.
  • Batch and streaming share one vocabulary: filter, map, aggregate, rolling, diff, rate, fill, cumulative, sample, and reduce all exist on both TimeSeries and LiveSeries.
  • Multi-entity by construction: partitionBy('host') routes events per entity; rolling, aggregate, fill, and sample over a partitioned view all run per entity automatically.
  • Bounded-memory streaming: retention policies, eviction-aware views, and sampling decouple downstream window length from event rate at firehose loads (up to 500k events/sec on a single Node.js instance).
  • Triggers: control how often rolling views emit. Synchronised partitioned rolling fires across all partitions on every boundary.
  • Typed column extraction: series.column('cpu') returns a schema-narrowed typed column with single-pass reductions (min/max/sum/mean/stdev/median/percentile/minMax), index downsampling (bin), and a zero-copy toFloat64Array() for canvas / WebGL draw loops, with no per-event allocation on the hot path.
  • No legacy baggage
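
The "declare once, narrow everywhere" idea from the first bullet can be illustrated without the library. The sketch below is not how pond-ts is implemented; KindMap, ValueOf, and TypedEvent are hypothetical names, and mapping the 'time' kind to a number is an assumption made for the example. It only shows how an "as const" schema can drive the return type of a get() call.

```typescript
// Hypothetical sketch, not pond-ts source: an `as const` schema drives get()'s type.
type KindMap = { time: number; number: number; string: string; boolean: boolean };
type Column = { readonly name: string; readonly kind: keyof KindMap };

// Resolve the value type of the column named N in schema S.
type ValueOf<S extends readonly Column[], N extends string> =
  Extract<S[number], { name: N }> extends { kind: infer K }
    ? K extends keyof KindMap ? KindMap[K] : never
    : never;

class TypedEvent<S extends readonly Column[]> {
  private readonly schema: S;
  private readonly values: readonly unknown[];

  constructor(schema: S, values: readonly unknown[]) {
    this.schema = schema;
    this.values = values;
  }

  // Only column names declared in the schema are accepted.
  get<N extends S[number]['name']>(name: N): ValueOf<S, N> | undefined {
    const i = this.schema.findIndex((c) => c.name === name);
    return i < 0 ? undefined : (this.values[i] as ValueOf<S, N>);
  }
}

const demoSchema = [
  { name: 'time', kind: 'time' },
  { name: 'cpu', kind: 'number' },
  { name: 'host', kind: 'string' },
] as const;

const demoEvent = new TypedEvent(demoSchema, [0, 0.31, 'host1']);
const cpuValue = demoEvent.get('cpu'); // typed as number | undefined
const hostValue = demoEvent.get('host'); // typed as string | undefined
// demoEvent.get('mem'); // would not compile: 'mem' is not in the schema
```

The caller never writes a cast: the only cast lives inside get(), where the schema has already fixed the type.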

Quick start: batch

import { Sequence, TimeSeries } from 'pond-ts';

const schema = [
  { name: 'time', kind: 'time' },
  { name: 'cpu', kind: 'number' },
  { name: 'requests', kind: 'number' },
  { name: 'host', kind: 'string' },
] as const;

const cpu = TimeSeries.fromJSON({
  name: 'cpu',
  schema,
  rows: [
    ['2025-01-01T00:00:00Z', 0.31, 120, 'host1'],
    ['2025-01-01T00:01:00Z', 0.44, 135, 'host2'],
    ['2025-01-01T00:02:00Z', 0.52, 141, 'host1'],
    ['2025-01-01T00:03:00Z', 0.48, 128, 'host1'],
    ['2025-01-01T00:04:00Z', 0.63, 166, 'host3'],
  ],
});

const byMinute = cpu.aggregate(Sequence.every('1m'), {
  cpu: 'avg',
  requests: 'sum',
  host: 'last',
});

const bands = cpu.baseline('cpu', { window: '2m', sigma: 2 });
//    ^ appends rolling avg / sd / upper / lower in one pass.

const anomalies = cpu.outliers('cpu', { window: '2m', sigma: 2 });
//    ^ schema-preserving filter — same columns, just the spikes.

The full batch surface (align, rolling, smooth, groupBy, join, reduce, diff, rate, fill, dedupe, materialize, sample, partitionBy, pivotByGroup, …) follows the same shape: TimeSeries in, TimeSeries out, schema preserved.
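
To make the baseline / outliers pair above concrete, here is a library-free sketch of a trailing-window baseline. It is an illustration under stated assumptions, not pond-ts's algorithm: the window is taken as trailing and inclusive of the current point, the standard deviation is the population form, and baselineSketch / outliersSketch are hypothetical names.

```typescript
// Illustrative only. Window semantics (trailing, inclusive of the current point,
// population sd) are assumptions and may differ from pond-ts.
type Point = { time: number; value: number };
type Band = Point & { avg: number; sd: number; upper: number; lower: number };

function baselineSketch(points: Point[], windowMs: number, sigma: number): Band[] {
  const out: Band[] = [];
  let start = 0; // index of the oldest point still inside the window
  let sum = 0;
  let sumSq = 0;
  for (let i = 0; i < points.length; i++) {
    const p = points[i];
    sum += p.value;
    sumSq += p.value * p.value;
    // Evict points that fell out of the window (p.time - windowMs, p.time].
    while (points[start].time <= p.time - windowMs) {
      sum -= points[start].value;
      sumSq -= points[start].value * points[start].value;
      start++;
    }
    const n = i - start + 1;
    const avg = sum / n;
    const sd = Math.sqrt(Math.max(0, sumSq / n - avg * avg));
    out.push({ ...p, avg, sd, upper: avg + sigma * sd, lower: avg - sigma * sd });
  }
  return out;
}

// Schema-preserving filter: same fields, only the points outside their band.
const outliersSketch = (bands: Band[]): Band[] =>
  bands.filter((b) => b.value > b.upper || b.value < b.lower);

// One point per minute: seven flat readings, then a spike.
const demoPoints: Point[] = [1, 1, 1, 1, 1, 1, 1, 10].map((value, i) => ({
  time: i * 60_000,
  value,
}));
const spikes = outliersSketch(baselineSketch(demoPoints, 10 * 60_000, 2));
// spikes holds only the final point (value 10).
```

Because the current point is included in its own window, a short window can never flag anything at sigma 2 in this sketch; whether pond-ts includes or excludes the current point is not stated here.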

Quick start: live (streaming)

import { LiveSeries, Sequence } from 'pond-ts';

// 1. Same schema as the batch example above; this is a live append buffer with retention.
const live = new LiveSeries({
  name: 'cpu',
  schema,
  retention: { maxAge: '10m' }, // keep only the last 10 minutes
});

// 2. Push as events arrive. Each push is validated against the schema.
live.push([Date.now(), 0.45, 128, 'api-1']);

// 3. Compose live views: incremental, push-driven, eviction-aware.
const recentAvg = live.rolling('5m', { cpu: 'avg' });
recentAvg.on('event', (e) => render(e.get('cpu'))); // render: your own draw callback, not part of pond-ts

// 4. Snapshot to a TimeSeries for batch analytics at any time.
const snap = live.toTimeSeries();

The full live surface (filter, map, select, window, aggregate, rolling, reduce, diff, rate, pctChange, fill, cumulative, sample) is incremental — events flow, views emit, retention bounds memory.
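
The phrase "events flow, views emit, retention bounds memory" can be sketched in a few lines. The class below is a hypothetical stand-in, not the LiveSeries implementation: RollingAvgSketch and its on / push methods are invented for this example, and it assumes events arrive in time order.

```typescript
// Hypothetical sketch of a push-driven rolling average with age-based eviction.
// Assumes in-order arrival; not taken from pond-ts.
type Sample = { time: number; value: number };
type Listener = (avg: number, at: number) => void;

class RollingAvgSketch {
  private readonly buffer: Sample[] = [];
  private readonly listeners: Listener[] = [];
  private readonly windowMs: number;
  private sum = 0;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  push(sample: Sample): void {
    this.buffer.push(sample);
    this.sum += sample.value;
    // Evict everything at least windowMs older than the newest sample, so the
    // buffer never holds more than one window of events.
    // (A ring buffer would avoid the O(n) cost of shift(); kept simple here.)
    while (this.buffer.length > 0 && this.buffer[0].time <= sample.time - this.windowMs) {
      this.sum -= this.buffer.shift()!.value;
    }
    const avg = this.sum / this.buffer.length;
    for (const listener of this.listeners) listener(avg, sample.time);
  }

  get size(): number {
    return this.buffer.length;
  }
}

const emitted: number[] = [];
const rollingDemo = new RollingAvgSketch(5 * 60_000);
rollingDemo.on((avg) => emitted.push(avg));
rollingDemo.push({ time: 0, value: 2 });
rollingDemo.push({ time: 60_000, value: 4 });
rollingDemo.push({ time: 300_000, value: 6 }); // the sample at time 0 is evicted here
// emitted is now [2, 3, 5]
```

Each push does a constant amount of work apart from eviction, and memory is bounded by the window length times the event rate.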

Quick start: multi-entity

partitionBy routes events into per-key buffers. Every stateful operator downstream of partitionBy runs per-partition automatically:

const perHost = cpu
  .partitionBy('host')
  .rolling('5m', { cpu: 'avg', cpu_sd: 'stdev' });

// .collect() fans the per-partition outputs back into a flat TimeSeries
// with the partition key auto-injected as a column.
const flat = perHost.collect();

Same shape on the live side — live.partitionBy('host') returns a LivePartitionedSeries whose rolling / fill / diff / sample methods all maintain per-partition state.
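
As a rough mental model for per-partition state, the sketch below keeps one piece of state per host in a Map and computes a per-host diff in a single pass. It is a hypothetical illustration (partitionedDiff is not a pond-ts function); the request counts reuse the rows from the batch quick start.

```typescript
// Hypothetical sketch: per-key state in a Map, output flattened with the key kept.
type HostRow = { time: number; host: string; requests: number };
type HostRowWithDiff = HostRow & { requestsDiff: number | undefined };

function partitionedDiff(rows: readonly HostRow[]): HostRowWithDiff[] {
  const lastByHost = new Map<string, number>(); // one state slot per partition
  return rows.map((row) => {
    const prev = lastByHost.get(row.host);
    lastByHost.set(row.host, row.requests);
    // The first row seen for a host has no predecessor, so its diff is undefined.
    return { ...row, requestsDiff: prev === undefined ? undefined : row.requests - prev };
  });
}

const hostRows: HostRow[] = [
  { time: 0, host: 'host1', requests: 120 },
  { time: 60_000, host: 'host2', requests: 135 },
  { time: 120_000, host: 'host1', requests: 141 },
  { time: 180_000, host: 'host1', requests: 128 },
  { time: 240_000, host: 'host3', requests: 166 },
];
const diffs = partitionedDiff(hostRows).map((r) => r.requestsDiff);
// diffs: [undefined, undefined, 21, -13, undefined]
```

Without partitioning, the second row would be diffed against host1's value and produce a meaningless cross-host delta.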

Quick start: bounded-memory sampling

At firehose rates, a long rolling baseline can exhaust the heap because the window buffers every event it spans. sample({ stride: N }) decouples baseline length from event rate; chain it between partitionBy and rolling:

// Per-host 1-in-10 stride feeding a per-host 5m baseline.
live
  .partitionBy('host')
  .sample({ stride: 10 })
  .rolling('5m', { cpu_avg: 'avg', cpu_sd: 'stdev' });
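
The arithmetic behind the stride is simple: a 5m window over R events/sec per host buffers roughly 300 × R events per host, and a stride of 10 cuts that to roughly 30 × R. A per-key stride sampler can be sketched as below; makeStrideSampler is a hypothetical helper, and keeping the first event of each stride (rather than the last) is an assumption.

```typescript
// Hypothetical sketch of per-key 1-in-N stride sampling; not pond-ts source.
function makeStrideSampler<K>(stride: number): (key: K) => boolean {
  const seenByKey = new Map<K, number>();
  return (key: K): boolean => {
    const seen = seenByKey.get(key) ?? 0;
    seenByKey.set(key, seen + 1);
    // Keep the 1st, (stride + 1)th, (2 * stride + 1)th, ... event for each key.
    return seen % stride === 0;
  };
}

const keepEvent = makeStrideSampler<string>(3);
const keptForA = [keepEvent('a'), keepEvent('a'), keepEvent('a'), keepEvent('a')];
const keptForB = keepEvent('b'); // counted independently of 'a'
// keptForA: [true, false, false, true]; keptForB: true
```

Counting per key matters: a single global counter would starve quiet hosts whenever a noisy host dominates the stream.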

For visualization, the snapshot side ships reservoir sampling too — single-pass Algorithm R, sorted by key, fixed point count regardless of source size:

const points = series.sample({ reservoir: { size: 500 } }).toRows();
// Up to 500 points drawn uniformly at random from the source.
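
For readers unfamiliar with Algorithm R, a minimal standalone version is sketched below. It is a textbook rendering of the algorithm, not code taken from pond-ts; reservoirSample and its injectable random parameter are names chosen for this example.

```typescript
// Textbook Algorithm R: a uniform sample of up to `size` items in one pass.
function reservoirSample<T>(
  source: Iterable<T>,
  size: number,
  random: () => number = Math.random,
): T[] {
  const reservoir: T[] = [];
  let seen = 0;
  for (const item of source) {
    seen++;
    if (reservoir.length < size) {
      reservoir.push(item); // fill phase: keep the first `size` items
    } else {
      // Replace a random slot with probability size / seen.
      const j = Math.floor(random() * seen); // uniform integer in [0, seen)
      if (j < size) reservoir[j] = item;
    }
  }
  return reservoir;
}

// 10,000 timestamps one second apart, reduced to 500 and re-sorted by key.
const sourceTimes = Array.from({ length: 10_000 }, (_, i) => i * 1000);
const picked = reservoirSample(sourceTimes, 500).sort((a, b) => a - b);
```

The reservoir itself is unordered, which is why the result is sorted by key afterwards before plotting. A source shorter than the requested size simply returns every item.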

Performance

pond-ts is faster than pondjs on every comparable operation, with no regressions: a ~17x geometric-mean speedup across the measurable ops, plus a handful of transforms (select / rename) that are effectively instant (O(1) column rebinds, below the timer's resolution). The advantage grows with data size.

| Category      | Speedup (N=16k)                                | Notes                                         |
| ------------- | ---------------------------------------------- | --------------------------------------------- |
| Rate          | ~120x                                          | Single columnar walk vs Pipeline              |
| Fill          | 77–87x                                         | Single columnar pass vs Pipeline per strategy |
| Aggregation   | 57–82x                                         | O(N+B) bucketing vs O(N×B) Pipeline           |
| Statistics    | 18–80x                                         | Typed-array reduce vs ImmutableJS iteration   |
| Alignment     | 42x                                            | Forward cursor vs repeated binary search      |
| Construction  | 13x                                            | Columnar intake vs ImmutableJS wrapping       |
| Chained       | 8x                                             | Derived constructors vs per-step Pipeline     |
| Transforms    | select/rename instant; collapse 30x; map ~4x   | Column reshapes vs Pipeline                   |
| Event access  | 6x                                             | Array indexing vs ImmutableJS get()           |
| Serialization | 4x                                             | Lightweight columnar representation           |

See the full benchmark results for detailed numbers. Run locally:

npm run build && node packages/core/bench/vs-pondjs.cjs
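
The "O(N+B) bucketing" note in the Aggregation row refers to visiting each event once and each bucket once. A minimal sketch of that shape over columnar typed arrays follows; bucketAverages is a hypothetical function written for this illustration and does not reproduce pond-ts internals.

```typescript
// Hypothetical sketch of single-pass bucketing over columnar data: O(N + B).
function bucketAverages(
  times: Float64Array,
  values: Float64Array,
  start: number,
  widthMs: number,
  bucketCount: number,
): Float64Array {
  const sums = new Float64Array(bucketCount);
  const counts = new Uint32Array(bucketCount);
  // O(N): each event computes its bucket index directly, with no search.
  for (let i = 0; i < times.length; i++) {
    const b = Math.floor((times[i] - start) / widthMs);
    if (b < 0 || b >= bucketCount) continue; // outside the requested range
    sums[b] += values[i];
    counts[b]++;
  }
  // O(B): finalize each bucket once; empty buckets become NaN.
  const out = new Float64Array(bucketCount);
  for (let b = 0; b < bucketCount; b++) {
    out[b] = counts[b] === 0 ? NaN : sums[b] / counts[b];
  }
  return out;
}

const bucketTimes = Float64Array.from([0, 30_000, 60_000, 150_000]);
const bucketValues = Float64Array.from([1, 3, 5, 7]);
const perMinute = bucketAverages(bucketTimes, bucketValues, 0, 60_000, 3);
// perMinute: [2, 5, 7]
```

An O(N×B) approach would instead scan the events once per bucket, which is where the gap widens as N grows.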

Documentation

The full guide is at https://pjm17971.github.io/pond-ts/.

  • Start here: five-minute walkthrough with batch, live, and React examples.
  • Concepts: temporal keys, sequences, windowing, partitioning, triggers, late data.
  • Transforms reference: every batch operator (queries, aggregation, alignment, rolling, smoothing, sampling, cleaning, reshape, anomaly detection).
  • Live reference: LiveSeries, live transforms, triggering.
  • How-to guides: building a dashboard, ingesting messy data.
  • API reference (auto-generated): TypeDoc output, every public class and method.
  • CHANGELOG: what shipped in each release.

Examples

  • pond-ts-dashboard — a working React dashboard that streams synthetic per-host CPU / request metrics, computes per-host rolling baselines, flags anomalies against ±σ bands, and renders everything as live line and bar charts (~600 lines of TypeScript). Walked through end-to-end in Building a dashboard.

Develop

The repo is an npm-workspaces monorepo with two published packages (pond-ts, @pond-ts/react). Node 18+ for runtime; Node 20+ for the docs site (Docusaurus).

npm install         # one-time, hoists deps for both packages
npm run build       # build both packages
npm test            # runtime + type-level tests on both packages
npm run format      # prettier write across the repo
npm run verify      # format check + build + test (CI parity)

packages/core/ is the pond-ts package; packages/react/ is @pond-ts/react. Docs live in website/.

License

MIT