@benkhz/context-manager

Version: 2.0.2 (published 3d ago)
Licence: MIT
Size: 29 kB
Weekly: 149

A vanilla JS class that manages LLM conversation context end-to-end. Zero runtime dependencies.

Installation

npm install @benkhz/context-manager

Quick start

import { AIContextManager, openaiPreset } from '@benkhz/context-manager'

const mgr = new AIContextManager({
  endpoint: 'https://api.openai.com/v1/chat/completions',
  model:    'gpt-4o',
  headers:  { Authorization: `Bearer ${process.env.OPENAI_KEY}` },
  hooks: {
    ...openaiPreset,
  },
})

const reply = await mgr.send('What is the capital of France?')
console.log(reply.content) // e.g. "The capital of France is Paris."

Design decisions

Concern	Decision	Rationale
Provider agnosticism	Caller supplies `formatRequest` + `parseResponse` hooks	No lock-in; presets ship for OpenAI and Anthropic as reference
Context compaction	Manager POSTs to the same endpoint with a summarise prompt	Automatic; `onCompact` hook overrides for custom logic
Reactive state	`setState / getState / subscribe` — callbacks fire synchronously	Familiar, zero magic, easy to test
Tool loop cap	10 iterations max	Guards against infinite loops without blocking legitimate multi-step reasoning
Context sizing	Character count approximation	Token counting requires a tokenizer dep; chars are close enough for limit-triggering
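
The character-count approximation in the last row can be pictured roughly as follows. This is a minimal sketch, not the library's code: `estimateChars` and `isOverLimit` are hypothetical names, and counting serialised tool-call arguments alongside `content` is an assumption about what contributes to the total.

```javascript
// Hypothetical helper: approximate context size by summing characters.
function estimateChars(messages) {
  return messages.reduce((total, m) => {
    const text = typeof m.content === 'string' ? m.content : ''
    // Assumption: tool-call payloads count towards the size too.
    const calls = m.toolCalls ? JSON.stringify(m.toolCalls) : ''
    return total + text.length + calls.length
  }, 0)
}

// 80_000 mirrors the documented contextLimit default.
function isOverLimit(messages, contextLimit = 80_000) {
  return estimateChars(messages) > contextLimit
}

const sample = [
  { role: 'user', content: 'hello' },
  { role: 'assistant', content: 'hi there' },
]
console.log(estimateChars(sample)) // 13
console.log(isOverLimit(sample))   // false
```

A character count overestimates or underestimates tokens depending on the language and content, which is why it is only used here to decide when to trigger compaction, not to enforce a provider's hard token limit.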

Constructor

new AIContextManager(config)

Option	Type	Default	Description
`endpoint`	`string`	required	URL to POST every LLM request to
`hooks`	`object`	required	See Hooks
`model`	`string`	—	Passed through to `formatRequest`
`maxTokens`	`number`	—	Passed through to `formatRequest`
`contextLimit`	`number`	`80_000`	Char count that triggers auto-compaction
`compactKeepLast`	`number`	`6`	Messages preserved verbatim after compaction
`injectSummary`	`boolean`	`true`	Auto-prepend the latest summary as a `system` message on every LLM request. Set `false` to place it yourself via `context.summary` in `formatRequest`.
`headers`	`object`	`{}`	Extra HTTP headers on every `fetch` call

Hooks

All hooks live in config.hooks. Only formatRequest and parseResponse are required.

Required

`formatRequest(context, config) → RequestBody`

Converts the internal context snapshot into the HTTP request body.

// context shape
{
  messages: [{ role, content, toolCalls?, toolCallId? }],
  tools:    [{ name, schema: { description, parameters } }],
  summary:  string | null,
}

// config shape (your constructor options + per-call system override)
{ endpoint, model, maxTokens, headers, system? }

`parseResponse(rawJson) → ParsedResponse`

Converts the raw HTTP response JSON into the internal shape.

// must return
{
  content:    string,                              // assistant text
  stopReason: string,                              // e.g. 'stop', 'tool_use'
  toolCalls?: [{ id, name, args }],               // present when model calls tools
}
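
To make the two required hooks concrete, here is a hedged sketch of a hand-written pair for an OpenAI-style chat completions endpoint. It is not the shipped `openaiPreset`; the mapping between the internal shape and the wire format is an assumption based on the shapes documented above, and it leaves summary placement to the default `injectSummary` behaviour.

```javascript
// Hypothetical formatRequest: internal context → chat completions request body.
function formatRequest(context, config) {
  const messages = context.messages.map(m => {
    const out = { role: m.role, content: m.content }
    if (m.toolCallId) out.tool_call_id = m.toolCallId
    if (m.toolCalls) {
      out.tool_calls = m.toolCalls.map(c => ({
        id: c.id,
        type: 'function',
        // The wire format carries arguments as a JSON string.
        function: { name: c.name, arguments: JSON.stringify(c.args) },
      }))
    }
    return out
  })
  // Per-call system override, if one was passed to send().
  if (config.system) messages.unshift({ role: 'system', content: config.system })

  const body = { model: config.model, messages }
  if (config.maxTokens) body.max_tokens = config.maxTokens
  if (context.tools.length > 0) {
    body.tools = context.tools.map(t => ({
      type: 'function',
      function: { name: t.name, ...t.schema },
    }))
  }
  return body
}

// Hypothetical parseResponse: raw response JSON → internal parsed shape.
function parseResponse(rawJson) {
  const choice = rawJson.choices[0]
  const parsed = {
    content: choice.message.content ?? '',
    stopReason: choice.finish_reason,
  }
  if (choice.message.tool_calls) {
    parsed.toolCalls = choice.message.tool_calls.map(c => ({
      id: c.id,
      name: c.function.name,
      args: JSON.parse(c.function.arguments),
    }))
  }
  return parsed
}
```

Both functions are pure, so they can be unit-tested with fixture objects without making any network calls.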

Optional lifecycle hooks

Hook	Signature	Description
`beforeSend`	`(messages[]) → messages[]`	Transform or filter the message array before each POST. Return the array.
`afterReceive`	`(parsed) → parsed`	Transform the parsed response before tool/message processing.
`onCompact`	`(overflow[], currentSummary) → string \| void`	Return a summary string to override the LLM-generated one.
`onToolCall`	`(name, args) → args \| void`	Intercept before a tool runs. Return new args to override.
`onToolResult`	`(name, result) → result`	Transform a tool result before appending to context.
`onError`	`(error, phase) → void`	Called on any error. `phase` is `'send'`, `'compact'`, or `'tool'`.
`onStateChange`	`(key, oldVal, newVal) → void`	Observe every state mutation globally.
`onContextLimit`	`(charCount, limit) → 'compact' \| 'truncate' \| 'error'`	Choose what happens when the context limit is hit. Defaults to `'compact'`.
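
As an illustration of the signatures in the table, the sketch below defines a few optional hooks as plain functions. The policies themselves (the 2× threshold, the 200-character summary budget, trimming the `city` argument) are hypothetical choices made for the example, not library defaults.

```javascript
const MAX_SUMMARY_CHARS = 200 // hypothetical budget for the local summary

// Hypothetical hook set; spread it into config.hooks next to a preset.
const lifecycleHooks = {
  // Fail loudly if the context is far beyond the limit; otherwise compact.
  onContextLimit(charCount, limit) {
    return charCount > limit * 2 ? 'error' : 'compact'
  },

  // Build a cheap local summary instead of asking the LLM for one.
  onCompact(overflow, currentSummary) {
    const lines = overflow.map(m => `${m.role}: ${String(m.content).slice(0, 40)}`)
    const previous = currentSummary ? [currentSummary] : []
    return [...previous, ...lines].join('\n').slice(-MAX_SUMMARY_CHARS)
  },

  // Normalise an argument before the tool runs; return nothing to keep args as-is.
  onToolCall(name, args) {
    if (name === 'getWeather') return { ...args, city: String(args.city).trim() }
  },

  // Log failures together with the phase that produced them.
  onError(error, phase) {
    console.error(`[${phase}] ${error.message}`)
  },
}
```

Because `onCompact` returns a string here, the summarise request to the LLM would be skipped whenever this hook is installed.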

Messaging API

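// Send a user message and wait for the assistant's reply
const reply = await mgr.send('your message')
// → { role: 'assistant', content: string }

// Override the system prompt for this call
await mgr.send('follow-up', { system: 'You are a pirate.' })

// Trigger compaction manually
await mgr.compact()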

Tool API

mgr.addTool(
  'getWeather',
  {
    description: 'Get current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string', description: 'City name' } },
      required: ['city'],
    },
  },
  async ({ city }) => ({ temp: 72, unit: 'F', condition: 'sunny' })
)

mgr.removeTool('getWeather')
mgr.getTools()   // → [{ name, schema }]
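
The 10-iteration cap listed under Design decisions suggests a loop along the following lines. This is a simplified sketch, not the library's implementation: `runToolLoop`, `callModel`, and `handlers` are hypothetical names, and throwing once the cap is reached is an assumption, since the source does not say what happens at that point.

```javascript
const MAX_TOOL_ITERATIONS = 10 // the documented cap

// callModel(messages) resolves to { content, stopReason, toolCalls? }.
// handlers maps a tool name to its async handler function.
async function runToolLoop(callModel, handlers, messages) {
  for (let i = 0; i < MAX_TOOL_ITERATIONS; i++) {
    const parsed = await callModel(messages)
    messages.push({ role: 'assistant', content: parsed.content, toolCalls: parsed.toolCalls })

    // No tool calls means the model has produced its final answer.
    if (!parsed.toolCalls || parsed.toolCalls.length === 0) return parsed

    // Run each requested tool and append its result for the next round.
    for (const call of parsed.toolCalls) {
      const result = await handlers[call.name](call.args)
      messages.push({ role: 'tool', toolCallId: call.id, content: JSON.stringify(result) })
    }
  }
  // Assumption: hitting the cap is reported as an error.
  throw new Error(`Tool loop exceeded ${MAX_TOOL_ITERATIONS} iterations`)
}
```

Each tool result is appended before the model is called again, so a turn with several tool calls grows the message list on every iteration.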

Event bus

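mgr.on('message:sent',      ({ message }) => { /* ... */ })
mgr.on('message:received',  ({ message }) => { /* ... */ })
mgr.on('tool:call',         ({ name, args }) => { /* ... */ })
mgr.on('tool:result',       ({ name, result }) => { /* ... */ })
mgr.on('context:compact',   ({ messageCount }) => { /* ... */ })
mgr.on('context:compacted', ({ summary }) => { /* ... */ })
mgr.on('state:change',      ({ key, oldValue, newValue }) => { /* ... */ })
mgr.on('error',             ({ error, phase }) => { /* ... */ })

// Keep a reference to the handler so it can be removed later
const handler = ({ message }) => console.log(message.content)
mgr.on('message:received', handler)
mgr.off('message:received', handler)

// Run a handler for the next matching event only
mgr.once('message:received', handler)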

Reactive state

mgr.setState('userId', 'u_123')
mgr.getState('userId')              // → 'u_123'

const unsub = mgr.subscribe('userId', (newVal, oldVal) => {
  console.log(`userId changed: ${oldVal} → ${newVal}`)
})
unsub()
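
The synchronous callback behaviour noted under Design decisions can be illustrated with a tiny stand-alone store. This is a sketch of the pattern only, not the library's code; `createStore` is a hypothetical name, and notifying on every `setState` call (even when the value is unchanged) is an assumption.

```javascript
// Hypothetical minimal store with the same three-method surface.
function createStore() {
  const values = new Map()
  const listeners = new Map() // key → Set of callbacks

  return {
    getState: key => values.get(key),
    setState(key, newVal) {
      const oldVal = values.get(key)
      values.set(key, newVal)
      // Callbacks run synchronously, before setState returns.
      for (const fn of listeners.get(key) ?? []) fn(newVal, oldVal)
    },
    subscribe(key, fn) {
      if (!listeners.has(key)) listeners.set(key, new Set())
      listeners.get(key).add(fn)
      return () => listeners.get(key).delete(fn) // unsubscribe function
    },
  }
}

const store = createStore()
const seen = []
const unsub = store.subscribe('userId', (newVal, oldVal) => seen.push([oldVal, newVal]))
store.setState('userId', 'u_123')
console.log(seen.length) // 1, the callback has already run
unsub()
store.setState('userId', 'u_456') // no callback after unsubscribing
```

Since nothing is deferred to a later tick, a test can assert on the callback's effect on the line immediately after `setState`.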

Introspection

mgr.getMessages()        // → Message[] — full, never-pruned turn history
mgr.getActiveMessages()  // → Message[] — current LLM-facing window (post-compaction)
mgr.getSummary()         // → string | null — latest summary
mgr.getSummaries()       // → string[] — every summary ever produced, oldest first
mgr.getTools()           // → [{ name, schema }]
mgr.getContext()         // → { messages, activeMessages, summary, summaries, tools }
mgr.reset()              // clear all message/summary state — returns this

getMessages() always returns every turn ever sent or received, even after compaction has shrunk the LLM-facing window — useful for rendering a full conversation transcript in a UI. getActiveMessages() returns what's actually being sent to the model right now.

Presets

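import { AIContextManager, openaiPreset, anthropicPreset } from '@benkhz/context-manager'

// OpenAI / Azure / Ollama / LM Studio
const openaiMgr = new AIContextManager({
  endpoint: 'https://api.openai.com/v1/chat/completions',
  hooks: {
    ...openaiPreset,
    // Hooks listed after the spread extend or override the preset
    beforeSend: msgs => msgs.filter(m => m.content),
  },
})

// Anthropic Messages API
const anthropicMgr = new AIContextManager({
  endpoint: 'https://api.anthropic.com/v1/messages',
  hooks: { ...anthropicPreset },
})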

Context compaction

The manager tracks two parallel message lists: the full history (everything ever sent or received, exposed via getMessages()) and the active window (getActiveMessages()) — the slice actually sent to the LLM, which compaction and truncation shrink. History is never pruned.

The limit check runs at the start of every send() call and again between tool-call iterations within a single turn. A request that triggers several tool calls in a row can grow the active window past contextLimit well before the turn finishes, so compaction can kick in mid-turn rather than waiting for the next send().

When the character count of the active window exceeds contextLimit:

1. The onContextLimit hook is called and returns 'compact' (default), 'truncate', or 'error'.
2. If it returns 'compact', the overflow messages are sent to the LLM with a summarise prompt.
3. The summary is stored (and appended to the summary history); the last compactKeepLast active messages are kept verbatim.
4. On the next request, the latest summary is auto-prepended as a system message ahead of the active window, unless injectSummary is false, in which case you place it yourself via context.summary in formatRequest.

The onCompact hook can return a string to bypass the LLM call entirely.

Both compaction and truncation snap their cut point to avoid splitting a tool-call/tool-result pair across the boundary — the kept window never starts with an orphaned tool message.
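
One way to picture the cut-point snapping: start from the naive cut that keeps the last compactKeepLast messages, then move the cut earlier while the kept window would begin with a tool result. This is a sketch only. `splitForCompaction` is a hypothetical name, and two details are assumptions not confirmed by the source: that tool results carry `role: 'tool'`, and that the cut moves backwards (keeping more messages rather than fewer).

```javascript
// Hypothetical helper: split the active window into overflow and kept parts.
function splitForCompaction(active, keepLast = 6) {
  let cut = Math.max(0, active.length - keepLast)
  // Walk back until the kept window no longer starts with a tool result,
  // so the assistant message that issued the call stays with its results.
  while (cut > 0 && active[cut]?.role === 'tool') cut--
  return { overflow: active.slice(0, cut), kept: active.slice(cut) }
}

const active = [
  { role: 'user', content: 'Weather in Paris?' },
  { role: 'assistant', content: '', toolCalls: [{ id: 't1', name: 'getWeather', args: { city: 'Paris' } }] },
  { role: 'tool', toolCallId: 't1', content: '{"temp":72}' },
  { role: 'assistant', content: 'It is 72F.' },
  { role: 'user', content: 'Thanks' },
  { role: 'assistant', content: 'Any time.' },
]

// A naive cut with keepLast = 4 would start the kept window at the tool result.
const { overflow, kept } = splitForCompaction(active, 4)
console.log(overflow.length, kept.length, kept[0].role) // 1 5 assistant
```

In this example the kept window grows from four messages to five so that the tool call and its result stay together.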

Open decisions

Token counting — currently approximated via character count. A future tokenizer option could accept a (messages) => number fn for more accurate limiting.
Streaming — send() is request/response only.
Persistence — getContext() returns a serialisable snapshot. A future loadContext(snapshot) method would complete the persistence story.
Multi-modal — content is currently assumed to be string.

Keywords

llm ai context-manager openai anthropic tool-calling agent