0.3.1 • Published 4d ago

arbr-client

Licence

Apache-2.0

Version

0.3.1

Deps

Size

30 kB

Vulns

Weekly

270

Summary Dependency Versions

arbr-client

Official JavaScript client for the Arbr AI control plane — one function to route, observe, and govern every LLM call your app makes.

Your app calls the gateway instead of provider SDKs. The gateway holds the provider keys, honors the model you pin (or picks one when you say "auto"), applies human-approved routing rules and cost policies, and logs every call with full cost attribution — visible in the dashboard.

Zero dependencies. Node ≥ 18 (global fetch). CommonJS, works from ESM via import.
One function for the 90% case — chat().
Robust by default — per-attempt timeouts, retries with exponential backoff + jitter on network errors / 429 / 5xx, typed errors, AbortSignal support.
TypeScript-ready — full type definitions included.

Install

npm install arbr-client
# (pre-release: npm install /path/to/arbr-client-0.1.0.tgz)

60-second quickstart

const { createClient } = require("arbr-client");

const arbr = createClient({
  baseUrl: "http://localhost:4100",   // or set ARBR_GATEWAY_URL
  application: "my-app",              // attribution — shows up in the dashboard
});

const res = await arbr.chat({
  messages: "Summarise this support ticket in two sentences: …",
  model: "auto",                      // let the gateway's router decide
  maxTokens: 300,
});

console.log(res.text);
console.log(res.model, res.routingDecision); // e.g. "gpt-4o-mini", "ai"

That's a complete integration. No provider keys in your app, and every call is logged, costed, and governable from the dashboard.

How model choice works

You send	What happens
`model: "gpt-4o"` (provider connected)	Honored as-is — all routing policies skipped. `routingDecision: "explicit"`
`model: "auto"` or omitted	The gateway decides: cache → operator rules → automated routing (cost guardrail or AI policy) → default model
`model` whose provider isn't connected	Falls back to the router (same as `"auto"`)

res.modelRequested always shows what you asked for, res.model what actually served it, and res.routingDecision why (explicit / rule / auto / ai / cache / fallback / passthrough).

If you don't send a taskType, the gateway infers one (keyword heuristic, or an AI classifier when AI routing is enabled) — res.classifiedBy tells you which (provided / keyword / ai).

API

`createClient(options) → client`

Option	Default	Notes
`baseUrl`	`process.env.ARBR_GATEWAY_URL`	Gateway origin. Required (here or via env).
`apiKey`	`process.env.ARBR_API_KEY`	Gateway API key (`ab_…`, dashboard → Settings → API keys). Sent as `Authorization: Bearer`; binds attribution server-side. Required once the gateway has Require API keys on.
`application`	—	Default attribution merged into every call. Strongly recommended.
`workflow`, `department`, `userId`	—	More default attribution.
`timeoutMs`	`60000`	Per attempt, via `AbortController`.
`retries`	`2`	Network errors / timeouts / 429 / 5xx only. Exponential backoff + jitter.
`fetch`	global `fetch`	Injectable for tests or custom agents.

`client.chat(request) → Promise<ChatResponse>`

Request fields: messages (required), model, provider, taskType, temperature, maxTokens, per-call application/workflow/department/userId (override the defaults), timeoutMs, retries, signal.

messages accepts any of:

"a bare string"                                       // → one user message
[{ role: "system", content: "…" }, { role: "user", content: "…" }]
[someLangChainMessage]                                 // anything with ._getType()

Response: { requestId, model, modelRequested, provider, routingDecision, classifiedBy, cacheHit, text, usage: { inputTokens, outputTokens, totalTokens } }.

Streaming

The gateway supports two streaming modes:

Real SSE (token-by-token) — use the OpenAI-compatible endpoint at POST /v1/chat/completions with stream: true. Works with the OpenAI Node.js SDK, Python SDK, any chat UI, or a raw fetch:

import OpenAI from "openai";

const openai = new OpenAI({ apiKey: "ab_…", baseURL: "http://localhost:4100" });
const stream = await openai.chat.completions.stream({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Tell me a joke" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

client.stream(request) → AsyncGenerator<{ text }> — the arbr-client stream() method calls POST /v1/chat, collects the full response, and yields it in small text chunks. Full routing metadata is available on the generator's return value. Useful when you want the attribution response (res.model, res.routingDecision, etc.) alongside a streaming-style emit:

for await (const chunk of arbr.stream({ messages: "…" })) {
  process.stdout.write(chunk.text);
}

Use the OpenAI-compat endpoint when you need real token-by-token delivery or are integrating with chat UIs. Use stream() when you want the routing metadata the OpenAI endpoint doesn't expose.

`client.status() → Promise<StatusResponse>`

Healthcheck against GET /api/status: { demoMode, liveProviders, defaultProvider, defaultModel, routingMode, breachedCaps }. When the gateway has admin auth enabled (ARBR_ADMIN_KEY set server-side), this endpoint requires a credential — your gateway apiKey is accepted, so set it and status() keeps working.

`client.models() → Promise<ModelsResponse>`

List every model available on this Arbr instance — GET /v1/models. Uses the same gateway API key as chat calls (no admin key needed).

const { data: models } = await arbr.models();

// Find all light-tier models, sorted by input cost
const cheap = models
  .filter(m => m.tier === "light")
  .sort((a, b) => a.inputPer1M - b.inputPer1M);

console.log(cheap[0].id, cheap[0].provider); // e.g. "us.amazon.nova-micro-v1:0", "bedrock-nova"

Response shape is OpenAI-compatible ({ object: "list", data: [...] }) with Arbr extensions on each model:

Field	Type	Description
`id`	string	Model ID — pass as `model:` in chat calls
`provider`	string	Underlying provider (`"openai"`, `"bedrock-nova"`, `"anthropic"`, …)
`label`	string	Human-readable name
`tier`	`"light"` \| `"mid"` \| `"premium"`	Cost tier
`inputPer1M`	number	USD per 1M input tokens
`outputPer1M`	number	USD per 1M output tokens

Because the response is OpenAI-compatible, it also works directly via the OpenAI Node.js SDK:

import OpenAI from "openai";
const openai = new OpenAI({ apiKey: "ab_…", baseURL: "https://arbr.gyde.ai/v1" });
const list = await openai.models.list();
list.data.forEach(m => console.log(m.id));

`client.providers() → Promise<ProvidersResponse>`

List which providers are live on this instance — GET /v1/providers. Returns { object: "list", data: [{ id, models: string[] }] }. No credentials exposed.

const { data: providers } = await arbr.providers();

for (const p of providers) {
  console.log(p.id, "→", p.models.length, "models");
}
// openai       → 2 models
// bedrock-nova → 11 models
// anthropic    → 3 models

`asLangChainModel(client, meta?) → { invoke, stream }`

For apps that centralise LLM calls behind a factory (the recommended chokepoint pattern): returns a duck-typed LangChain-style chat model — no LangChain dependency — whose .invoke(messages) resolves to an AIMessage-shaped object (content, usage_metadata, response_metadata) and .stream(messages) yields { content } chunks.

const model = asLangChainModel(arbr, { workflow: "answer-drafting", maxTokens: 1024 });
const msg = await model.invoke(messages);   // msg.content, msg.usage_metadata

Error handling

All failures throw GatewayError with status, code, retryable, and (when available) requestId:

`code`	Meaning	Retried automatically?
`invalid_input`	Bad arguments (caught before any network call)	no
`bad_request`	Gateway rejected the request (HTTP 400)	no
`demo_mode`	Gateway has no provider keys configured (HTTP 503)	no
`provider_error`	All providers failed for this call (HTTP 502)	yes (5xx)
`http_error`	Other non-2xx	429/5xx only
`invalid_api_key`	Missing/unknown/revoked gateway API key (HTTP 401)	no
`budget_exceeded`	A budget cap with action Block is breached for your scope (HTTP 429)	no — retrying won't help until the window rolls past
`rate_limited`	Your API key is over its requests/minute limit (HTTP 429)	yes
`network`	Connection failed	yes
`timeout`	Per-attempt timeout elapsed	yes
`aborted`	Your `AbortSignal` fired	no

const { GatewayError } = require("arbr-client");
try {
  await arbr.chat({ messages: "…" });
} catch (err) {
  if (err instanceof GatewayError && err.code === "demo_mode") {
    // gateway is up but has no provider keys — fall back or surface to ops
  } else {
    throw err;
  }
}

Integration recipes

Express handler:

const arbr = createClient({ baseUrl: process.env.ARBR_GATEWAY_URL, application: "support-api" });

app.post("/answer", async (req, res) => {
  const out = await arbr.chat({
    messages: req.body.messages,
    workflow: "answer-drafting",
    model: "auto",
    maxTokens: 1024,
  });
  res.json({ answer: out.text, model: out.model });
});

Gradual rollout behind an env flag (no business-code changes — swap at the factory):

function makeModel(opts) {
  if (!process.env.ARBR_GATEWAY_URL) return buildDirectProviderModel(opts); // unchanged path
  const arbr = createClient({ application: "my-app" });
  return asLangChainModel(arbr, opts);  // .invoke()/.stream() compatible
}

Unset ARBR_GATEWAY_URL to revert instantly.

License

MIT

Keywords

llm ai gateway routing control-plane openai anthropic gemini bedrock cost