@devvibex/aiflow
First-party LLM gateway SDK for apps generated on the VibeX platform. No third-party API key is needed in the browser — every request is routed through the platform gateway, which attaches the provider key server-side and bills the project owner's credit balance.
Install
npm install @devvibex/aiflow
# or
yarn add @devvibex/aiflow
# or
pnpm add @devvibex/aiflow
Requires Node.js 18+ (or any modern browser); the SDK uses the native fetch
and ReadableStream APIs.
Quick start
import AIFlowClient from '@devvibex/aiflow';
// No model defaults needed — the SDK fetches them from the server on
// first use (cached for the life of the client). Override only if you
// want to pin a specific model.
const aiflow = new AIFlowClient({
appId: 'YOUR_APP_ID', // injected by the platform gateway
baseUrl: 'https://api.vibe-x.app/aiflow',
});
// Optional — inspect / pre-warm the server config
const config = await aiflow.getConfig();
console.log(config.models.chat); // ["claude-sonnet-4-6", "gpt-4o", ...]
console.log(config.defaults.chat); // "claude-sonnet-4-6"
// Non-streaming chat
const res = await aiflow.chat({
system: 'You are a helpful assistant.',
messages: [{ role: 'user', content: 'Hello, who are you?' }],
});
console.log(res.content);
// Streaming chat
for await (const chunk of aiflow.chatStream({
messages: [{ role: 'user', content: 'Write a haiku about the sea.' }],
})) {
if (chunk.delta) process.stdout.write(chunk.delta);
if (chunk.done) break;
}
// Image generation
const img = await aiflow.generateImage({
prompt: 'a cozy cabin in a snowy forest, watercolour',
size: '1024x1024',
});
console.log(img.images[0].url);
// Embeddings
const emb = await aiflow.embed({ input: ['hello world'] });
console.log(emb.data[0].embedding.length);
API
new AIFlowClient(options)
| Option | Type | Default | Notes |
|---|---|---|---|
| appId | string | required | Public project app id. |
| baseUrl | string | required | Gateway URL, e.g. https://api.vibe-x.io/aiflow. |
| defaultModel | string | from GET /config | Override; wins over server config. |
| defaultImageModel | string | from GET /config | Override. |
| defaultEmbedModel | string | from GET /config | Override. |
| defaultVideoModel | string | from GET /config | Override. |
| defaultAudioModel | string | from GET /config | Override (music / Lyria). |
| defaultSfxModel | string | from GET /config | Override (sound effects / ElevenLabs). |
| endUserId | string | null | Current end-user id; scopes the knowledge base (x-end-user-id). Set/clear later with setEndUser. |
| timeout | number | 60000 | Per-request timeout (ms). |
| maxRetries | number | 2 | Retries on 429 / 5xx / network errors. |
| fetch | typeof fetch | globalThis.fetch | Custom fetch (Node <18, testing). |
When you don't pass a defaultModel, the SDK calls GET /config on the
first chat / chatStream / generateImage / embed and caches the
response. Concurrent first calls share a single in-flight config fetch.
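The shared in-flight fetch described above can be pictured with a small promise cache. This is a minimal sketch of the general technique, not the SDK's actual internals; `createConfigCache` and `loadConfig` are hypothetical names introduced only for illustration.

```javascript
// Sketch only: createConfigCache and loadConfig are hypothetical names,
// not part of @devvibex/aiflow.
function createConfigCache(loadConfig) {
  let pending = null; // the in-flight (or already settled) config promise

  return function getConfig({ refresh = false } = {}) {
    if (refresh || pending === null) {
      const attempt = loadConfig().catch((err) => {
        // Forget a failed fetch so the next call can try again.
        if (pending === attempt) pending = null;
        throw err;
      });
      pending = attempt;
    }
    // Every caller that arrives while the fetch is running gets the same promise.
    return pending;
  };
}
```

Because callers receive the same promise object, two concurrent first calls trigger one `loadConfig()` invocation, and `{ refresh: true }` simply replaces the cached promise with a new one.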
aiflow.getConfig({ refresh? })
Returns the cached server config, fetching on demand. Pass { refresh: true }
to invalidate the cache.
{
appId: string,
models: {
chat: string[], // e.g. ["claude-sonnet-4-6", "gpt-4o", ...]
image: string[],
embed: string[],
video: string[],
audio: string[], // Lyria — e.g. ["lyria-3-clip-preview", ...]
sfx: string[], // ElevenLabs — e.g. ["eleven_text_to_sound_v2"]
},
defaults: { chat: string, image: string, embed: string, video: string, audio: string, sfx: string },
limits: {
maxTokens: number,
maxImageN: number,
maxEmbedInputs: number,
minSfxDurationSeconds: number,
maxSfxDurationSeconds: number,
},
creditValue: number,
}
aiflow.chat({ messages, model?, system?, maxTokens?, temperature?, signal? })
Returns Promise<{ id, model, content, usage: { inputTokens, outputTokens, creditsUsed } }>.
aiflow.chatStream({ ... })
Async iterable of { delta: string, done: boolean } chunks. The final chunk
has done: true and carries the cumulative usage object.
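When an app needs the full reply as well as the streamed deltas, the chunks can be folded into one result. The helper below is a hedged sketch: `collectChatStream` is a hypothetical name, and it assumes the final chunk exposes the cumulative totals under a `usage` property, which the text above implies but does not spell out.

```javascript
// collectChatStream is a hypothetical helper, not an SDK export.
// It accepts any async iterable of { delta, done } chunks.
async function collectChatStream(stream) {
  let text = '';
  let usage = null;
  for await (const chunk of stream) {
    if (chunk.delta) text += chunk.delta;
    if (chunk.done) {
      // Assumption: the final chunk carries the totals as `usage`.
      usage = chunk.usage ?? null;
      break;
    }
  }
  return { text, usage };
}
```

A call might look like `const { text, usage } = await collectChatStream(aiflow.chatStream({ messages }))`.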
aiflow.generateImage({ prompt, model?, size?, n?, image?, images?, system?, language?, signal? })
Returns { images: [{ url?, b64?, mimeType }], usage: { creditsUsed } }.
url is a short-lived signed URL (~1h). Pass image/images to edit an
existing image (image-to-image) instead of generating from text.
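Since each returned image may carry either `url` or `b64`, display code has to handle both. The following is a small sketch; `imageToSrc` is a hypothetical helper, and it assumes `b64` holds raw base64 data without a `data:` prefix.

```javascript
// imageToSrc is a hypothetical helper, not an SDK export.
// Assumption: `b64` is raw base64 (no "data:" prefix).
function imageToSrc(image) {
  if (image.url) return image.url; // short-lived signed URL
  if (image.b64) return `data:${image.mimeType};base64,${image.b64}`;
  throw new Error('image has neither url nor b64');
}
```

Because the signed URL expires after roughly an hour, an app that needs the image later would have to store the bytes itself rather than the URL.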
aiflow.generateVideo({ prompt, model?, aspectRatio?, image?, video?, signal? })
Returns { videoUrl, usage: { creditsUsed } }. Pass image for image-to-video
or video for video-to-video; omit both for pure text-to-video.
aiflow.generateAudio({ prompt, model?, negativePrompt?, seed?, signal?, timeout? })
Generate music / background audio from a text prompt (Google Vertex AI Lyria).
Returns { audioUrl, model, durationSeconds, usage: { creditsUsed } }. audioUrl
is a hosted mp3/wav on the CDN. prompt should be a musical description (mood,
genre, tempo/BPM, instrumentation). negativePrompt/seed apply to Lyria 2 only.
aiflow.generateSfx({ prompt, model?, durationSeconds?, signal?, timeout? })
Generate a one-shot sound effect from a text prompt (ElevenLabs). Returns
{ audioUrl, model, durationSeconds, usage: { creditsUsed } }. durationSeconds
is clamped to 0.5–30; omit it to let the model auto-decide. Cache/replay
repeated UI sounds client-side rather than calling per interaction.
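One way to follow the caching advice above is to memoize the returned URL per prompt. This is a sketch under assumptions: `createSfxCache` is a hypothetical helper, the `generate` argument stands in for a call such as `(params) => aiflow.generateSfx(params)`, and the returned `audioUrl` is assumed to stay valid for as long as the page lives.

```javascript
// createSfxCache is a hypothetical helper, not an SDK export.
function createSfxCache(generate) {
  const cache = new Map(); // key -> Promise<string> resolving to the audio URL

  return function getSfxUrl(prompt, durationSeconds) {
    const key = `${prompt}|${durationSeconds ?? 'auto'}`;
    if (!cache.has(key)) {
      // Omit durationSeconds entirely when it was not requested.
      const params = durationSeconds === undefined ? { prompt } : { prompt, durationSeconds };
      const request = generate(params)
        .then((res) => res.audioUrl)
        .catch((err) => {
          cache.delete(key); // do not cache failures
          throw err;
        });
      cache.set(key, request);
    }
    return cache.get(key);
  };
}
```

Repeated clicks on the same button then reuse one generated sound instead of spending credits on each interaction.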
aiflow.embed({ input, model?, signal? })
Returns { data: [{ embedding: number[], index }], usage: { creditsUsed } }.
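A common use of the returned vectors is ranking texts by similarity to a query. The sketch below uses cosine similarity, which is a standard choice rather than anything the SDK prescribes; `cosineSimilarity` and `rankBySimilarity` are hypothetical helper names.

```javascript
// Hypothetical helpers, not SDK exports.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / Math.sqrt(normA * normB);
}

// `data` has the shape returned by embed(): [{ embedding, index }, ...].
function rankBySimilarity(queryEmbedding, data) {
  return data
    .map(({ embedding, index }) => ({ index, score: cosineSimilarity(queryEmbedding, embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

The `index` field of each ranked entry points back to the position of the matching string in the original `input` array.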
aiflow.setEndUser(endUserId) + aiflow.knowledge.* — per-user knowledge base (RAG)
Let your app's end users upload their own documents; AIFlow vectorizes them transparently and automatically grounds that user's chat answers in the relevant passages. Knowledge is scoped per end user — a user only ever retrieves their own uploads. You don't implement retrieval/embeddings yourself.
// 1. After your app authenticates a user, identify them (sent as x-end-user-id
// on every knowledge AND chat call). Pass null on logout.
aiflow.setEndUser(currentUser.id);
// 2. Upload a document the user picked (pdf, docx, txt, csv, md, xlsx, pptx).
const source = await aiflow.knowledge.ingest(file); // a File/Blob
// → { id, name, status: 'processing', ... } — vectorizes in the background.
// 3. Show the user their library (poll while any is 'processing').
const mine = await aiflow.knowledge.listMine();
// → [{ id, name, status: 'ready'|'processing'|'error', chunk_count, error_message? }]
// 4. Let them remove one.
await aiflow.knowledge.deleteMine(source.id); // → { success: true }
// 5. Just chat — passages from the user's READY docs are injected automatically.
const res = await aiflow.chat({ messages: [{ role: 'user', content: '...' }] });
setEndUser is required before knowledge.ingest (uploads are rejected without an end-user id), and chat only grounds answers in personal docs when it is set. Ingestion is billed to the app owner. Don't tell users their files are "vectorized"; from their point of view, they uploaded a file that the assistant can now use.
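The polling mentioned in step 3 could look like the sketch below. `waitUntilProcessed` is a hypothetical helper, and the interval and attempt limits are illustrative values rather than documented recommendations.

```javascript
// waitUntilProcessed is a hypothetical helper, not an SDK export.
// `listMine` is any function returning Promise<[{ status, ... }]>.
async function waitUntilProcessed(listMine, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const sources = await listMine();
    // Finished once nothing is still 'processing' (entries are 'ready' or 'error').
    if (!sources.some((source) => source.status === 'processing')) return sources;
    if (attempt < maxAttempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error('documents are still processing; gave up polling');
}
```

It might be called as `await waitUntilProcessed(() => aiflow.knowledge.listMine())`; the resolved list can still contain entries with status 'error', so the caller should check for those.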
aiflow.connectLiveTranslate({ targetLang, sourceLang?, mode?, voice?, echoTargetLanguage?, ...handlers })
Real-time speech translation over WebSocket (Gemini 3.5 Live Translate, routed through the gateway — no Google key in the browser). Stream microphone audio in and receive spoken translation + source/target transcripts back, live.
Requires the optional peer dependency socket.io-client:
npm i socket.io-client
Audio-only input — the translate model does not accept text input. Audio
formats are fixed by the model: input = 16 kHz, 16-bit, mono PCM;
output (onAudio) = 24 kHz PCM. The source language is auto-detected
(sourceLang is just an optional hint). Mic capture and playback stay in your
app — the session is transport-only.
const session = await aiflow.connectLiveTranslate({
targetLang: 'ko', // required — translate INTO this language
// sourceLang is optional (auto-detected)
mode: 'speech', // 'speech' (default) → render audio + transcripts; 'text' → transcripts only
onTranscript: ({ role, text }) => {
// role: 'input' = what you said, 'output' = the translation
console.log(role, text);
},
onAudio: ({ chunk }) => {
// base64 24 kHz PCM — decode + queue into a Web Audio buffer to play
playPcm(chunk);
},
onTurnComplete: ({ interrupted }) => {
if (interrupted) stopPlayback(); // user barged in — drop stale audio
},
onError: ({ code, message }) => console.warn(code, message),
onClosed: ({ reason, usage }) => console.log('closed', reason, usage),
});
// Pump 16 kHz mono PCM frames from the mic (base64 string, ArrayBuffer, or typed array)
micProcessor.onaudioframe = (pcm) => session.sendAudio(pcm);
// When the mic turns off / you're done
session.audioEnd();
session.stop();
Modes. mode: 'speech' (default) renders spoken translated audio and the
source/translated transcripts. mode: 'text' renders transcripts only
(captions/subtitles). The model always streams audio + transcripts; the mode is
a client-side hint for what to render. echoTargetLanguage (default true)
controls whether audio is still spoken when the input is already in the target
language.
Session API: start(params?), sendAudio(chunk), audioEnd(), stop(),
close(), and on(event, cb) where event is one of
ready | transcript | audio | turn_complete | error | closed. (sendText exists
but the translate model rejects text input.)
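Because mic capture and playback stay in the app, the app has to convert between Web Audio's float samples and the PCM formats listed above. The sketch below shows one way to do that. Both function names are hypothetical, the input is assumed to be already resampled to 16 kHz mono, and the output audio is assumed to be 16-bit little-endian PCM (the text above states only "24 kHz PCM", so verify the bit depth before relying on this).

```javascript
// Hypothetical helpers, not SDK exports.

// Float32 samples in [-1, 1] (already 16 kHz mono) -> 16-bit signed PCM.
function floatTo16BitPcm(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i += 1) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range samples
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// base64 PCM chunk -> Float32 samples in [-1, 1) for a Web Audio buffer.
// Assumption: the chunk is 16-bit little-endian PCM.
function base64PcmToFloat32(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) bytes[i] = binary.charCodeAt(i);
  const view = new DataView(bytes.buffer);
  const out = new Float32Array(bytes.length >> 1); // two bytes per sample
  for (let i = 0; i < out.length; i += 1) {
    out[i] = view.getInt16(i * 2, true) / 0x8000;
  }
  return out;
}
```

The Int16Array from `floatTo16BitPcm` can be passed to `session.sendAudio`, which accepts typed arrays, and the Float32Array from `base64PcmToFloat32` can be copied into an AudioBuffer created with a 24000 Hz sample rate.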
Billing. Metered against the project owner's credits per token (audio tokens
cost more than text), settled periodically and on close. If the balance runs
out mid-session you'll get onError({ code: 'INSUFFICIENT_CREDITS' }) and the
session closes.
Error handling
All failures throw AIFlowError with .status, .code, and .requestId.
import { AIFlowError } from '@devvibex/aiflow';
try {
await aiflow.chat({ messages });
} catch (err) {
if (err instanceof AIFlowError && err.status === 402) {
alert('AI credits exhausted — please contact the app owner.');
} else {
throw err;
}
}
| Status | Meaning | Retry? |
|---|---|---|
| 401 | Invalid / revoked appId | No — surface as a config error. |
| 402 | Credits exhausted | No — never retry. |
| 429 | Rate limited | Yes, auto (exponential backoff). |
| 5xx | Transient server error | Yes, auto (exponential backoff). |
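The retry column above can be read as a small policy. The sketch below illustrates that policy only; it is not the SDK's implementation, the SDK already retries on its own, and the base delay and cap are made-up values. All three function names are hypothetical. Unlike the SDK, this sketch does not retry network errors, because they carry no status.

```javascript
// Hypothetical helpers illustrating the table; not SDK exports.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at capMs (illustrative values).
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function withRetries(operation, options = {}) {
  const {
    maxRetries = 2,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      // 401 and 402 fall through here and are rethrown immediately.
      if (attempt >= maxRetries || !isRetryable(err.status)) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Wrapping SDK calls in `withRetries` would stack on top of the built-in maxRetries, so the sketch is more useful for non-SDK requests or for understanding the behaviour than for production use with aiflow itself.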
Cancellation
All methods accept an AbortSignal:
const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 3000);
await aiflow.chat({ messages, signal: ctrl.signal });
Publishing to npmjs
Owned by the @devvibex org. You must be a member to publish.
1. One-off setup
# Log in with an account that has access to the @devvibex org
npm login
Enable 2FA on the account; npm requires it for scoped org packages.
2. Bump the version
Follow semver. Bump in sdk/aiflow/package.json:
cd sdk/aiflow
npm version patch # or minor / major
npm version creates a git tag automatically; push it with
git push --follow-tags if you want the tag in the shared repo.
3. Dry-run the tarball
Double-check what will be uploaded before going public:
npm pack --dry-run
Only src/, README.md, LICENSE, and package.json should appear;
the files whitelist in package.json enforces this.
4. Publish
Scoped packages default to private on npm, so publish explicitly as public
(this is already set via publishConfig in package.json, but pass the flag
anyway to be safe):
npm publish --access public
npm will prompt for a 2FA code. On success the package appears at https://www.npmjs.com/package/@devvibex/aiflow.
5. Verify
# From any scratch dir
npm info @devvibex/aiflow
npm install @devvibex/aiflow@latest
6. Deprecate / unpublish (if needed)
# Soft-deprecate a bad version
npm deprecate @devvibex/aiflow@1.0.1 'bug in streaming parser; use 1.0.2+'
# Hard unpublish (only within 72h of publishing)
npm unpublish @devvibex/aiflow@1.0.1
CI publishing (optional)
Add an NPM_TOKEN (automation token) secret to the GitHub repo, then:
# .github/workflows/publish-aiflow.yml
name: Publish @devvibex/aiflow
on:
  push:
    tags: ['aiflow-v*']
jobs:
  publish:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: sdk/aiflow
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          registry-url: 'https://registry.npmjs.org'
      - run: npm publish --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
Trigger a release by creating and pushing a matching tag: git tag aiflow-v1.0.1 && git push --tags.
License
MIT