Licence: MIT
Version: 0.17.31
Dependencies: 4
Size: 1.3 MB
Vulnerabilities: 0
Weekly downloads: 1.1K

ai.libx.js

A unified, stateless API bridge for AI models, including LLMs, image/video generation, TTS, and STT. It is edge-compatible and designed for serverless environments such as Vercel Edge Functions and Cloudflare Workers.

Features

  • Unified API - Single interface for multiple AI providers
  • 12 Providers - OpenAI, Anthropic, Google, Groq, Mistral, Cohere, XAI, DeepSeek, AI21, OpenRouter, Cloudflare, Moonshot
  • Streaming Support - Real-time streaming responses from all compatible providers
  • Plain Text Mode - Raw text output without JSON wrapping
  • Smart Model Resolution - Use short aliases like claude, gpt4o, gemini instead of full names
  • Model Normalization - Intelligent alias resolution (e.g., gpt-5 → chatgpt-4o-latest)
  • Reasoning Model Support - Automatic detection and parameter adjustment for o1/o3/R1 models
  • Request Logging - Built-in metrics tracking with detailed statistics
  • Multimodal - Images (Anthropic, OpenAI, Google), audio (OpenAI, Google), video (Google); text-only adapters throw a clear error
  • Stateless Design - No state management, pass API keys per request or globally
  • Edge Compatible - Works with Vercel Edge Functions and Cloudflare Workers
  • Tree-Shakeable - Import only what you need
  • Type-Safe - Full TypeScript support with comprehensive types
  • Zero Dependencies - No external runtime dependencies

Installation

npm install ai.libx.js

Usage

Pattern 1: Generic Client (Runtime Model Selection)
import AIClient from 'ai.libx.js';

// Initialize with API keys and enable logging
const ai = new AIClient({
    apiKeys: {
        openai: process.env.OPENAI_API_KEY,
        anthropic: process.env.ANTHROPIC_API_KEY,
        google: process.env.GOOGLE_API_KEY,
    },
    enableLogging: true, // Track metrics
});

// Non-streaming chat (with smart model resolution)
const response = await ai.chat({
    model: 'gpt4o', // Short alias instead of 'openai/gpt-4o'
    messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(response.content);
console.log(ai.getStats()); // View metrics

// Streaming chat
const stream = await ai.chat({
    model: 'sonnet', // Short alias for a full 'anthropic/...' Sonnet model ID
    messages: [{ role: 'user', content: 'Write a story' }],
    stream: true,
});

for await (const chunk of stream) {
    process.stdout.write(chunk.content);
}

// Plain text mode (raw output)
const plainResponse = await ai.chat({
    model: 'openai/gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
    plain: true, // Returns plain text
});
Pattern 2: Direct Provider Adapter
import { OpenAIAdapter, AnthropicAdapter } from 'ai.libx.js/adapters';

// Work directly with a specific provider
const openai = new OpenAIAdapter({
    apiKey: process.env.OPENAI_API_KEY,
});

const response = await openai.chat({
    model: 'gpt-4o', // No vendor prefix needed
    messages: [{ role: 'user', content: 'Hello!' }],
    temperature: 0.7,
});

API Reference

AIClient
Constructor Options
const ai = new AIClient({
    apiKeys?: Record<string, string>;        // API keys by provider
    baseUrls?: Record<string, string>;       // Custom base URLs
    cloudflareAccountId?: string;            // For Cloudflare Workers AI
    enableLogging?: boolean;                 // Track request metrics (see Request Logging)
});
chat(options)
await ai.chat({
    model: string;                  // "provider/model-name" or a short alias (e.g. 'gpt4o')
    messages: Message[];            // Conversation messages
    apiKey?: string;                // Override global API key
    temperature?: number;           // 0-2, default varies by provider
    maxTokens?: number;             // Max tokens to generate
    topP?: number;                  // 0-1, nucleus sampling
    topK?: number;                  // Top-k sampling
    frequencyPenalty?: number;      // -2 to 2
    presencePenalty?: number;       // -2 to 2
    stop?: string | string[];       // Stop sequences
    stream?: boolean;               // Enable streaming
    plain?: boolean;                // Return plain text (see Plain Text Mode)
    providerOptions?: object;       // Provider-specific options
});
Supported Providers
Provider | Prefix | Models
OpenAI | openai/ | GPT-4, GPT-3.5, etc.
Anthropic | anthropic/ | Claude 3/4 series
Google | google/ | Gemini 1.0/1.5/2.0
Groq | groq/ | LLaMA, Mixtral, Gemma
Mistral | mistral/ | Mistral, Mixtral series
Cohere | cohere/ | Command series
XAI | xai/ | Grok series
DeepSeek | deepseek/ | DeepSeek V4 (Flash/Pro), legacy chat/reasoner
AI21 | ai21/ | Jamba
OpenRouter | openrouter/ | Multi-model proxy
Cloudflare | cloudflare/ | Workers AI models
Moonshot | moonshot/ | Kimi K2 series
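The prefix convention in the table can be illustrated with a small parser. This is only a sketch: `splitModelId` is a hypothetical helper, not part of ai.libx.js (the library's own `getProviderFromModel` appears under Model Utilities), and it assumes the provider is everything before the first slash so that nested IDs such as OpenRouter's stay intact.

```typescript
// Hypothetical helper, not part of ai.libx.js: splits "provider/model-name"
// at the FIRST slash only, so nested model IDs keep their remaining slashes.
function splitModelId(id: string): { provider: string; model: string } | null {
    const slash = id.indexOf('/');
    // No slash, or nothing on one side of it: not a prefixed model ID.
    if (slash <= 0 || slash === id.length - 1) return null;
    return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

const nested = splitModelId('openrouter/meta-llama/llama-3-70b-instruct');
console.log(nested?.provider); // "openrouter"
console.log(nested?.model);    // "meta-llama/llama-3-70b-instruct"
console.log(splitModelId('gpt4o')); // null (a bare alias has no prefix)
```

Splitting at the first slash rather than the last matters for proxy providers, whose model names carry their own vendor segment.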
Types
interface Message {
    role: 'system' | 'user' | 'assistant' | 'tool';
    content: string | ContentPart[]; // string or multimodal parts
    name?: string;
    tool_call_id?: string;
}

// Multimodal content part
interface ContentPart {
    type: 'text' | 'image_url' | 'audio_url' | 'video_url';
    text?: string;
    image_url?: { url: string; detail?: 'auto' | 'low' | 'high' };
    audio_url?: { url: string; format?: string }; // data: URI or Files API URI
    video_url?: { url: string }; // data: URI or Files API URI
}

interface ChatResponse {
    content: string;
    finishReason?: string;
    usage?: {
        promptTokens: number;
        completionTokens: number;
        totalTokens: number;
    };
    model: string;
    raw?: any; // Original provider response
}

interface StreamChunk {
    content: string;
    finishReason?: string;
    index?: number;
}
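Given the `StreamChunk` shape above, a streamed reply can be accumulated into a single string. The following is a minimal sketch, assuming only that the stream is an async iterable of `StreamChunk`; `collectStream` and `fakeStream` are hypothetical names used for illustration, not library exports.

```typescript
// Local copy of the StreamChunk shape documented above.
interface StreamChunk {
    content: string;
    finishReason?: string;
    index?: number;
}

// Hypothetical helper: concatenates chunk contents and remembers the last
// finishReason seen. Works with any AsyncIterable of StreamChunk.
async function collectStream(
    stream: AsyncIterable<StreamChunk>,
): Promise<{ text: string; finishReason: string | undefined }> {
    let text = '';
    let finishReason: string | undefined;
    for await (const chunk of stream) {
        text += chunk.content;
        if (chunk.finishReason) finishReason = chunk.finishReason;
    }
    return { text, finishReason };
}

// Stand-in stream for demonstration; a real one would come from a chat call
// made with stream: true.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
    yield { content: 'Hel' };
    yield { content: 'lo' };
    yield { content: '!', finishReason: 'stop' };
}

collectStream(fakeStream()).then((result) => {
    console.log(result.text);         // "Hello!"
    console.log(result.finishReason); // "stop"
});
```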

Multimodal

Use helper functions or build ContentPart[] arrays directly.

import { createVisionMessage, createAudioMessage, createVideoMessage } from 'ai.libx.js';

// Image: Anthropic and OpenAI accept public URLs; Google requires a data URI or Files API URI
const imageMsg = createVisionMessage('What is in this image?', 'https://example.com/photo.jpg');
// or base64: createVisionMessage('Describe this', 'data:image/png;base64,...')

// Audio: base64 data URI (OpenAI and Google); Google also accepts Files API URIs
// OpenAI supported formats: wav, mp3, flac, opus, pcm16
const audioMsg = createAudioMessage('Transcribe this', 'data:audio/wav;base64,...');

// Video: Google only; data URI (≤100 MB inline) or Files API URI
const videoMsg = createVideoMessage('Summarize this video', 'data:video/mp4;base64,...');
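For the "build ContentPart[] arrays directly" route, a message can be assembled by hand from the types in the Types section. This is a sketch under assumptions: `buildImageMessage` is a hypothetical function, and whether the library's helpers produce exactly this structure is not stated in this document.

```typescript
// Local copies of the documented types, trimmed to the fields used here.
interface ContentPart {
    type: 'text' | 'image_url' | 'audio_url' | 'video_url';
    text?: string;
    image_url?: { url: string; detail?: 'auto' | 'low' | 'high' };
}

interface Message {
    role: 'system' | 'user' | 'assistant' | 'tool';
    content: string | ContentPart[];
}

// Hypothetical builder: one text part followed by one image part per URL.
function buildImageMessage(text: string, imageUrls: string[]): Message {
    const parts: ContentPart[] = [{ type: 'text', text }];
    for (const url of imageUrls) {
        parts.push({ type: 'image_url', image_url: { url, detail: 'auto' } });
    }
    return { role: 'user', content: parts };
}

const message = buildImageMessage('Compare these two photos', [
    'https://example.com/a.jpg',
    'https://example.com/b.jpg',
]);
console.log(JSON.stringify(message, null, 2));
```

Building the array by hand is mainly useful when one message needs several media parts.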
Provider multimodal support
Provider | Images | Audio | Video
Anthropic | URL, base64, or Files API (anthropic-file://file_id) | - | -
OpenAI | URL or base64 | base64 only | -
Google | base64 / Files API | base64 / Files API | base64 / Files API
All others | throws | throws | throws

Anthropic Files API: Upload an image once via POST https://api.anthropic.com/v1/files (with header anthropic-beta: files-api-2025-04-14), then reuse the file_id across requests with createAnthropicFileMessage(text, fileId) or by passing anthropic-file://<file_id> as the image URL. The beta header is added automatically.

Size limits: Anthropic images ≤5 MB; Google audio inline ≤20 MB / 9.5 h, video inline ≤100 MB. Use the Google Files API for larger media.
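The size limits above can be checked before a request is sent. The sketch below estimates the decoded size of a base64 data URI from its length. `dataUriByteLength`, `fitsInline`, and `INLINE_LIMITS` are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption; providers may count differently.

```typescript
// Limits quoted in the paragraph above, in bytes (1 MB = 1024 * 1024 is assumed).
const MB = 1024 * 1024;
const INLINE_LIMITS = {
    anthropicImage: 5 * MB,
    googleAudio: 20 * MB,
    googleVideo: 100 * MB,
} as const;

// Estimates the decoded byte length of a base64 data URI without decoding it.
// Returns null if the input is not a base64 data URI.
function dataUriByteLength(uri: string): number | null {
    const marker = ';base64,';
    const at = uri.indexOf(marker);
    if (!uri.startsWith('data:') || at === -1) return null;
    const payload = uri.slice(at + marker.length);
    const padding = payload.endsWith('==') ? 2 : payload.endsWith('=') ? 1 : 0;
    // Every 4 base64 characters encode 3 bytes; padding marks unused bytes.
    return Math.floor((payload.length * 3) / 4) - padding;
}

// Hypothetical pre-flight check: true when the payload fits the given limit.
function fitsInline(uri: string, limit: number): boolean {
    const size = dataUriByteLength(uri);
    return size !== null && size <= limit;
}

const tiny = 'data:image/png;base64,aGVsbG8='; // payload decodes to 5 bytes
console.log(dataUriByteLength(tiny));                        // 5
console.log(fitsInline(tiny, INLINE_LIMITS.anthropicImage)); // true
```

Media that fails such a check would go through the Google Files API instead, as noted above.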

Provider-Specific Notes

Cloudflare Workers AI

Requires account ID:

const ai = new AIClient({
    apiKeys: { cloudflare: 'YOUR_API_KEY' },
    cloudflareAccountId: 'YOUR_ACCOUNT_ID',
});
OpenRouter

Supports custom headers:

await ai.chat({
    model: 'openrouter/meta-llama/llama-3-70b-instruct',
    messages: [...],
    providerOptions: {
        httpReferer: 'https://yourapp.com',
        xTitle: 'Your App Name',
    }
});

Edge Runtime Compatibility

This library is designed to work in edge environments:

// Vercel Edge Function
import AIClient from 'ai.libx.js';

export const config = { runtime: 'edge' };

export default async function handler(req: Request) {
    const ai = new AIClient({
        apiKeys: { openai: process.env.OPENAI_API_KEY },
    });

    const response = await ai.chat({
        model: 'openai/gpt-4o-mini',
        messages: [{ role: 'user', content: 'Hello!' }],
    });

    // Declare the body as JSON so clients parse it correctly
    return new Response(JSON.stringify(response), {
        headers: { 'Content-Type': 'application/json' },
    });
}

Error Handling

import {
    AILibError,
    AuthenticationError,
    InvalidRequestError,
    RateLimitError,
    ModelNotFoundError,
    ProviderError
} from 'ai.libx.js';

try {
    const response = await ai.chat({...});
} catch (error) {
    if (error instanceof AuthenticationError) {
        console.error('Invalid API key');
    } else if (error instanceof RateLimitError) {
        console.error('Rate limit exceeded');
    } else if (error instanceof ModelNotFoundError) {
        console.error('Model not found');
    } else {
        throw error; // Not one of the handled error types: rethrow
    }
}
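Rate-limit errors are often transient, so one common pattern is to retry them with exponential backoff. The sketch below is a generic helper and not part of ai.libx.js: `withRetry` and `FakeRateLimitError` are hypothetical names, and the stand-in error class is there only so the example runs without the library.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff
// while `shouldRetry` returns true for the thrown error.
async function withRetry<T>(
    operation: () => Promise<T>,
    shouldRetry: (error: unknown) => boolean,
    maxAttempts = 3,
    baseDelayMs = 500,
): Promise<T> {
    for (let attempt = 1; ; attempt++) {
        try {
            return await operation();
        } catch (error) {
            // Give up on non-retryable errors or once attempts are exhausted.
            if (attempt >= maxAttempts || !shouldRetry(error)) throw error;
            const delay = baseDelayMs * 2 ** (attempt - 1); // 500, 1000, 2000, ...
            await new Promise<void>((resolve) => setTimeout(resolve, delay));
        }
    }
}

// Stand-in for a rate-limit error so the sketch runs without the library.
class FakeRateLimitError extends Error {}

let calls = 0;
async function flaky(): Promise<string> {
    calls++;
    if (calls < 3) throw new FakeRateLimitError('slow down');
    return 'ok';
}

withRetry(flaky, (e) => e instanceof FakeRateLimitError, 3, 10).then((value) => {
    console.log(value, calls); // "ok" 3
});
```

With the library, the predicate would presumably be `(e) => e instanceof RateLimitError`, wrapping a call such as `() => ai.chat(...)`.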

Model Utilities

Model Registry
import { supportedModels, getModelInfo, listModels, isModelSupported, getProviderFromModel } from 'ai.libx.js';

// Get model info
const info = getModelInfo('openai/gpt-4o');
console.log(info?.displayName); // "GPT-4o"

// List all models for a provider
const openaiModels = listModels('openai');

// Check if model is supported
if (isModelSupported('anthropic/claude-3-5-sonnet-latest')) {
    // ...
}

// Extract provider from model string
const provider = getProviderFromModel('openai/gpt-4o'); // "openai"
Model Resolution (Fuzzy Matching)

Use short aliases or partial names instead of full model identifiers:

import { resolveModel } from 'ai.libx.js';

// Common aliases
resolveModel('claude');    // → 'anthropic/claude-haiku-4-5'
resolveModel('sonnet');    // → 'anthropic/claude-sonnet-4-5'
resolveModel('fable');     // → 'anthropic/claude-fable-5'
resolveModel('opus');      // → 'anthropic/claude-opus-4-8'
resolveModel('gpt4o');     // → 'openai/gpt-4o'
resolveModel('gemini');    // → 'google/models/gemini-2.5-flash'
resolveModel('llama4');    // → 'groq/meta-llama/llama-4-scout-17b-16e-instruct'
resolveModel('deepseek');  // → 'deepseek/deepseek-v4-flash' (fuzzy match; use full ID to pin)
resolveModel('grok3');     // → 'xai/grok-3-beta'

// Exact matches pass through unchanged
resolveModel('openai/gpt-4o'); // → 'openai/gpt-4o'

// Non-existent models return unchanged
resolveModel('invalid');   // → 'invalid'

// Use directly in chat
const ai = new AIClient({ apiKeys: {...} });
await ai.chat({
    model: 'claude',  // Automatically resolved to full model name
    messages: [{ role: 'user', content: 'Hello!' }]
});

Features:

  • Case-insensitive matching
  • Normalized matching (e.g., gpt4o → gpt-4o)
  • Partial name matching
  • Display name matching
  • Skips disabled models automatically
  • Returns original input if no match found (fail-safe)
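The matching rules listed above can be approximated in a few lines. This is an illustrative sketch only, not the library's `resolveModel` implementation: `normalize` and `resolveAlias` are hypothetical, the list of known IDs is supplied by the caller, and display-name matching and disabled-model skipping are left out.

```typescript
// Lowercase and drop everything except letters and digits,
// so "GPT-4o", "gpt4o" and "gpt_4o" all normalize to "gpt4o".
function normalize(name: string): string {
    return name.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// Hypothetical resolver over a caller-supplied list of full model IDs.
// Tries an exact match, then a normalized match on the part after the
// provider prefix, then a normalized partial match; otherwise returns the input.
function resolveAlias(input: string, knownIds: string[]): string {
    if (knownIds.includes(input)) return input;
    const wanted = normalize(input);
    if (wanted === '') return input;
    const bareName = (id: string) => normalize(id.slice(id.indexOf('/') + 1));
    const exact = knownIds.find((id) => bareName(id) === wanted);
    if (exact) return exact;
    const partial = knownIds.find((id) => bareName(id).includes(wanted));
    return partial ?? input;
}

const ids = ['openai/gpt-4o', 'openai/gpt-4o-mini', 'xai/grok-3-beta'];
console.log(resolveAlias('GPT4o', ids));         // "openai/gpt-4o"
console.log(resolveAlias('grok3', ids));         // "xai/grok-3-beta"
console.log(resolveAlias('openai/gpt-4o', ids)); // "openai/gpt-4o"
console.log(resolveAlias('invalid', ids));       // "invalid"
```

Returning the input unchanged on a miss mirrors the fail-safe behaviour described in the last bullet: the provider then reports the unknown model instead of the resolver guessing.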
Model Normalization
import { normalizeModelName, isReasoningModel, supportsSystemMessages, getReasoningModelAdjustments, requiresMaxCompletionTokens } from 'ai.libx.js';

// Resolve model aliases
normalizeModelName('gpt-5'); // → 'chatgpt-4o-latest'
normalizeModelName('claude-4'); // → 'claude-sonnet-4-0'
normalizeModelName('gemini'); // → 'models/gemini-2.0-flash'

// Check for reasoning models
isReasoningModel('openai/o1-preview'); // true
isReasoningModel('deepseek/deepseek-reasoner'); // true

// Check system message support
supportsSystemMessages('openai/o1-preview'); // false (o1 does not accept system messages)
supportsSystemMessages('openai/gpt-4o'); // true

// Check if model requires max_completion_tokens
requiresMaxCompletionTokens('openai/gpt-5-nano'); // true (GPT-5 models)
requiresMaxCompletionTokens('openai/o1-preview'); // true (o1/o3 models)
requiresMaxCompletionTokens('openai/gpt-4o'); // false (standard models)

// Get required parameter adjustments
const adjustments = getReasoningModelAdjustments('openai/o3-mini');
// { temperature: 1, topP: 1, useMaxCompletionTokens: true }
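One way such an adjustments object could be applied to an outgoing request is sketched below. How the library applies it internally is not documented here, so `applyAdjustments` and the `RequestBody` shape are assumptions; the snake_case field names follow the OpenAI Chat Completions request format.

```typescript
// Shape of the adjustments object shown in the example above.
interface ReasoningAdjustments {
    temperature?: number;
    topP?: number;
    useMaxCompletionTokens?: boolean;
}

// Hypothetical request body using OpenAI-style snake_case field names.
interface RequestBody {
    temperature?: number;
    top_p?: number;
    max_tokens?: number;
    max_completion_tokens?: number;
}

// Returns a new body with forced sampling values and, when requested,
// the token limit moved from max_tokens to max_completion_tokens.
function applyAdjustments(body: RequestBody, adj: ReasoningAdjustments): RequestBody {
    const out: RequestBody = { ...body };
    if (adj.temperature !== undefined) out.temperature = adj.temperature;
    if (adj.topP !== undefined) out.top_p = adj.topP;
    if (adj.useMaxCompletionTokens && out.max_tokens !== undefined) {
        out.max_completion_tokens = out.max_tokens;
        delete out.max_tokens;
    }
    return out;
}

const adjusted = applyAdjustments(
    { temperature: 0.2, max_tokens: 1024 },
    { temperature: 1, topP: 1, useMaxCompletionTokens: true },
);
console.log(adjusted);
// { temperature: 1, top_p: 1, max_completion_tokens: 1024 }
```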
Request Logging
const ai = new AIClient({
    apiKeys: { openai: 'sk-...' },
    enableLogging: true
});

// Make requests...
await ai.chat({ model: 'openai/gpt-4o', messages: [...] });

// Get statistics
const stats = ai.getStats();
console.log(stats);
// {
//   totalRequests: 10,
//   successfulRequests: 9,
//   failedRequests: 1,
//   averageLatency: 1234,
//   totalTokensUsed: 12500,
//   providerBreakdown: {
//     openai: { requests: 6, avgLatency: 1100, tokens: 8000 }
//   }
// }

Examples

See example.ts for complete usage examples.

Supported Models

Generated on 2026-06-27 · Pricing normalized to per 1M tokens

Blend /1M = comparable $/1M: 75% × input + 25% × output when both are billed; if only input or only output is used (e.g. embeddings, some TTS), that side is shown alone.

Rows are sorted most expensive → cheapest by Blend /1M. Models without blend pricing list last.
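The blend rule above can be written as a short function. This is a sketch for checking rows by hand: `blendPerMillion` is a hypothetical name, and treating a $0 price as "not billed" is an assumption inferred from the input-only rows in the table.

```typescript
// Blend price per 1M tokens: 75% input + 25% output when both sides are billed;
// when only one side is billed (the other is 0), that side alone.
// Returns null when neither side has a price.
function blendPerMillion(inputPerM: number, outputPerM: number): number | null {
    const hasInput = inputPerM > 0;
    const hasOutput = outputPerM > 0;
    if (hasInput && hasOutput) return 0.75 * inputPerM + 0.25 * outputPerM;
    if (hasInput) return inputPerM;
    if (hasOutput) return outputPerM;
    return null;
}

console.log(blendPerMillion(150, 600)); // 262.5 (openai/o1-pro)
console.log(blendPerMillion(2.5, 10));  // 4.375 (openai/gpt-4o)
console.log(blendPerMillion(30, 0));    // 30 (input-only, e.g. cloudflare/@cf/deepgram/aura-2-en)
```

The 75/25 weighting reflects the stated convention that input tokens dominate a typical request, so a model with cheap input and expensive output can still rank as inexpensive.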

Model ID Display Name Context Max Output Input /1M Output /1M Cache /1M Blend /1M Capabilities Release Deprecated
openai/o1-pro o1 Pro 200K 100K $150.0000 $600.0000 - $262.5000 reasoning, no-system 2025-03-01 -
openai/gpt-4-32k GPT-4 32k 33K 4K $60.0000 $120.0000 - $75.0000 - 2023-06-27 2024-01-12
openai/gpt-4-32k-0613 GPT-4 32k 0613 33K 4K $60.0000 $120.0000 - $75.0000 - 2023-06-13 2024-01-12
openai/gpt-5.4-pro GPT-5.4 Pro 1.1M 128K $30.0000 $180.0000 - $67.5000 reasoning, vision 2026-03-05 -
openai/gpt-5.4-pro-2026-03-05 GPT-5.4 Pro (2026-03-05) 1.1M 128K $30.0000 $180.0000 - $67.5000 reasoning, vision 2026-03-05 -
openai/gpt-5.5-pro GPT-5.5 Pro 1.1M 128K $30.0000 $180.0000 - $67.5000 reasoning, vision 2026-04-23 -
openai/gpt-5.5-pro-2026-04-23 GPT-5.5 Pro (2026-04-23) 1.1M 128K $30.0000 $180.0000 - $67.5000 reasoning, vision 2026-04-23 -
openai/gpt-5.2-pro GPT-5.2 Pro 400K 16K $21.0000 $168.0000 - $57.7500 reasoning 2026-01-01 -
openai/gpt-5-pro GPT-5 Pro 400K 16K $15.0000 $120.0000 - $41.2500 reasoning 2025-06-01 -
openai/gpt-4 GPT-4 8K 4K $30.0000 $60.0000 - $37.5000 - 2023-03-14 -
openai/gpt-4-0314 GPT-4 0314 8K 4K $30.0000 $60.0000 - $37.5000 - 2023-03-14 2023-06-27
openai/gpt-4-0613 GPT-4 0613 8K 4K $30.0000 $60.0000 - $37.5000 - 2023-06-13 -
anthropic/claude-opus-4-0 Claude Opus 4 200K 32K $15.0000 $75.0000 - $30.0000 reasoning 2025-05-14 2026-06-15
anthropic/claude-opus-4-1 Claude Opus 4.1 200K 32K $15.0000 $75.0000 - $30.0000 reasoning 2025-08-05 2026-08-05
anthropic/claude-opus-4-1-20250805 Claude Opus 4.1 200K 32K $15.0000 $75.0000 - $30.0000 reasoning 2025-08-05 2026-08-05
anthropic/claude-opus-4-20250514 Claude Opus 4 200K 32K $15.0000 $75.0000 - $30.0000 reasoning 2025-05-14 2026-06-15
cloudflare/@cf/deepgram/aura-2-en Aura 2 EN - - $30.0000 $0.0000 - $30.0000 audio, no-chat - -
cloudflare/@cf/deepgram/aura-2-es Aura 2 ES - - $30.0000 $0.0000 - $30.0000 audio, no-chat - -
openrouter/anthropic/claude-opus-4-1 Claude Opus 4.1 (via OpenRouter) 200K 8K $15.0000 $75.0000 - $30.0000 - 2024-12-19 -
openai/o1 o1 200K 100K $15.0000 $60.0000 - $26.2500 reasoning, no-system 2024-12-20 -
openai/o1-preview o1 Preview 128K 4K $15.0000 $60.0000 - $26.2500 reasoning, no-system 2024-09-12 -
anthropic/claude-fable-5 Claude Fable 5 1.0M 128K $10.0000 $50.0000 - $20.0000 reasoning, vision 2026-06-09 -
anthropic/claude-mythos-5 Claude Mythos 5 1.0M 128K $10.0000 $50.0000 - $20.0000 reasoning, vision 2026-06-09 -
openrouter/anthropic/claude-fable-5 Claude Fable 5 (via OpenRouter) 1.0M 128K $10.0000 $50.0000 - $20.0000 reasoning, vision 2026-06-09 -
xai/grok-imagine-image-pro Grok Imagine Image Pro - - $20.0000 $20.0000 - $20.0000 vision, image-gen - 2026-05-15
openai/o3-deep-research o3 Deep Research 200K 100K $10.0000 $40.0000 - $17.5000 reasoning, no-system 2025-06-01 -
cloudflare/@cf/deepgram/aura-1 Aura 1 - - $15.0000 $0.0000 - $15.0000 audio, no-chat - -
openai/gpt-4-0125-preview GPT-4 25-01 (gpt-4-0125-preview) 128K 4K $10.0000 $30.0000 - $15.0000 - 2024-01-25 -
openai/gpt-4-1106-preview GPT-4 06-11 (gpt-4-1106-preview) 128K 4K $10.0000 $30.0000 - $15.0000 vision 2023-11-06 -
openai/gpt-4-turbo GPT-4 Turbo with Vision 128K 4K $10.0000 $30.0000 - $15.0000 vision 2023-11-06 -
openai/gpt-4-turbo-2024-04-09 GPT-4 Turbo with Vision (2024-04-09) 128K 4K $10.0000 $30.0000 - $15.0000 vision 2024-04-09 -
openai/gpt-4-turbo-preview GPT-4 Turbo (gpt-4-turbo-preview) 128K 4K $10.0000 $30.0000 - $15.0000 vision 2023-11-06 -
openai/gpt-4-vision-preview GPT-4 Vision 128K 4K $10.0000 $30.0000 - $15.0000 vision 2023-11-06 -
cloudflare/@cf/black-forest-labs/flux-2-klein-9b FLUX.2 Klein 9B - - $15.0000 $2.0000 - $11.7500 image-gen, no-chat - -
openai/gpt-5.5 GPT-5.5 1.1M 128K $5.0000 $30.0000 $0.5000 $11.2500 reasoning, vision 2026-04-23 -
openai/gpt-5.5-2026-04-23 GPT-5.5 (2026-04-23) 1.1M 128K $5.0000 $30.0000 $0.5000 $11.2500 reasoning, vision 2026-04-23 -
groq/canopylabs/orpheus-arabic-saudi Orpheus Arabic Saudi 8K 8K $0.0010 $40.0000 - $10.0008 - 2025-01-01 -
anthropic/claude-opus-4-5 Claude Opus 4.5 200K 64K $5.0000 $25.0000 - $10.0000 reasoning 2025-11-01 -
anthropic/claude-opus-4-5-20251101 Claude Opus 4.5 200K 64K $5.0000 $25.0000 - $10.0000 reasoning 2025-11-01 -
anthropic/claude-opus-4-6 Claude Opus 4.6 1.0M 128K $5.0000 $25.0000 - $10.0000 reasoning 2026-02-17 -
anthropic/claude-opus-4-7 Claude Opus 4.7 1.0M 128K $5.0000 $25.0000 - $10.0000 reasoning, vision 2026-04-01 -
anthropic/claude-opus-4-8 Claude Opus 4.8 1.0M 128K $5.0000 $25.0000 - $10.0000 reasoning, vision 2026-05-28 -
openai/gpt-5-image GPT-5 Image 400K 16K $10.0000 $10.0000 - $10.0000 vision, image-gen 2025-08-01 -
openrouter/anthropic/claude-opus-4-7 Claude Opus 4.7 (via OpenRouter) 1.0M 128K $5.0000 $25.0000 - $10.0000 vision 2026-04-01 -
openrouter/anthropic/claude-opus-4.8 Claude Opus 4.8 (via OpenRouter) 1.0M 128K $5.0000 $25.0000 - $10.0000 reasoning, vision 2026-05-28 -
xai/grok-2-image-1212 Grok 2 Image (12-12) - - $10.0000 $10.0000 - $10.0000 image-gen 2024-12-12 -
xai/grok-imagine-image Grok Imagine Image - - $10.0000 $10.0000 - $10.0000 vision, image-gen - -
cloudflare/@cf/deepgram/flux Deepgram Flux (ASR) - - $7.7000 $0.0000 - $7.7000 audio, no-chat - -
anthropic/claude-3-7-sonnet-20250219 Claude Sonnet 3.7 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-02-19 -
anthropic/claude-3-7-sonnet-latest Claude Sonnet 3.7 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-02-19 -
anthropic/claude-sonnet-4-0 Claude Sonnet 4 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-05-14 2026-06-15
anthropic/claude-sonnet-4-20250514 Claude Sonnet 4 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-05-14 2026-06-15
anthropic/claude-sonnet-4-5 Claude Sonnet 4.5 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-09-29 -
anthropic/claude-sonnet-4-5-20250929 Claude Sonnet 4.5 200K 64K $3.0000 $15.0000 - $6.0000 reasoning 2025-09-29 -
anthropic/claude-sonnet-4-6 Claude Sonnet 4.6 1.0M 64K $3.0000 $15.0000 - $6.0000 reasoning 2026-02-17 -
cursor/composer-2.5-fast Composer 2.5 Fast 262K - $3.0000 $15.0000 - $6.0000 reasoning 2026-05-18 -
xai/grok-3 Grok 3 131K 8K $3.0000 $15.0000 - $6.0000 - - 2026-05-15
xai/grok-4-0709 Grok 4 (07-09) 256K 16K $3.0000 $15.0000 - $6.0000 reasoning, vision 2025-07-09 2026-05-15
openai/gpt-5.4 GPT-5.4 1.1M 128K $2.5000 $15.0000 - $5.6250 reasoning, vision 2026-03-05 -
openai/gpt-5.4-2026-03-05 GPT-5.4 (2026-03-05) 1.1M 128K $2.5000 $15.0000 - $5.6250 reasoning, vision 2026-03-05 -
groq/canopylabs/orpheus-v1-english Orpheus v1 English 8K 8K $0.0010 $22.0000 - $5.5008 - 2025-01-01 -
cloudflare/@cf/leonardo/lucid-origin Lucid Origin - - $6.9960 $0.1320 - $5.2800 image-gen, no-chat - -
cloudflare/@cf/deepgram/nova-3 Nova 3 - - $5.2000 $0.0000 - $5.2000 audio, no-chat - -
openai/gpt-5.2 GPT-5.2 400K 128K $1.7500 $14.0000 - $4.8125 - 2025-12-01 2026-06-05
openai/gpt-5.3-chat-latest GPT-5.3 Instant 128K 16K $1.7500 $14.0000 - $4.8125 reasoning, vision 2026-03-03 -
openai/gpt-5.3-codex GPT-5.3 Codex 400K 128K $1.7500 $14.0000 - $4.8125 reasoning, vision 2026-03-03 -
google/models/gemini-3-pro-image-preview Gemini 3 Pro Image Preview 66K 33K $2.0000 $12.0000 - $4.5000 vision, image-gen 2026-02-01 -
google/models/gemini-3-pro-preview Gemini 3 Pro Preview 1.0M 66K $2.0000 $12.0000 - $4.5000 reasoning, vision, audio, video 2026-01-01 2026-03-09
google/models/gemini-3.1-pro-preview Gemini 3.1 Pro Preview 1.0M 66K $2.0000 $12.0000 - $4.5000 reasoning, vision, audio, video 2026-02-09 -
cloudflare/@cf/leonardo/phoenix-1.0 Phoenix 1.0 - - $5.8300 $0.1100 - $4.4000 image-gen, no-chat - -
openai/chatgpt-4o-latest ChatGPT GPT-4o 128K 4K $2.5000 $10.0000 - $4.3750 vision 2024-05-13 -
openai/gpt-4o GPT-4o 128K 4K $2.5000 $10.0000 - $4.3750 vision 2024-05-13 -
openai/gpt-4o-2024-05-13 GPT-4o (2024-05-13) 128K 4K $2.5000 $10.0000 - $4.3750 vision 2024-05-13 -
openai/gpt-4o-audio-preview GPT-4o Audio Preview 128K 4K $2.5000 $10.0000 - $4.3750 audio 2024-12-17 -
openai/gpt-4o-realtime-preview GPT-4o Realtime Preview 128K 4K $2.5000 $10.0000 - $4.3750 audio, no-chat 2024-12-17 -
openrouter/openai/gpt-4o GPT-4o (via OpenRouter) 128K 4K $2.5000 $10.0000 - $4.3750 vision 2024-05-13 -
xai/grok-2-vision-1212 Grok 2 Vision (12-12) 33K 8K $2.0000 $10.0000 - $4.0000 vision 2024-12-12 -
ai21/jamba-1.5-large Jamba 1.5 Large 256K 4K $2.0000 $8.0000 - $3.5000 - 2024-08-19 2025-05-06
ai21/jamba-large Jamba Large 256K 4K $2.0000 $8.0000 - $3.5000 - 2025-07-01 -
ai21/jamba-large-1.7-2025-07 Jamba Large 1.7 (2025-07) 256K 4K $2.0000 $8.0000 - $3.5000 - 2025-07-01 -
openai/gpt-4.1 GPT-4.1 1.0M 33K $2.0000 $8.0000 - $3.5000 - 2025-04-01 -
openai/gpt-4.5-preview GPT-4.5 Preview 128K 16K $2.0000 $8.0000 - $3.5000 - 2025-02-01 -
openai/o3 o3 200K 100K $2.0000 $8.0000 - $3.5000 reasoning, no-system 2025-04-01 -
google/models/gemini-2.5-pro Gemini 2.5 Pro 1.0M 66K $1.2500 $10.0000 - $3.4375 reasoning, vision, audio, video - -
openai/gpt-5 GPT-5 400K 16K $1.2500 $10.0000 - $3.4375 - 2025-06-01 -
openai/gpt-5-chat-latest GPT-5 Chat 400K 16K $1.2500 $10.0000 - $3.4375 - 2025-06-01 -
openai/gpt-5.1 GPT-5.1 400K 16K $1.2500 $10.0000 - $3.4375 - 2025-09-01 -
openai/gpt-3.5-turbo-16k GPT-3.5 Turbo 16k 16K 4K $3.0000 $4.0000 - $3.2500 - 2023-06-13 2024-06-13
openai/gpt-3.5-turbo-16k-0613 GPT-3.5 Turbo 16k 0613 16K 4K $3.0000 $4.0000 - $3.2500 - 2023-06-13 2024-06-13
cursor/composer-2-fast Composer 2 Fast 262K - $1.5000 $7.5000 - $3.0000 reasoning 2026-03-19 -
mistral/mistral-large-3 Mistral Large 3 128K 16K $2.0000 $6.0000 - $3.0000 vision 2025-01-22 -
mistral/mistral-medium-3.5 Mistral Medium 3.5 128K 16K $2.0000 $6.0000 - $3.0000 reasoning, vision 2026-04-01 -
openai/gpt-5-image-mini GPT-5 Image Mini 400K 16K $2.5000 $2.0000 - $2.3750 vision, image-gen 2025-08-01 -
cloudflare/@cf/zai-org/glm-5.2 GLM-5.2 262K - $1.4000 $4.4000 $0.2600 $2.1500 reasoning - -
cloudflare/@cf/meta/llama-2-7b-chat-fp16 Llama 2 7B Chat FP16 4K - $0.5600 $6.6700 - $2.0875 - - 2026-05-30
cloudflare/@cf/meta-llama/llama-2-7b-chat-hf-lora Llama 2 7B Chat HF LoRA - - $0.5560 $6.6670 - $2.0837 - - -
anthropic/claude-haiku-4-5 Claude Haiku 4.5 200K 64K $1.0000 $5.0000 - $2.0000 reasoning, vision 2025-10-01 -
anthropic/claude-haiku-4-5-20251001 Claude Haiku 4.5 200K 64K $1.0000 $5.0000 - $2.0000 reasoning, vision 2025-10-01 -
openai/text-davinci-002 text-davinci-002 4K 4K $2.0000 $2.0000 - $2.0000 - 2022-06-15 2024-01-04
openai/text-davinci-003 text-davinci-003 4K 4K $2.0000 $2.0000 - $2.0000 - 2022-11-30 2024-01-04
openai/o1-mini o1 Mini 128K 4K $1.1000 $4.4000 - $1.9250 reasoning, no-system 2024-09-12 -
openai/o3-mini o3 Mini 200K 100K $1.1000 $4.4000 - $1.9250 reasoning, no-system 2025-01-31 -
openai/o4-mini o4 Mini 200K 100K $1.1000 $4.4000 - $1.9250 reasoning, no-system 2025-04-01 -
cloudflare/@cf/moonshotai/kimi-k2.6 Kimi K2.6 262K - $0.9500 $4.0000 $0.1600 $1.7125 reasoning, vision - -
cloudflare/@cf/moonshotai/kimi-k2.7-code Kimi K2.7 Code 262K - $0.9500 $4.0000 - $1.7125 reasoning, vision - -
openai/gpt-5.4-mini GPT-5.4 mini 400K 128K $0.7500 $4.5000 $0.0750 $1.6875 reasoning, vision 2026-03-17 -
openai/gpt-5.4-mini-2026-03-17 GPT-5.4 mini (2026-03-17) 400K 128K $0.7500 $4.5000 $0.0750 $1.6875 reasoning, vision 2026-03-17 -
openai/gpt-3.5-turbo-0613 GPT-3.5 Turbo 0613 4K 4K $1.5000 $2.0000 - $1.6250 - 2023-06-13 2024-06-13
openai/gpt-3.5-turbo-instruct GPT-3.5 Turbo Instruct 4K 4K $1.5000 $2.0000 - $1.6250 - 2023-03-01 -
cloudflare/@cf/deepseek-ai/deepseek-r1-distill-qwen-32b DeepSeek R1 Distill Qwen 32B - - $0.5000 $4.8800 - $1.5950 reasoning - -
openrouter/x-ai/grok-4.3 Grok 4.3 (via OpenRouter) 1.0M 16K $1.2500 $2.5000 - $1.5625 reasoning, vision 2026-04-01 -
xai/grok-4.20-0309-non-reasoning Grok 4.20 1.0M 16K $1.2500 $2.5000 - $1.5625 vision 2026-03-09 -
xai/grok-4.20-0309-reasoning Grok 4.20 Reasoning 1.0M 16K $1.2500 $2.5000 - $1.5625 reasoning, vision 2026-03-09 -
xai/grok-4.20-multi-agent-0309 Grok 4.20 Multi-Agent 1.0M 16K $1.2500 $2.5000 - $1.5625 reasoning, vision 2026-03-09 -
xai/grok-4.3 Grok 4.3 1.0M 16K $1.2500 $2.5000 - $1.5625 reasoning, vision 2026-04-01 -
cohere/command-a-plus-05-2026 Command A+ (05-2026) 128K 8K $1.0000 $3.0000 - $1.5000 reasoning, vision 2026-05-01 -
cohere/command-a-vision Command A Vision 8K 2K $1.0000 $3.0000 - $1.5000 vision 2025-01-15 -
cloudflare/@hf/thebloke/llama-2-13b-chat-awq Llama 2 13B Chat AWQ - - $0.4000 $4.0000 - $1.3000 - - 2025-10-01
openai/gpt-3.5-turbo-1106 GPT-3.5 Turbo 1106 16K 4K $1.0000 $2.0000 - $1.2500 - 2023-11-06 -
openrouter/x-ai/grok-build-0.1 Grok Build 0.1 (via OpenRouter) 256K 16K $1.0000 $2.0000 - $1.2500 reasoning, vision 2026-05-14 -
xai/grok-build-0.1 Grok Build 0.1 256K 16K $1.0000 $2.0000 $0.2000 $1.2500 reasoning, vision 2026-05-14 -
xai/grok-code-fast-1 Grok Code Fast 1 256K 16K $1.0000 $2.0000 $0.2000 $1.2500 reasoning, vision - -
cloudflare/@cf/moonshotai/kimi-k2.5 Kimi K2.5 262K - $0.6000 $3.0000 $0.1000 $1.2000 reasoning, vision - -
groq/qwen/qwen3.6-27b Qwen 3.6 27B 131K 33K $0.6000 $3.0000 - $1.2000 - 2025-06-01 -
google/models/gemini-3-flash Gemini 3 Flash 1.0M 66K $0.5000 $3.0000 - $1.1250 vision, audio, video 2026-01-01 -
google/models/gemini-3-flash-preview Gemini 3 Flash Preview 1.0M 66K $0.5000 $3.0000 - $1.1250 vision, audio, video 2026-01-01 -
google/models/gemini-3.5-flash Gemini 3.5 Flash 1.0M 66K $0.5000 $3.0000 - $1.1250 reasoning, vision, audio, video 2026-06-01 -
openrouter/google/gemini-2-flash-exp Gemini 2 Flash Experimental (via OpenRouter) 1.0M 8K $0.5000 $3.0000 - $1.1250 vision 2024-12-11 -
cloudflare/@cf/meta/llama-2-7b-chat-int8 Llama 2 7B Chat Int8 4K - $0.2800 $3.3000 - $1.0350 - - 2026-05-30
cursor/composer-2 Composer 2 262K - $0.5000 $2.5000 $0.2000 $1.0000 reasoning 2026-03-19 -
cursor/composer-2.5 Composer 2.5 262K - $0.5000 $2.5000 $0.2000 $1.0000 reasoning 2026-05-18 -
moonshot/kimi-k2.7-code Kimi K2.7 Code 262K 8K $0.5000 $2.5000 - $1.0000 - 2026-06-01 -
deepseek/deepseek-reasoner DeepSeek Reasoner 128K 8K $0.5500 $2.2000 - $0.9625 reasoning 2025-01-20 -
moonshot/kimi-k2.5 Kimi K2.5 262K 8K $0.4500 $2.2500 - $0.9000 reasoning, vision 2026-01-01 -
moonshot/kimi-k2.6 Kimi K2.6 262K 8K $0.4500 $2.2500 - $0.9000 reasoning, vision 2026-04-21 -
openrouter/moonshotai/kimi-k2.5 Kimi K2.5 (via OpenRouter) 262K 8K $0.4500 $2.2500 - $0.9000 reasoning, vision 2025-06-01 -
openrouter/moonshotai/kimi-k2.6 Kimi K2.6 (via OpenRouter) 262K 8K $0.4500 $2.2500 - $0.9000 reasoning, vision 2026-04-21 -
google/models/gemini-2.5-flash Gemini 2.5 Flash 1.0M 66K $0.3000 $2.5000 - $0.8500 reasoning, vision, audio, video - -
google/models/gemini-2.5-flash-image Gemini 2.5 Flash Image 66K 33K $0.3000 $2.5000 - $0.8500 vision, image-gen - -
cloudflare/@cf/meta/llama-3.3-70b-instruct-fp8-fast Llama 3.3 70B Instruct FP8 Fast - - $0.2900 $2.2500 - $0.7800 - - -
cloudflare/@cf/nvidia/nemotron-3-120b-a12b Nemotron 3 120B - - $0.5000 $1.5000 - $0.7500 reasoning - -
cohere/command-a Command A 8K 2K $0.5000 $1.5000 - $0.7500 vision 2025-01-15 -
cohere/command-a-reasoning Command A Reasoning 8K 2K $0.5000 $1.5000 - $0.7500 reasoning 2025-01-15 -
openai/gpt-3.5-turbo GPT-3.5 Turbo 16K 4K $0.5000 $1.5000 - $0.7500 - 2023-03-01 -
openai/gpt-3.5-turbo-0125 GPT-3.5 Turbo 0125 16K 4K $0.5000 $1.5000 - $0.7500 - 2024-01-25 -
cloudflare/@cf/qwen/qwen2.5-coder-32b-instruct Qwen2.5 Coder 32B Instruct - - $0.6600 $1.0000 - $0.7450 - - -
cloudflare/@cf/qwen/qwq-32b QwQ 32B - - $0.6600 $1.0000 - $0.7450 reasoning - -
openai/gpt-4.1-mini GPT-4.1 Mini 1.0M 33K $0.4000 $1.6000 - $0.7000 - 2025-04-01 -
openai/gpt-5-mini GPT-5 Mini 400K 16K $0.2500 $2.0000 - $0.6875 - 2025-06-01 -
openai/gpt-5.1-codex-mini GPT-5.1 Codex Mini 400K 16K $0.2500 $2.0000 - $0.6875 - 2025-09-01 -
groq/llama-3.3-70b-versatile Llama 3.3 70B Versatile 131K 33K $0.5900 $0.7900 - $0.6400 - 2024-12-06 2026-08-16
openrouter/meta-llama/llama-3.1-70b Llama 3.1 70B (via OpenRouter) 131K 8K $0.5900 $0.7900 - $0.6400 - 2024-07-23 -
cohere/command-r-plus Command R+ 128K 8K $0.3000 $1.5000 - $0.6000 vision 2024-08-01 -
google/models/gemini-3.1-flash-image-preview Gemini 3.1 Flash Image Preview 131K 33K $0.2500 $1.5000 - $0.5625 vision, image-gen 2026-02-09 -
google/models/gemini-3.1-flash-lite-preview Gemini 3.1 Flash-Lite Preview 1.0M 66K $0.2500 $1.5000 - $0.5625 reasoning, vision, audio, video 2026-03-03 2026-05-25
deepseek/deepseek-v4-pro DeepSeek V4 Pro 1.0M 393K $0.4350 $0.8700 $0.0036 $0.5437 reasoning 2026-04-24 -
anthropic/claude-3-haiku-20240307 Claude Haiku 3 200K 4K $0.2500 $1.2500 - $0.5000 - 2024-03-07 -
cloudflare/@cf/openai/whisper Whisper - - $0.5000 $0.0000 - $0.5000 audio, no-chat - -
cloudflare/@cf/openai/whisper-large-v3-turbo Whisper Large v3 Turbo - - $0.5000 $0.0000 - $0.5000 audio, no-chat - -
deepseek/deepseek-chat DeepSeek Chat 128K 8K $0.2700 $1.1000 - $0.4775 - 2025-01-20 -
openrouter/deepseek/deepseek-chat DeepSeek Chat (via OpenRouter) 64K 8K $0.2700 $1.1000 - $0.4775 - 2025-01-20 -
openai/gpt-5.4-nano GPT-5.4 nano 400K 128K $0.2000 $1.2500 $0.0200 $0.4625 reasoning, vision 2026-03-17 -
openai/gpt-5.4-nano-2026-03-17 GPT-5.4 nano (2026-03-17) 400K 128K $0.2000 $1.2500 $0.0200 $0.4625 reasoning, vision 2026-03-17 -
cloudflare/@cf/openai/gpt-oss-120b GPT-OSS 120B - - $0.3500 $0.7500 - $0.4500 reasoning - -
cloudflare/@cf/meta/llama-3-8b-instruct Llama 3 8B Instruct 8K - $0.2820 $0.8270 - $0.4183 - - 2026-05-30
cloudflare/@cf/meta/llama-3.1-70b-instruct Llama 3.1 70B Instruct 24K - $0.2820 $0.8270 - $0.4183 - - 2026-05-30
cloudflare/@cf/meta/llama-3.1-8b-instruct Llama 3.1 8B Instruct - - $0.2820 $0.8270 - $0.4183 - - 2026-05-30
cloudflare/@hf/meta-llama/meta-llama-3-8b-instruct Meta Llama 3 8B Instruct (HF) - - $0.2820 $0.8270 - $0.4183 - - 2026-05-30
cloudflare/@cf/meta/llama-4-scout-17b-16e-instruct Llama 4 Scout 17B - - $0.2700 $0.8500 - $0.4150 vision - -
cloudflare/@cf/aisingapore/gemma-sea-lion-v4-27b-it SEA-LION v4 27B IT - - $0.3500 $0.5600 - $0.4025 - - -
cloudflare/@cf/google/gemma-3-12b-it Gemma 3 12B IT 128K - $0.3500 $0.5600 - $0.4025 - - 2026-05-30
cloudflare/@cf/mistralai/mistral-small-3.1-24b-instruct Mistral Small 3.1 24B Instruct 128K - $0.3500 $0.5600 - $0.4025 - - -
inception/mercury-2 Mercury 2 128K - $0.2500 $0.7500 $0.0250 $0.3750 reasoning - -
mistral/open-mistral-7b Open Mistral 7B 32K 8K $0.2500 $0.7500 - $0.3750 - 2024-01-01 -
cloudflare/@cf/meta/llama-guard-3-8b Llama Guard 3 8B - - $0.4800 $0.0300 - $0.3675 - - -
groq/qwen/qwen3-32b Qwen 3 32B 131K 41K $0.2900 $0.5900 - $0.3650 - 2024-12-11 2026-07-17
xai/grok-3-mini Grok 3 Mini 131K 8K $0.3000 $0.5000 - $0.3500 reasoning - -
cloudflare/@cf/ai4bharat/indictrans2-en-indic-1B IndicTrans2 EN→Indic 1B - - $0.3400 $0.3400 - $0.3400 no-chat - -
cloudflare/@cf/meta/m2m100-1.2b M2M100 1.2B - - $0.3400 $0.3400 - $0.3400 no-chat - -
cloudflare/@cf/pipecat-ai/smart-turn-v2 Smart Turn v2 - - $0.3380 $0.0000 - $0.3380 audio, no-chat - -
mistral/devstral-2 Devstral 2 128K 8K $0.2000 $0.6000 - $0.3000 - 2024-12-19 -
mistral/mistral-3-14b Mistral 3 14B 32K 8K $0.2000 $0.6000 - $0.3000 - 2025-01-01 -
mistral/mistral-small-4 Mistral Small 4 128K 16K $0.2000 $0.6000 - $0.3000 reasoning 2026-03-01 -
xai/grok-4-1-fast-non-reasoning Grok 4.1 Fast 2.0M 16K $0.2000 $0.5000 - $0.2750 vision - 2026-05-15
xai/grok-4-1-fast-reasoning Grok 4.1 Fast Reasoning 2.0M 16K $0.2000 $0.5000 - $0.2750 reasoning, vision - 2026-05-15
xai/grok-4-fast-non-reasoning Grok 4 Fast 2.0M 16K $0.2000 $0.5000 - $0.2750 vision - 2026-05-15
xai/grok-4-fast-reasoning Grok 4 Fast Reasoning 2.0M 16K $0.2000 $0.5000 - $0.2750 reasoning, vision - 2026-05-15
groq/compound Compound 128K 8K $0.1500 $0.6000 - $0.2625 - 2025-01-01 -
groq/openai/gpt-oss-120b GPT-OSS 120B 131K 66K $0.1500 $0.6000 - $0.2625 - 2025-01-01 -
openai/gpt-4o-mini GPT-4o Mini 128K 4K $0.1500 $0.6000 - $0.2625 vision 2024-07-18 -
openai/gpt-4o-mini-audio-preview GPT-4o Mini Audio Preview 128K 4K $0.1500 $0.6000 - $0.2625 audio 2024-12-17 -
openai/gpt-4o-mini-realtime-preview GPT-4o Mini Realtime Preview 128K 4K $0.1500 $0.6000 - $0.2625 audio, no-chat 2024-12-17 -
openrouter/auto Auto (Fallback) 8K 2K $0.1500 $0.6000 - $0.2625 - 2023-01-01 -
cloudflare/@cf/black-forest-labs/flux-2-dev FLUX.2 Dev - - $0.2100 $0.4100 - $0.2600 image-gen, no-chat - -
ai21/jamba-1.5-mini Jamba 1.5 Mini 256K 4K $0.2000 $0.4000 - $0.2500 - 2024-08-19 2025-05-06
ai21/jamba-mini Jamba Mini 256K 4K $0.2000 $0.4000 - $0.2500 - 2026-01-01 -
ai21/jamba-mini-2-2026-01 Jamba Mini 2 (2026-01) 256K 4K $0.2000 $0.4000 - $0.2500 - 2026-01-01 -
ai21/jamba2-mini Jamba2 Mini (legacy name) 256K 4K $0.2000 $0.4000 - $0.2500 - 2025-01-28 2026-02-01
cloudflare/@cf/openai/gpt-oss-20b GPT-OSS 20B - - $0.2000 $0.3000 - $0.2250 reasoning - -
mistral/mistral-3-3b Mistral 3 3B 32K 8K $0.1400 $0.4200 - $0.2100 - 2025-01-01 -
mistral/mistral-3-8b Mistral 3 8B 32K 8K $0.1400 $0.4200 - $0.2100 - 2025-01-01 -
cloudflare/@cf/meta/llama-3.2-11b-vision-instruct Llama 3.2 11B Vision Instruct - - $0.0490 $0.6800 - $0.2068 vision - -
cloudflare/@cf/baai/bge-large-en-v1.5 BGE Large EN v1.5 - - $0.2000 $0.0000 - $0.2000 no-chat - -
cloudflare/@cf/myshell-ai/melotts MeloTTS - - $0.2000 $0.0000 - $0.2000 audio, no-chat - -
cloudflare/@cf/openai/whisper-tiny-en Whisper Tiny EN - - $0.2000 $0.0000 - $0.2000 audio, no-chat - -
cloudflare/@cf/meta/llama-3.1-8b-instruct-fp8 Llama 3.1 8B Instruct FP8 - - $0.1500 $0.2900 - $0.1850 - - -
cloudflare/@cf/qwen/qwen1.5-14b-chat-awq Qwen1.5 14B Chat AWQ - - $0.1500 $0.2800 - $0.1825 - - 2025-10-01
google/models/gemini-2.0-flash Gemini 2.0 Flash 1.0M 8K $0.1000 $0.4000 - $0.1750 vision - 2026-06-01
google/models/gemini-2.5-flash-lite Gemini 2.5 Flash-Lite 1.0M 66K $0.1000 $0.4000 - $0.1750 vision, audio, video 2026-02-05 -
openai/gpt-4.1-nano GPT-4.1 Nano 1.0M 33K $0.1000 $0.4000 - $0.1750 - 2025-04-01 -
deepseek/deepseek-v4-flash DeepSeek V4 Flash 1.0M 393K $0.1400 $0.2800 $0.0028 $0.1750 reasoning 2026-04-24 -
groq/meta-llama/llama-4-scout-17b-16e-instruct Llama 4 Scout 17B 16E 131K 8K $0.1100 $0.3400 - $0.1675 - 2025-01-28 2026-07-17
openrouter/meta-llama/llama-3.2-11b-vision-instruct Llama 3.2 11B Vision Instruct (via OpenRouter) 128K 8K $0.1600 $0.1600 - $0.1600 vision 2024-09-25 -
cloudflare/@cf/meta/llama-3-8b-instruct-awq Llama 3 8B Instruct AWQ - - $0.1200 $0.2700 - $0.1575 - - 2026-05-30
cloudflare/@cf/meta/llama-3.1-8b-instruct-awq Llama 3.1 8B Instruct AWQ - - $0.1200 $0.2700 - $0.1575 - - 2026-05-30
cloudflare/@cf/google/gemma-4-26b-a4b-it Gemma 4 26B A4B IT 256K - $0.1000 $0.3000 - $0.1500 reasoning, vision - -
mistral/devstral-small-2 Devstral Small 2 32K 8K $0.1000 $0.3000 - $0.1500 - 2024-12-19 -
cloudflare/@cf/zai-org/glm-4.7-flash GLM-4.7 Flash 131K - $0.0600 $0.4000 - $0.1450 reasoning - -
openai/gpt-5-nano GPT-5 Nano 400K 16K $0.0500 $0.4000 - $0.1375 - 2025-06-01 -
google/models/gemini-2.0-flash-lite Gemini 2.0 Flash-Lite 1.0M 8K $0.0750 $0.3000 - $0.1312 vision - 2026-06-01
groq/compound-mini Compound Mini 128K 8K $0.0750 $0.3000 - $0.1312 - 2025-01-01 -
groq/openai/gpt-oss-20b GPT-OSS 20B 131K 66K $0.0750 $0.3000 - $0.1312 - 2025-01-01 -
groq/openai/gpt-oss-safeguard-20b GPT-OSS Safeguard 20B 131K 66K $0.0750 $0.3000 - $0.1312 - 2025-01-01 -
cloudflare/@cf/deepseek-ai/deepseek-math-7b-instruct DeepSeek Math 7B Instruct - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@cf/defog/sqlcoder-7b-2 SQLCoder 7B v2 - - $0.1100 $0.1900 - $0.1300 - - 2026-05-30
cloudflare/@cf/fblgit/una-cybertron-7b-v2-bf16 UNA Cybertron 7B v2 BF16 - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@cf/google/gemma-7b-it-lora Gemma 7B IT LoRA - - $0.1100 $0.1900 - $0.1300 - - -
cloudflare/@cf/llava-hf/llava-1.5-7b-hf LLaVA 1.5 7B - - $0.1100 $0.1900 - $0.1300 vision, no-chat - -
cloudflare/@cf/mistral/mistral-7b-instruct-v0.1 Mistral 7B Instruct v0.1 - - $0.1100 $0.1900 - $0.1300 - - 2026-05-30
cloudflare/@cf/mistral/mistral-7b-instruct-v0.2-lora Mistral 7B Instruct v0.2 LoRA - - $0.1100 $0.1900 - $0.1300 - - -
cloudflare/@cf/openchat/openchat-3.5-0106 OpenChat 3.5 - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@cf/thebloke/discolm-german-7b-v1-awq DiscoLM German 7B v1 AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@cf/tiiuae/falcon-7b-instruct Falcon 7B Instruct - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/google/gemma-7b-it Gemma 7B IT - - $0.1100 $0.1900 - $0.1300 - - 2026-05-30
cloudflare/@hf/mistral/mistral-7b-instruct-v0.2 Mistral 7B Instruct v0.2 3K - $0.1100 $0.1900 - $0.1300 - - 2026-05-30
cloudflare/@hf/nexusflow/starling-lm-7b-beta Starling LM 7B Beta - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/nousresearch/hermes-2-pro-mistral-7b Hermes 2 Pro Mistral 7B - - $0.1100 $0.1900 - $0.1300 - - 2026-05-30
cloudflare/@hf/thebloke/llamaguard-7b-awq LlamaGuard 7B AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/thebloke/mistral-7b-instruct-v0.1-awq Mistral 7B Instruct v0.1 AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/thebloke/neural-chat-7b-v3-1-awq Neural Chat 7B v3.1 AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/thebloke/openhermes-2.5-mistral-7b-awq OpenHermes 2.5 Mistral 7B AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@hf/thebloke/zephyr-7b-beta-awq Zephyr 7B Beta AWQ - - $0.1100 $0.1900 - $0.1300 - - 2025-10-01
cloudflare/@cf/meta/llama-3.1-8b-instruct-fast Llama 3.1 8B Instruct Fast - - $0.0450 $0.3840 - $0.1298 - - -
cloudflare/@cf/meta/llama-3.2-3b-instruct Llama 3.2 3B Instruct - - $0.0510 $0.3400 - $0.1232 - - -
cloudflare/@cf/qwen/qwen3-30b-a3b-fp8 Qwen3 30B A3B FP8 - - $0.0510 $0.3400 - $0.1232 reasoning - -
cloudflare/@cf/qwen/qwen1.5-7b-chat-awq Qwen1.5 7B Chat AWQ - - $0.1000 $0.1800 - $0.1200 - - 2025-10-01
cloudflare/@hf/thebloke/deepseek-coder-6.7b-base-awq DeepSeek Coder 6.7B Base AWQ - - $0.1000 $0.1800 - $0.1200 - - 2025-10-01
cloudflare/@hf/thebloke/deepseek-coder-6.7b-instruct-awq DeepSeek Coder 6.7B Instruct AWQ - - $0.1000 $0.1800 - $0.1200 - - 2025-10-01
cloudflare/@cf/black-forest-labs/flux-2-klein-4b FLUX.2 Klein 4B - - $0.0590 $0.2870 - $0.1160 image-gen, no-chat - -
groq/whisper-large-v3 Whisper Large v3 448 448 $0.1110 $0.1110 - $0.1110 - 2024-03-01 -
cloudflare/@cf/facebook/detr-resnet-50 DETR ResNet-50 - - $0.1000 $0.1000 - $0.1000 no-chat - -
google/models/gemma-3-27b-it Gemma 3 27B 131K 8K $0.0800 $0.1600 - $0.1000 vision - -
cloudflare/@cf/microsoft/phi-2 Phi-2 - - $0.0800 $0.1500 - $0.0975 - - 2026-05-30
cloudflare/@cf/facebook/bart-large-cnn BART Large CNN - - $0.0500 $0.1500 - $0.0750 no-chat - 2026-05-30
cloudflare/@cf/google/gemma-2b-it-lora Gemma 2B IT LoRA - - $0.0600 $0.1200 - $0.0750 - - -
cloudflare/@cf/meta/llama-3.2-1b-instruct Llama 3.2 1B Instruct - - $0.0270 $0.2000 - $0.0703 - - -
cloudflare/@cf/baai/bge-base-en-v1.5 BGE Base EN v1.5 - - $0.0670 $0.0000 - $0.0670 no-chat - -
cohere/command-r7b Command R 7B 128K 8K $0.0375 $0.1500 - $0.0656 vision 2024-04-04 -
cohere/command-r7b-12-2024 Command R 7B (12-2024) 128K 8K $0.0375 $0.1500 - $0.0656 vision 2024-12-12 -
cloudflare/@cf/unum/uform-gen2-qwen-500m UForm Gen2 Qwen 500M - - $0.0500 $0.1000 - $0.0625 vision, no-chat - 2026-05-30
groq/llama-3.1-8b-instant Llama 3.1 8B Instant 131K 131K $0.0500 $0.0800 - $0.0575 - 2024-07-23 2026-08-16
cloudflare/@cf/tinyllama/tinyllama-1.1b-chat-v1.0 TinyLlama 1.1B Chat - - $0.0300 $0.1300 - $0.0550 - - 2025-10-01
cloudflare/@cf/qwen/qwen1.5-1.8b-chat Qwen1.5 1.8B Chat - - $0.0250 $0.1200 - $0.0488 - - 2025-10-01
google/models/gemma-3-12b-it Gemma 3 12B 131K 8K $0.0350 $0.0700 - $0.0437 vision - -
cloudflare/@cf/ibm-granite/granite-4.0-h-micro Granite 4.0 H Micro - - $0.0170 $0.1100 - $0.0403 - - -
groq/meta-llama/llama-prompt-guard-2-86m Llama Prompt Guard 2 86M 512 512 $0.0400 $0.0400 - $0.0400 - 2025-01-28 -
groq/whisper-large-v3-turbo Whisper Large v3 Turbo 448 448 $0.0400 $0.0400 - $0.0400 - 2024-11-01 -
cloudflare/@cf/stabilityai/stable-diffusion-xl-base-1.0 Stable Diffusion XL Base 1.0 - - $0.0010 $0.1500 - $0.0382 image-gen, no-chat - -
cloudflare/@cf/qwen/qwen1.5-0.5b-chat Qwen1.5 0.5B Chat - - $0.0200 $0.0800 - $0.0350 - - 2025-10-01
cloudflare/@cf/bytedance/stable-diffusion-xl-lightning Stable Diffusion XL Lightning - - $0.0010 $0.1200 - $0.0308 image-gen, no-chat - -
groq/meta-llama/llama-prompt-guard-2-22m Llama Prompt Guard 2 22M 512 512 $0.0300 $0.0300 - $0.0300 - 2025-01-28 -
cloudflare/@cf/huggingface/distilbert-sst-2-int8 DistilBERT SST-2 Int8 - - $0.0260 $0.0000 - $0.0260 no-chat - -
cloudflare/@cf/lykon/dreamshaper-8-lcm DreamShaper 8 LCM - - $0.0010 $0.1000 - $0.0258 image-gen, no-chat - -
cloudflare/@cf/runwayml/stable-diffusion-v1-5-img2img Stable Diffusion v1.5 img2img - - $0.0010 $0.1000 - $0.0258 image-gen, no-chat - -
cloudflare/@cf/runwayml/stable-diffusion-v1-5-inpainting Stable Diffusion v1.5 Inpainting - - $0.0010 $0.1000 - $0.0258 image-gen, no-chat - -
cloudflare/@cf/baai/bge-small-en-v1.5 BGE Small EN v1.5 - - $0.0200 $0.0000 - $0.0200 no-chat - -
cloudflare/@cf/google/embeddinggemma-300m EmbeddingGemma 300M - - $0.0200 $0.0000 - $0.0200 no-chat - -
cloudflare/@cf/pfnet/plamo-embedding-1b PLaMo Embedding 1B - - $0.0190 $0.0000 - $0.0190 no-chat - -
google/models/gemma-3-4b-it Gemma 3 4B 131K 8K $0.0120 $0.0240 - $0.0150 vision - -
cloudflare/@cf/black-forest-labs/flux-1-schnell FLUX.1 Schnell - - $0.0010 $0.0528 - $0.0140 image-gen, no-chat - -
cloudflare/@cf/baai/bge-m3 BGE M3 - - $0.0120 $0.0000 - $0.0120 no-chat - -
cloudflare/@cf/qwen/qwen3-embedding-0.6b Qwen3 Embedding 0.6B - - $0.0120 $0.0000 - $0.0120 no-chat - -
google/models/gemma-3-1b-it Gemma 3 1B 33K 8K $0.0040 $0.0080 - $0.0050 - - -
cloudflare/@cf/baai/bge-reranker-base BGE Reranker Base - - $0.0031 $0.0000 - $0.0031 no-chat - -
cloudflare/@cf/microsoft/resnet-50 ResNet-50 - - $0.0025 $0.0000 - $0.0025 no-chat - -
google/models/gemma-3n-e2b-it Gemma 3n E2B 33K 8K $0.0020 $0.0040 - $0.0025 - - -
google/models/gemma-3n-e4b-it Gemma 3n E4B 33K 8K $0.0020 $0.0040 - $0.0025 - - -
claude-code/opus Claude Code (Opus) 200K - - - - - reasoning - -
claude-code/sonnet Claude Code (Sonnet) 200K - - - - - reasoning - -
claude-code/haiku Claude Code (Haiku) 200K - - - - - - - -
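
The fourth price figure in each row above is not explained in this section. From the rows themselves it appears to be a 3:1 weighted blend of the input and output prices, falling back to the input price when the output price is $0.0000. The sketch below reproduces that arithmetic. The weighting is inferred from the numbers and is an assumption, and `blendedPrice` and `formatPrice` are hypothetical helper names, not part of ai.libx.js.

```typescript
// Inferred from the table rows, not from the library source:
// blended = 0.75 * input + 0.25 * output, except that rows with a
// zero output price (embeddings, TTS, STT) show the input price unchanged.
function blendedPrice(inputPrice: number, outputPrice: number): number {
    if (outputPrice === 0) {
        return inputPrice;
    }
    return 0.75 * inputPrice + 0.25 * outputPrice;
}

// Format a price the way the table prints it: dollar sign, four decimals.
function formatPrice(value: number): string {
    return `$${value.toFixed(4)}`;
}

// Llama 4 Scout 17B on Cloudflare: input $0.2700, output $0.8500.
console.log(formatPrice(blendedPrice(0.27, 0.85))); // $0.4150

// Smart Turn v2: input $0.3380, output $0.0000, so the input price is kept.
console.log(formatPrice(blendedPrice(0.338, 0))); // $0.3380
```

Rows with a cached-input price (for example inception/mercury-2) show the same blended figure as they would without it, so the sketch ignores that column.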

License

MIT

Contributing

Contributions are welcome! The library is designed to be extensible: to add a new provider, implement the IProviderAdapter interface.
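
As a rough illustration of what a provider adapter can look like, here is a minimal, self-contained sketch. The actual members of IProviderAdapter are not shown in this README, so every type and name below (`SketchProviderAdapter`, `SketchChatRequest`, `EchoAdapter`, `splitModelId`, `dispatch`) is a hypothetical stand-in. Check the interface in the library source before writing a real adapter.

```typescript
// Hypothetical shapes for illustration only. The real IProviderAdapter
// in ai.libx.js may declare different members and signatures.
interface SketchChatMessage {
    role: 'system' | 'user' | 'assistant';
    content: string;
}

interface SketchChatRequest {
    model: string; // "provider/model", as in the model table
    messages: SketchChatMessage[];
    apiKey?: string;
}

interface SketchChatResponse {
    model: string;
    content: string;
}

interface SketchProviderAdapter {
    readonly name: string;
    chat(request: SketchChatRequest): Promise<SketchChatResponse>;
}

// A toy adapter that echoes the last user message instead of calling an API.
class EchoAdapter implements SketchProviderAdapter {
    readonly name = 'echo';

    async chat(request: SketchChatRequest): Promise<SketchChatResponse> {
        const lastUser = [...request.messages].reverse().find((m) => m.role === 'user');
        return { model: request.model, content: lastUser ? lastUser.content : '' };
    }
}

// Split "provider/model" at the first slash, so IDs such as
// "cloudflare/@cf/meta/llama-3.2-1b-instruct" keep their inner slashes.
function splitModelId(id: string): { provider: string; model: string } {
    const slash = id.indexOf('/');
    if (slash <= 0 || slash === id.length - 1) {
        throw new Error(`Expected "provider/model", got "${id}"`);
    }
    return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

// A tiny registry that routes a request to the adapter named by its prefix.
const adapters = new Map<string, SketchProviderAdapter>([['echo', new EchoAdapter()]]);

async function dispatch(request: SketchChatRequest): Promise<SketchChatResponse> {
    const { provider } = splitModelId(request.model);
    const adapter = adapters.get(provider);
    if (!adapter) {
        throw new Error(`No adapter registered for provider "${provider}"`);
    }
    return adapter.chat(request);
}
```

Calling `dispatch({ model: 'echo/demo', messages: [{ role: 'user', content: 'hi' }] })` resolves to a response whose `content` is `'hi'`. A real adapter would translate the request into the provider's HTTP API inside `chat`.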

Keywords