The core Node library used to integrate with CodSpeed runners
Vitest plugin for CodSpeed
tinybench compatibility layer for CodSpeed
NestJS Performance Benchmark Program
Detection addon benchmarking with reference images, accuracy metrics, and distributed execution
Benchmark suite for PWA
Retrieval latency ladder benchmarks + CI regression gates for @remnic/core
Evaluation framework for LLM knowledge inputs — prompts, RAG corpora, skills, agent workflows. Fix the model, vary the artifact. Built-in statistical rigor: bootstrap CI, Krippendorff α, length-debias, saturation curves.
GC-aware benchmarking/profiling, with an interactive viewer. For Node and browser.
Evaluation infrastructure for the swarmkit ecosystem — (harness x model x task x arm x seed) agent evals with ground-truth scoring, cost-matched Pareto reporting, and scalable parallel execution.
Validation library benchmark suite
A CLI tool for generating charts from verbose data from belt
CLI for running benchmark tests for n8n
MCP server exposing the SharpeBench luck-robust scoring kernel as agent-callable tools (deflated Sharpe, pass^k, process discipline, briefing audit, options Greeks).
In-memory history engine for Real-Router — non-browser environments and benchmarks
Benchmark framework for collecting high-precision code timing metrics.
OpenModelMap CLI — discover Chinese open-source AI models. Query OMS scores, benchmarks, hardware requirements, and deploy commands from your terminal.
AI model router for Node.js and TypeScript with benchmark, cost, and speed-based ranking
Hlido CLI — independent, evidence-backed scorecards for AI agents. Inline scorecard, search, compare, and tier rankings, fetched live from hlido.eu.
Runtime-agnostic and statistically-aware benchmarking framework for AssemblyScript
This plug-in collects function execution performance statistics. It injects timing code to measure execution time and identify slow functions.