NetToolsKit Memory
Knowledge ingestion, local memory, RAG, HyDE, retrieval, context packages, and vault materialization for NetToolsKit.
Introduction
nettoolskit-memory is the dedicated NetToolsKit repository for memory and
knowledge workflows that are currently being extracted from the historical
nettoolskit-copilot umbrella workspace.
This repository owns reusable memory capabilities and integrates with the rest of the NetToolsKit ecosystem through explicit contracts, manifests, APIs, and generated artifacts. It does not own agent instruction policy, machine control, code generation, DevOps deployment, assurance testing, or product orchestration.
Features
- Knowledge record ingestion and normalized document contracts.
- Chunking, deterministic embedding contracts, injectable local embedding provider boundary, provider-scoped vector retrieval, graph expansion, and bounded context package assembly.
- Memory-owned LocalFirst SQLite vector functions for deterministic semantic query readiness without a native extension dependency in the default build.
- Typed HyDE generator contract with deterministic default generation, stable generator id/version provenance, orchestrator-supplied hypothesis intake, cache-key isolation, and local-first query expansion.
- Request-bounded reusable Memory OS recall for validated or published semantic/procedural records inside context packages and query answers.
- Repository-local operational memory for sessions, events, artifacts, and replayable recall.
- Typed local operational memory APIs for sessions, runs, tool calls, artifacts, validations, route decisions, and selected memory-record pruning.
- Durable CAG/cache contracts, SQLite persistence, and automatic LocalFirst reuse for retrieval results, context packages, rerank results, and groundedness-linked verified answers across process boundaries.
- Optional feature-gated shared PostgreSQL profile with pgvector-backed vector search, PostgreSQL-backed graph expansion, persistent CAG/cache reuse, reusable Memory OS records, and operational-memory persistence for enterprise deployments.
- Vault materialization for grouped Markdown knowledge records and generated vault manifests.
- Explicit memory-engine facade for downstream command adapters and product runtimes.
- Contract-aligned nettoolskit.manifest.json for service discovery by NetToolsKit orchestrators and control surfaces.
- Local ntk-memory CLI commands for manifest rendering, workspace materialization, vault materialization, and bounded knowledge queries.
- Optional caller-gated verified answer cache reuse for ntk-memory query request files that provide a verification token and pass deterministic groundedness validation.
- Dedicated ntk-memory context-package command for RAG/CAG/HyDE context assembly without answer text.
- Hybrid Search V2 contracts for deterministic BM25-style lexical scoring, normalized vector plus lexical fusion, field-match explainability, and bounded context-package candidate windows.
- Deterministic reranker contracts with score-component explainability, request-activated execution, and typed rerank-result cache keys.
- Contextual compression contracts that retain source-backed spans, citations, source text hashes, matched query terms, and compression metrics without generating answer text.
- Deterministic groundedness validation for caller-provided answer text against retained context-package evidence, with supported terms, unsupported terms, evidence matches, and machine-readable reason codes.
- Content-free RAG quality observability contracts for stage latency, counts, cache hit/miss, scores, compression ratio, groundedness score, and bounded reason codes without raw prompts, answers, snippets, or evidence text.
- Sanitized DET context evidence export contract, golden example, fixture, and runbook for local effectiveness evidence, including DET use_det, hold, and blocked package decisions plus sanitized metrics host/provider binding without raw prompts, secrets, provider payloads, local paths, or unredacted user content.
- Local ntk-memory CLI commands for operational memory write, read, and prune operations.
- Profile-aware ntk-memory memory-record-* CLI commands for reusable Memory OS record write, read, promotion, and prune operations.
- Deterministic ntk-memory doctor readiness checks for LocalFirst and shared PostgreSQL stores, persistent CAG/cache table and indexes, round-trip cache writes, query lanes, and enterprise memory-lane selection.
- Experimental Rust crate implementation extracted from the historical nettoolskit-copilot knowledge foundation.
Contents
- Introduction
- Features
- Architecture
- Responsibility Boundary
- Source Extraction Map
- Storage Model
- Persistent CAG Cache API
- Local Operational Memory API
- Control Plane Model
- Rust Crate
- Local CLI
- Compatibility and Support
- Operations
- Planning
- Governance and Security
- Build and Tests
- Contributing
- Dependencies
- References
- License
Architecture
flowchart TD
agent[nettoolskit-agent]
memory[nettoolskit-memory]
copilot[nettoolskit-copilot]
control[nettoolskit-control]
rust[nettoolskit-rust]
docs[docs/knowledge-base]
store[.build/knowledge/store]
local[.temp/context-memory]
vault[.deployment/artifacts/knowledge-vault]
agent -->|policy and validation rules| copilot
copilot -->|runtime decisions| memory
copilot --> control
memory --> rust
memory --> docs
memory --> store
memory --> local
memory --> vault
nettoolskit-memory is the memory execution boundary. It consumes policy and
contracts from other repositories but keeps ingestion, retrieval, persistence,
and context-package behavior behind memory-owned APIs.
The current Rust implementation is intentionally extracted without deleting the
historical nettoolskit-copilot sources. A later compatibility-adapter PR will
decide when nettoolskit-copilot starts consuming this repository directly.
Responsibility Boundary
| Repository | Owns |
|---|---|
| nettoolskit-agent | Instructions, prompts, governance, validation, model-routing policy, context-economy policy, and memory/RAG/HyDE usage rules. |
| nettoolskit-memory | Knowledge ingestion, chunking, embeddings, retrieval, graph expansion, HyDE, context packages, local memory, and vault materialization. |
| nettoolskit-copilot | Runtime orchestration, prompt middleware, model selection, approval coordination, and product workflow composition. |
| nettoolskit-control | Machine workers, leases, command execution, approvals, and evidence collection. |
| nettoolskit-codegen | Code generation, scaffolding, refactor planning, and architecture validation. |
| nettoolskit-assurance | Security, performance, benchmark, OpenAPI, frontend, and quality gate validation. |
| nettoolskit-devops | Docker, images, deployment, backup, observability, proxy, and operations. |
| nettoolskit-rust | Generic Rust core, contracts, CLI, observability, adapters, validation, and specification foundations. |
Source Extraction Map
Initial source material comes from nettoolskit-copilot:
| Current source | Target ownership |
|---|---|
| crates/knowledge | Move to nettoolskit-memory as the primary knowledge and retrieval crate. |
| crates/memory/engine | Replace with nettoolskit_memory::engine as the memory facade. |
| crates/control/runtime/src/continuity/local_context.rs | Split memory-owned persistence/retrieval from runtime command orchestration. |
| crates/control/runtime/src/maintenance/prune_local_memory.rs | Move local memory retention logic to nettoolskit-memory; keep runtime command wiring in the caller. |
| docs/knowledge-base/** | Keep durable authored knowledge records versioned in the owning product repository; memory provides validation and processing APIs. |
Do not delete or deprecate the existing nettoolskit-copilot sources until the
new memory crate has mirrored tests, a compatibility adapter, and a confirmed
consumer integration path.
Storage Model
Generated stores are rebuildable runtime artifacts and must not be committed.
| Store | Purpose |
|---|---|
| .build/knowledge/store/knowledge.db | Rebuildable knowledge store for ingestion, vector retrieval, durable graph state, persistent CAG/cache entries, and materialized workspace state. |
| Shared PostgreSQL schema | External shared store selected only through SharedPostgresqlSettings and a connection-string environment variable. |
| .temp/context-memory/context.db | Rebuildable local operational memory for sessions, events, artifacts, and recall. |
| .deployment/artifacts/knowledge-vault/** | Generated vault and validation artifacts for review or publication. |
| docs/knowledge-base/** | Durable authored knowledge records owned by the product or domain repository. |
The default Rust crate writes local operational-memory, graph, and persistent
CAG/cache API data to the repository-local SQLite schema under
.build/knowledge/store/knowledge.db. The optional shared-postgresql Cargo
feature adds SharedPostgresqlSettings, PostgresVectorRepository,
PostgresGraphRepository, PostgresPersistentCacheBackend,
PostgresMemoryRecordBackend, and PostgresOperationalMemoryBackend for
explicit shared deployments. Shared PostgreSQL configuration names the
environment variable that contains the connection string; the value itself must
never be committed, logged, or emitted in artifacts. Shared operational memory
for sessions, runs, tool calls, artifacts, validations, and route decisions uses
the same persistence_profile/shared_postgresql request shape in the
contract pack. Legacy
.temp/context-memory/context.db remains listed as extraction source material
until downstream runtime adapters move to the memory-owned API.
Persistent CAG Cache API
The persistent cache API is the durable boundary for reusable CAG state. It stores typed cache entries as JSON payloads with deterministic keys, stable payload hashes, creation timestamps, optional expiration timestamps, and schema versions.
The API exposes:
| Contract | Purpose |
|---|---|
| PersistentCacheEntry | Validated durable cache row for retrieval results, context packages, or verified idempotent results. |
| PersistentCachePolicy | Materialized-workspace policy that enables/disables durable cache kinds and sets per-kind TTLs. |
| PersistentCacheReadRequest | Reads one cache entry and treats expired rows as misses unless include_expired is explicit. |
| PersistentCachePruneRequest | Selects entries by kind, expiration, schema version, or payload hash and supports dry-run mode. |
| persist_persistent_cache_entry | Upserts one validated cache entry into the repository-local SQLite store. |
| read_persistent_cache_entry | Reads one cache entry through the TTL-aware typed read path. |
| prune_persistent_cache_entries | Deletes or previews selected cache rows without exposing a shell-like deletion surface. |
| probe_persistent_cache_readiness | Validates the persistent-cache table, required indexes, TTL-aware reads, and doctor-owned write/read/prune round-trip. |
This API does not decide which model, prompt, or orchestration lane should use
a cached value. That decision remains in nettoolskit-copilot; this repository
owns the deterministic storage and validation surface. Within
nettoolskit-memory, materialized workspaces automatically read and write
durable cache entries when the selected repository resolver provides a
PersistentCacheBackend. LocalFirst attaches the SQLite backend by default;
the feature-gated shared PostgreSQL profile attaches
PostgresPersistentCacheBackend when explicit safe settings are provided. The
same resolver selects the reusable Memory OS backend, so materialization can
publish and recall reusable memory records through SQLite or shared PostgreSQL
without changing caller code. Both profiles use a workspace content fingerprint
to avoid stale reuse after source changes.
Automatic LocalFirst cache writes use kind-specific TTLs by default:
| Cache kind | Default TTL |
|---|---|
| query_embedding | 21,600 seconds |
| retrieval_result | 21,600 seconds |
| rerank_result | 21,600 seconds |
| context_package | 86,400 seconds |
| verified_idempotent_result | 604,800 seconds |
PersistentCachePolicy can disable individual durable cache kinds or set a
custom positive TTL. A ttl_seconds value of null keeps that kind enabled
without an automatic expiration timestamp. Manual ntk-memory cache-write
requests remain explicit PersistentCacheEntry payloads and can still provide
their own expires_at_unix_seconds per entry.
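The default TTLs and the policy override rules above can be illustrated with a small sketch. The `CacheKind` and `TtlOverride` enums and the `expires_at_unix_seconds` function are hypothetical names chosen for illustration; they are not the crate's actual `PersistentCachePolicy` types. Only the TTL values are taken from the table.

```rust
/// Durable cache kinds named in the default TTL table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CacheKind {
    QueryEmbedding,
    RetrievalResult,
    RerankResult,
    ContextPackage,
    VerifiedIdempotentResult,
}

/// Default TTL in seconds, copied from the table.
pub fn default_ttl_seconds(kind: CacheKind) -> u64 {
    match kind {
        CacheKind::QueryEmbedding
        | CacheKind::RetrievalResult
        | CacheKind::RerankResult => 21_600,
        CacheKind::ContextPackage => 86_400,
        CacheKind::VerifiedIdempotentResult => 604_800,
    }
}

/// Hypothetical per-kind override; stands in for a policy entry.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TtlOverride {
    /// No override: fall back to the default TTL for the kind.
    Default,
    /// Models `ttl_seconds: null`: enabled, no automatic expiration.
    NoExpiry,
    /// Custom positive TTL in seconds.
    Seconds(u64),
}

/// Computes the expiration timestamp for an automatic cache write.
/// `None` means the entry carries no automatic expiration.
pub fn expires_at_unix_seconds(
    kind: CacheKind,
    created_at_unix_seconds: u64,
    ttl: TtlOverride,
) -> Option<u64> {
    match ttl {
        TtlOverride::Default => {
            Some(created_at_unix_seconds.saturating_add(default_ttl_seconds(kind)))
        }
        TtlOverride::NoExpiry => None,
        TtlOverride::Seconds(seconds) => {
            Some(created_at_unix_seconds.saturating_add(seconds))
        }
    }
}
```

Disabling a kind entirely is left out of the sketch; in that case no entry would be written at all.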
Retrieval and context-package cache keys include an explicit enterprise filter-scope fingerprint. Tenant, access role, document version, domain, publication date windows, content root, source path, language, document kind, and lexical shortlist filters therefore cannot reuse cache entries across different retrieval scopes. Context-package cache keys also include the HyDE generator id and version when HyDE mode is active, the contextual compression profile scope when compression is enabled, and the rerank profile plus result budget when reranking is enabled, so generator, compression, or rerank changes do not reuse stale packages from a different evidence shape.
Query embedding cache keys are narrower by design. They include provider id, provider version, vector dimension, and a query text hash, but never the raw query text. This allows the same query vector to be reused across different tenant/filter/budget scopes while retrieval-result cache entries remain scoped to the full request.
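A minimal sketch of such a key follows. The key layout, the function name `query_embedding_cache_key`, and the use of FNV-1a as the text hash are assumptions for illustration only; the crate's real key format and hash function are not documented here.

```rust
/// FNV-1a 64-bit hash. Used here only as a stable stand-in for
/// whatever hash the crate actually applies to query text.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Builds a cache key from provider id, provider version, vector
/// dimension, and a hash of the query text. The raw query text never
/// appears in the key.
pub fn query_embedding_cache_key(
    provider_id: &str,
    provider_version: &str,
    dimension: usize,
    query_text: &str,
) -> String {
    format!(
        "query_embedding:{}:{}:{}:{:016x}",
        provider_id,
        provider_version,
        dimension,
        fnv1a64(query_text.as_bytes())
    )
}
```

Because tenant, filter, and budget values are not part of the key, two requests with the same provider and query text resolve to the same entry, which is the reuse behavior described above.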
Rerank-result cache keys include reranker id, reranker version, profile id,
query text, result budget, and a fingerprint of the candidate window. The
deterministic reranker runs when KnowledgeQueryRequestInput sets
enable_rerank: true with a positive max_rerank_hits; selecting a
model-assisted reranker or fallback policy remains the responsibility of the
calling orchestrator.
Verified answer reuse is always caller-gated. The memory service stores and
reuses verified idempotent answer text only when the query request includes a
non-empty verification_token and deterministic groundedness validation accepts
the answer against the retained context package. Verified answer cache keys also
include answer_generation_scope: the safe provider/model/model version surface
that produced the answer candidate. This prevents verified answers generated by
different providers, models, or model versions from reusing the same cache row.
The durable payload is a VerifiedAnswerCacheEntry with answer hash,
context-package hash, verification-token fingerprint, answer generation scope,
and the accepted groundedness result. Provider answers produced outside memory
must be persisted through ntk-memory verified-answer-cache-write, which
rebuilds the query context and computes Memory-owned cache keys and hashes
instead of asking callers to submit low-level PersistentCacheEntry rows. The
command output contains only cache metadata and hashes; it does not echo answer
text or verification tokens. Deciding when to call providers, retry, degrade, or
expose cached answers remains in the calling orchestrator.
Provider and model identifiers in answer_generation_scope are cache scope
tokens, not configuration channels. They must be stable, non-secret identifiers
and must not contain credentials, bearer tokens, prompts, answer text,
transcripts, request headers, customer data, local filesystem paths, database
URLs, or other environment-specific secrets.
ntk-memory doctor uses a controlled persistent-cache probe to verify the
table, indexes, TTL miss behavior, and write/read/prune behavior. The probe
removes its doctor-owned row before reporting success and does not prune
runtime cache rows.
Local Operational Memory API
The local operational memory API is adapter-facing and deterministic. It writes
sessions, runs, tool calls, artifacts, validations, and route decisions through
OperationalMemoryWriteBatch. Batch writes are atomic: if one child row fails,
the full write is rolled back.
OperationalMemoryReadRequest applies session and run filters as an
intersection. Session filters constrain matching runs and child rows; run
filters constrain parent sessions and child rows. Route decisions are filtered
independently by route_kind.
Operational payloads must be sanitized before persistence. The crate rejects null payloads, artifact paths outside repository-relative form, common local path markers, and secret-like token markers. Runtime adapters must summarize large logs before calling this API.
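The kind of fail-closed check described above might look like the following sketch. The function names and the marker list are hypothetical; the crate's actual rejection rules and markers are not listed in this document.

```rust
/// Rejects artifact paths that are not in repository-relative form.
pub fn check_artifact_path(path: &str) -> Result<(), &'static str> {
    if path.is_empty() {
        return Err("empty path");
    }
    let bytes = path.as_bytes();
    // Windows drive prefix such as `C:`.
    let has_drive = bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    if path.starts_with('/') || path.starts_with('\\') || path.starts_with('~') || has_drive {
        return Err("absolute or home-relative path");
    }
    // Parent-directory segments could escape the repository root.
    if path.split(|c| c == '/' || c == '\\').any(|part| part == "..") {
        return Err("path escapes repository root");
    }
    Ok(())
}

/// Rejects payload text containing secret-like markers.
/// The marker list is illustrative, not the crate's real list.
pub fn check_payload_text(text: &str) -> Result<(), &'static str> {
    const MARKERS: [&str; 4] = ["bearer ", "password=", "api_key=", "-----begin"];
    let lowered = text.to_lowercase();
    if MARKERS.iter().any(|marker| lowered.contains(marker)) {
        return Err("secret-like marker");
    }
    Ok(())
}
```

A runtime adapter would run checks like these, and summarize large logs, before building the write batch.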
MemoryRecordPruneRequest requires at least one explicit selection filter and
supports dry-run mode. Non-dry-run prune deletes selected memory records and
their source-memory promotion rows in one transaction.
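The filter requirement and dry-run behavior can be sketched as follows. `PruneRequestSketch`, its field names, and `plan_prune` are hypothetical stand-ins for `MemoryRecordPruneRequest`, whose actual fields are defined in the contract pack.

```rust
/// Hypothetical prune request shape; field names are illustrative.
#[derive(Debug, Default)]
pub struct PruneRequestSketch {
    pub record_ids: Vec<String>,
    pub record_kind: Option<String>,
    pub created_before_unix_seconds: Option<u64>,
    pub dry_run: bool,
}

/// Outcome of a prune call for `matched` selected records.
#[derive(Debug, PartialEq, Eq)]
pub enum PruneOutcome {
    /// Dry run: report what would be deleted, delete nothing.
    Preview { matched: usize },
    /// Real run: records and promotion rows deleted together.
    Deleted { deleted: usize },
}

/// Rejects requests without any explicit selection filter, then
/// either previews or deletes depending on `dry_run`.
pub fn plan_prune(request: &PruneRequestSketch, matched: usize) -> Result<PruneOutcome, String> {
    let has_filter = !request.record_ids.is_empty()
        || request.record_kind.is_some()
        || request.created_before_unix_seconds.is_some();
    if !has_filter {
        return Err("prune request needs at least one selection filter".to_string());
    }
    if request.dry_run {
        Ok(PruneOutcome::Preview { matched })
    } else {
        Ok(PruneOutcome::Deleted { deleted: matched })
    }
}
```

Requiring a filter means an empty request can never act as "delete everything".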
Control Plane Model
nettoolskit-memory exposes deterministic memory capabilities through typed
contracts and command-ready APIs. It does not decide which model to use, when to
call an LLM, which worker executes a task, or whether an approval gate passes.
Runtime orchestration belongs to nettoolskit-copilot. Machine execution
belongs to nettoolskit-control. Generic CLI building blocks belong to
nettoolskit-rust.
Rust Crate
This repository uses the product-repository single-crate layout.
| Path | Package | Responsibility |
|---|---|---|
| src/** | nettoolskit-memory | Knowledge ingestion, retrieval, context packages, local memory, and vault materialization. |
| src/engine/** | nettoolskit-memory | Grouped public facade for downstream adapters and product runtimes. |
| src/local/** | nettoolskit-memory | Local operational memory contracts for sessions, runs, artifacts, validations, route decisions, and prune requests. |
| tests/** | nettoolskit-memory | Mirrored integration tests for the source modules. |
Cargo build output is configured under .build/target through
.cargo/config.toml.
The crate exposes the typed HyDE generator boundary through HydeGenerator,
DeterministicHydeGenerator, SuppliedHydeGenerator, HydeHypothesis, and
SuppliedHydeHypothesis, with generator id/version provenance retained in
context package output. Supplied HyDE is a data-only contract for external
orchestrators such as nettoolskit-copilot; memory validates the query match
and provenance fields, then uses the hypothesis for retrieval and cache scope.
The cache scope preserves the external generator id/version and adds a
hypothesis-text fingerprint so different supplied hypotheses cannot reuse the
same context package accidentally. Prompt templates, model routing,
credentials, retries, and orchestration policy remain outside this repository.
The default local orchestrator uses HashBucketEmbeddingProvider for
deterministic development and test runs. Downstream Rust consumers can inject a
custom local EmbeddingProvider through LocalKnowledgeOrchestrator while
keeping provider id/version provenance in embeddings, retrieval explanations,
and cache keys. This crate does not implement network embedding providers or
own provider secrets.
Local CLI
The repository builds a local-process binary named ntk-memory. It is the
preferred adapter boundary for product orchestrators that should call memory
capabilities without linking to memory internals.
Supported commands:
| Command | Purpose |
|---|---|
| ntk-memory manifest [--output <path>] | Render the service manifest as deterministic JSON. |
| ntk-memory materialize --request <path> [--output <path>] | Ingest and index a bounded workspace through the request-selected memory persistence profile. |
| ntk-memory query --request <path> [--output <path>] | Run vector retrieval, graph expansion, HyDE, reusable memory recall, context package assembly, profile-backed persistence, and optional caller-gated verified answer reuse. |
| ntk-memory context-package --request <path> [--output <path>] | Assemble a bounded context package with retrieval evidence, reusable memory recall, and profile-backed cache metadata without answer text. |
| ntk-memory groundedness-validate --request <path> [--output <path>] | Validate caller-provided answer text against retained context-package evidence without model routing or answer generation. |
| ntk-memory verified-answer-cache-write --request <path> [--output <path>] | Rebuild the query context, validate a provider answer, and persist verified answer cache state with Memory-owned keys and hashes. |
| ntk-memory vault-materialize --request <path> [--output <path>] | Materialize grouped Markdown knowledge vault records and a vault manifest. |
| ntk-memory doctor --repo-root <path> [--request <path>] [--output <path>] | Report LocalFirst or request-selected shared PostgreSQL runtime readiness for enterprise RAG/CAG/HyDE/context/verified-cache/operational-memory lanes. |
| ntk-memory cache-write --repo-root <path> --request <path> [--output <path>] | Persist one durable CAG/cache entry. |
| ntk-memory cache-read --repo-root <path> --request <path> [--output <path>] | Read one durable CAG/cache entry with TTL-aware miss semantics. |
| ntk-memory cache-prune --repo-root <path> --request <path> [--output <path>] | Prune selected durable CAG/cache entries with explicit filters and dry-run support. |
| ntk-memory local-write --repo-root <path> --request <path> [--output <path>] | Persist a repository-local operational memory write batch. |
| ntk-memory local-read --repo-root <path> --request <path> [--output <path>] | Read repository-local operational memory rows with deterministic filters. |
| scripts/operations/write-det-local-first-use-memory.ps1 | Write .deployment/artifacts/memory/det-local-first-use-write.json for DET runtime-wrapper readiness. |
| ntk-memory local-prune --repo-root <path> --request <path> [--output <path>] | Prune selected Memory OS records with explicit filters and dry-run support. |
| ntk-memory memory-record-write --repo-root <path> --request <path> [--output <path>] | Persist validated reusable Memory OS records through the request-selected profile. |
| ntk-memory memory-record-read --repo-root <path> --request <path> [--output <path>] | Read reusable Memory OS records and optional promotion metadata through the request-selected profile. |
| ntk-memory memory-record-promote --repo-root <path> --request <path> [--output <path>] | Persist an explicit Memory OS promotion and its target record through the request-selected profile. |
| ntk-memory memory-record-prune --repo-root <path> --request <path> [--output <path>] | Prune selected Memory OS records through the public profile-aware record adapter surface. |
Request files use the Rust API payloads KnowledgeWorkspaceRequestInput,
KnowledgeQueryRequestInput, and KnowledgeVaultBuildRequestInput. The
query command returns answer text plus the context package; context-package
returns only retrieval package data and cache metadata. vault-materialize
uses output_root from the request JSON as the Markdown vault destination;
the CLI --output flag is only the machine-readable JSON report destination.
Outputs are machine-readable JSON. When --output is omitted, the command
writes JSON to stdout.
Workspace-backed commands default to local_first. Callers can select
shared_postgresql by adding persistence_profile: "shared_postgresql" and a
secret-safe shared_postgresql settings object to the workspace request. The
settings object names the environment variable containing the connection string
and never carries the secret value itself. If the optional
shared-postgresql Cargo feature or required environment variable is missing,
the command fails closed with a sanitized profile/readiness error and does not
bootstrap the repository-local SQLite store.
KnowledgeQueryRequestInput can opt into reusable Memory OS recall with
enable_reusable_memory_recall: true and a positive
max_reusable_memory_hits budget. When enabled, local-first materialization
loads validated or published semantic and procedural records from
memory_items; recalled records appear as memory_record snippets, evidence,
and the optional reusable_memory_recall output block. The cache key includes
a reusable-memory scope fingerprint so changed memory records cannot reuse a
stale context package.
KnowledgeQueryRequestInput can also apply enterprise retrieval scope filters:
tenant_id, access_roles, document_version, domain,
min_published_at_unix_seconds, and max_published_at_unix_seconds. These
filters are optional for compatibility, but when a caller supplies them memory
fails closed: a chunk without matching metadata is excluded from lexical
prefiltering, vector retrieval, context packages, and cache reuse.
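The fail-closed rule can be sketched as a predicate over chunk metadata. `ChunkMetadata`, `ScopeFilter`, and `chunk_in_scope` are hypothetical names, and only a subset of the filters is modeled; the point is that a supplied filter excludes any chunk whose metadata is missing or different.

```rust
/// Hypothetical metadata attached to an indexed chunk.
#[derive(Default)]
pub struct ChunkMetadata {
    pub tenant_id: Option<String>,
    pub domain: Option<String>,
    pub published_at_unix_seconds: Option<u64>,
}

/// Hypothetical subset of the enterprise scope filters.
#[derive(Default)]
pub struct ScopeFilter {
    pub tenant_id: Option<String>,
    pub domain: Option<String>,
    pub min_published_at_unix_seconds: Option<u64>,
    pub max_published_at_unix_seconds: Option<u64>,
}

/// An absent filter matches everything; a supplied filter requires
/// the chunk to carry an equal value.
fn matches_text(filter: &Option<String>, value: &Option<String>) -> bool {
    match filter {
        None => true,
        Some(expected) => value.as_ref() == Some(expected),
    }
}

/// Returns true only when the chunk satisfies every supplied filter.
pub fn chunk_in_scope(filter: &ScopeFilter, chunk: &ChunkMetadata) -> bool {
    let min = filter.min_published_at_unix_seconds;
    let max = filter.max_published_at_unix_seconds;
    let date_ok = if min.is_none() && max.is_none() {
        true
    } else {
        // A date window excludes chunks with no publication date.
        chunk.published_at_unix_seconds.map_or(false, |published| {
            min.map_or(true, |m| published >= m) && max.map_or(true, |m| published <= m)
        })
    };
    matches_text(&filter.tenant_id, &chunk.tenant_id)
        && matches_text(&filter.domain, &chunk.domain)
        && date_ok
}
```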
When enable_lexical_prefilter is true with a positive max_lexical_hits,
memory scores the lexical corpus with the LexicalScoringProfile contract and
applies the same enterprise filters before scoring. Context-package assembly
then uses HybridFusionProfile to normalize and fuse vector and lexical
scores, emitting vector_score, lexical_score, fusion_score, ranking
stages, matched lexical fields, and the fusion profile in retrieval
explanations. Default behavior remains semantic-only unless the request enables
lexical prefiltering.
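One common way to normalize and fuse two score lists is min-max normalization followed by a weighted sum, sketched below. This is an assumption for illustration: the actual normalization and weighting defined by HybridFusionProfile are not specified in this document, and the names `min_max_normalize` and `fuse_scores` are hypothetical.

```rust
/// Scales scores into [0, 1]. A constant or empty list maps to zeros.
fn min_max_normalize(scores: &[f64]) -> Vec<f64> {
    let min = scores.iter().copied().fold(f64::INFINITY, f64::min);
    let max = scores.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let span = max - min;
    scores
        .iter()
        .map(|score| if span > 0.0 { (score - min) / span } else { 0.0 })
        .collect()
}

/// Fuses per-candidate vector and lexical scores with a weighted sum.
/// `vector_weight` is expected in [0, 1]; the lexical weight is its
/// complement. Both slices must describe the same candidates in order.
pub fn fuse_scores(vector: &[f64], lexical: &[f64], vector_weight: f64) -> Vec<f64> {
    assert_eq!(vector.len(), lexical.len(), "score lists must align");
    let vector_norm = min_max_normalize(vector);
    let lexical_norm = min_max_normalize(lexical);
    vector_norm
        .iter()
        .zip(lexical_norm.iter())
        .map(|(v, l)| vector_weight * v + (1.0 - vector_weight) * l)
        .collect()
}
```

Normalizing first matters because raw BM25-style scores and vector similarities live on different scales, so an unnormalized sum would let one signal dominate.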
Contextual compression is exposed as a memory-owned package assembly primitive.
When a caller supplies context_compression_profile, memory replaces package
snippets with retained source spans and emits compression.retained_spans plus
compression.metrics. The retained text is copied from source snippets only;
model routing, deciding whether compression is worth the latency/cost tradeoff,
and answer generation remain caller responsibilities.
Deterministic reranking is request-activated. When enable_rerank is true,
memory applies rerank_profile or the default deterministic profile after
semantic or hybrid retrieval, caps output with max_rerank_hits, emits the
optional context_package.rerank block, and updates retained snippets and
evidence ordering before optional compression.
Groundedness validation is exposed as GroundednessValidationRequest,
GroundednessValidationResult, and ntk-memory groundedness-validate. The
validator compares caller-provided answer text against retained context-package
snippets or compression spans, emits supported and unsupported terms plus
reason codes, and never calls an LLM. Retry, degrade, or final-answer policy
remains owned by the caller.
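A term-overlap check of this kind can be sketched as below. The tokenization, the score definition, the threshold, and the reason-code strings are all assumptions for illustration; they are not the actual rules of GroundednessValidationRequest or GroundednessValidationResult.

```rust
use std::collections::BTreeSet;

/// Splits text into lowercase alphanumeric terms longer than two bytes.
fn terms(text: &str) -> BTreeSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|term| term.len() > 2)
        .map(|term| term.to_lowercase())
        .collect()
}

/// Hypothetical result shape with illustrative reason codes.
pub struct GroundednessSketch {
    pub supported_terms: Vec<String>,
    pub unsupported_terms: Vec<String>,
    pub score: f64,
    pub reason_code: &'static str,
}

/// Compares answer terms against evidence terms without any model call.
pub fn validate_groundedness(answer: &str, evidence: &[&str], min_score: f64) -> GroundednessSketch {
    let evidence_terms: BTreeSet<String> = evidence.iter().flat_map(|e| terms(e)).collect();
    let (supported_terms, unsupported_terms): (Vec<String>, Vec<String>) = terms(answer)
        .into_iter()
        .partition(|term| evidence_terms.contains(term));
    let total = supported_terms.len() + unsupported_terms.len();
    let score = if total == 0 {
        0.0
    } else {
        supported_terms.len() as f64 / total as f64
    };
    let reason_code = if total == 0 {
        "empty_answer"
    } else if score >= min_score {
        "grounded"
    } else {
        "unsupported_terms"
    };
    GroundednessSketch { supported_terms, unsupported_terms, score, reason_code }
}
```

The caller, not this function, decides what to do with an `unsupported_terms` result, which matches the ownership split described above.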
RAG quality observability is exposed as RagQualityReport,
RagQualityStageMetric, and the rag-quality-report contract. Reports are
intentionally content-free: they use hashes for request, tenant, and query
scope and only store stage status, latency, counts, cache hit/miss, normalized
scores, compression ratio, token/cost counters, profile ids, and bounded reason
codes. Prompt text, answer text, HyDE text, snippets, retained spans, evidence
text, workstation paths, and secret values must stay out of these reports.
query and context-package attach quality_report only when
emit_quality_report is true.
DET context evidence export is represented by det-context-evidence-export.
It is a pending contract and fixture for DET, Harness, and Analytics local
effectiveness evidence derived from existing context-package, quality-report,
groundedness, and cache metadata. The export is content-free by contract and
uses only hashes, counts, booleans, bounded reason codes, safe identifiers, and
normalized scores. It must not include prompts, query text, snippets, retained
spans, provider payloads, answer text, secrets, local paths, or unredacted user
content. The optional effectiveness.det_local_effectiveness_package object
models DET's local effectiveness bridge with a use_det, hold, or blocked
decision and sanitized metrics binding where the host is hashed and the metrics
provider is represented only by safe identifiers.
For LocalFirst workspaces, query embeddings, retrieval results, context
packages, rerank results, and verified answers can reuse durable cache rows
from previous processes. Public response status fields currently expose
cache_status.retrieval_result_cache_hit,
cache_status.context_package_cache_hit, and
cache_status.verified_idempotent_result_cache_hit, each of which may reflect a
durable hit from a previous process. Verified answer reuse still requires an explicit
verification_token in the KnowledgeQueryRequestInput JSON plus a validated
groundedness-linked cache payload, so unverified or unsupported query calls
never reuse answer text across processes.
CLI query and context-package commands use the deterministic HyDE generator by
default and include HyDE generator provenance in returned context metadata when
HyDE mode is active. Request JSON can instead provide
supplied_hyde_hypothesis with query_text, hypothesis_text,
generator_id, generator_version, and provenance. Memory rejects blank
fields, rejects query mismatches, and accepts supplied HyDE only for
HyDE-assisted retrieval.
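The validation rules for supplied HyDE can be sketched as follows. The field names come from the request description above; the struct `SuppliedHyde`, the function `validate_supplied_hyde`, the error strings, and the use of trimmed exact comparison for the query match are assumptions.

```rust
/// Fields of a supplied hypothesis, as listed in the request description.
pub struct SuppliedHyde<'a> {
    pub query_text: &'a str,
    pub hypothesis_text: &'a str,
    pub generator_id: &'a str,
    pub generator_version: &'a str,
    pub provenance: &'a str,
}

/// Rejects blank fields, query mismatches, and use outside
/// HyDE-assisted retrieval.
pub fn validate_supplied_hyde(
    request_query: &str,
    supplied: &SuppliedHyde<'_>,
    hyde_mode_active: bool,
) -> Result<(), String> {
    if !hyde_mode_active {
        return Err("supplied HyDE requires HyDE-assisted retrieval".to_string());
    }
    let fields = [
        ("query_text", supplied.query_text),
        ("hypothesis_text", supplied.hypothesis_text),
        ("generator_id", supplied.generator_id),
        ("generator_version", supplied.generator_version),
        ("provenance", supplied.provenance),
    ];
    for (name, value) in fields {
        if value.trim().is_empty() {
            return Err(format!("blank field: {}", name));
        }
    }
    // Assumption: the match is an exact comparison after trimming.
    if supplied.query_text.trim() != request_query.trim() {
        return Err("query mismatch".to_string());
    }
    Ok(())
}
```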
Operational-memory commands keep payload hygiene fail-closed for local paths
and secret-like values before any write reaches persistence. Their current
LocalFirst request shape remains valid; the contract pack also publishes
shared PostgreSQL examples for local-write and local-read using
persistence_profile: "shared_postgresql" plus secret-safe
shared_postgresql settings.
Persistent-cache command request files use the Rust API payloads
PersistentCacheEntry, PersistentCacheReadRequest, and
PersistentCachePruneRequest. cache-write returns the persisted entry
identity, and cache-read returns null when an entry does not exist or is
expired and include_expired is false.
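The TTL-aware miss semantics of cache-read can be sketched as below. `CacheRow` and `read_row` are hypothetical names, and treating a row as expired when its timestamp is less than or equal to the current time is an assumption about the boundary.

```rust
/// Hypothetical minimal view of a stored cache row.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CacheRow {
    pub key: String,
    pub expires_at_unix_seconds: Option<u64>,
}

/// Returns the row only when it exists and is either unexpired or
/// explicitly requested with `include_expired`. `None` models the
/// JSON `null` result.
pub fn read_row<'a>(
    row: Option<&'a CacheRow>,
    now_unix_seconds: u64,
    include_expired: bool,
) -> Option<&'a CacheRow> {
    let row = row?;
    // Rows without an expiration timestamp never expire.
    let expired = row
        .expires_at_unix_seconds
        .map_or(false, |expires_at| expires_at <= now_unix_seconds);
    if expired && !include_expired {
        None
    } else {
        Some(row)
    }
}
```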
Verified provider answer cache writes use
VerifiedAnswerCacheWriteCommandRequest, which wraps the original
KnowledgeQueryRequestInput, provider answer_text, and
answer_generation_scope with provider/model/model version identifiers. The
nested query must include verification_token; if it also includes
query.answer_generation_scope, the two scopes must match. Memory rebuilds the
context package, validates groundedness, writes the provider/model/model-version
scoped verified cache row, and returns VerifiedAnswerCacheWriteResult with
only cache metadata and hashes.
Operational command request files use the Rust API payloads
OperationalMemoryWriteBatch, OperationalMemoryReadRequest, and
MemoryRecordPruneRequest. The operational-memory schemas accept both the
current LocalFirst shape and the profile-aware shared PostgreSQL shape so
callers can prepare requests without changing command names.
Adapter Contract Pack
Versioned adapter contracts live under contracts.
The pack publishes JSON schemas in contracts/schemas/v1 and sanitized golden
examples in contracts/examples/v1 for query, context-package, supplied HyDE,
doctor, RAG quality reports, DET context evidence exports, persistent cache,
reusable memory records, local operational memory, and CLI error envelopes.
nettoolskit.manifest.json references these files through its contracts
array so orchestrators can discover request and response shapes without
linking to memory internals.
Consumers should call ntk-memory doctor --repo-root <path> before expensive
query paths, or ntk-memory doctor --repo-root <path> --request <path> when
checking a request-selected shared PostgreSQL profile. They should then call
ntk-memory query or ntk-memory context-package with request JSON that
matches the published schema. Runtime outputs, stdout, stderr, and reports
should be stored as harness evidence instead of reading the SQLite or
PostgreSQL store directly.
doctor is designed for orchestrators and operators. It requires an explicit
repository root. Without a request file it initializes the real
.build/knowledge/store/knowledge.db store, validates persistent-cache
table/index/TTL/prune readiness, registers the Memory-owned SQLite vector
function lane, and returns a stable LocalFirst readiness report. With a
memory-doctor-request file that selects shared_postgresql, it validates only
secret-safe settings, probes shared PostgreSQL pgvector, persistent-cache, and
operational-memory readiness, and does not bootstrap the LocalFirst SQLite
store. Shared PostgreSQL readiness failures are reported as degraded JSON
with exit code zero so the caller can decide whether to fail, retry, or fall
back before running an expensive query path.
Doctor Readiness JSON
The readiness report is intentionally repo-relative and safe for generated artifacts. It does not expose workstation absolute paths. The current schema includes:
| Field | Purpose |
|---|---|
| schemaVersion | Readiness report contract version. |
| service | Memory service name, kind, version, manifest schema, and binary name. |
| scope | Repo-relative scope and effective persistence profile. |
| store | Profile backend readiness. LocalFirst reports .build/knowledge/store; shared PostgreSQL reports the safe shared_postgresql backend token. |
| sqlite | SQLite engine version and LocalFirst vector semantic query capability. |
| sharedPostgresql | Optional shared PostgreSQL readiness metadata containing only env-var name, schema, application name, and readiness state. |
| capabilities | Capability readiness states for ingest, query, context package, RAG quality report, persistent cache, and local memory. |
| readiness | Enterprise lane matrix for rag, cag, hyde_assisted, context_package, verified_answer_cache, operational_memory, and quality_observability, including required and degraded check ids. |
| checks | Ordered deterministic checks with blocking or degrading severity, including persistent-cache table, index, TTL, and round-trip checks. |
| summary | Aggregated ready/degraded flags and failed check IDs. |
Readiness lane status uses a strict machine contract:
- ready means every required check for that lane passed.
- degraded means the lane can still run through a local-first fallback, but the caller should inspect degradedCheckIds before selecting it.
- unavailable means the orchestrator must not select that lane.
requiredCheckIds lists the checks that control a lane. degradedCheckIds
must be a subset of those required checks and name the checks that failed in
the same report.
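The lane contract can be sketched as a small decision function. This is an illustration under stated assumptions, not the crate's API: the Lane struct, the Decision enum, and the choice to reject a report whose degradedCheckIds are not a subset of requiredCheckIds are all hypothetical.

```rust
use std::collections::HashSet;

/// Lane status values from the readiness contract.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LaneStatus {
    Ready,
    Degraded,
    Unavailable,
}

/// Hypothetical in-memory view of one readiness lane.
struct Lane {
    status: LaneStatus,
    required_check_ids: Vec<String>,
    degraded_check_ids: Vec<String>,
}

/// Hypothetical orchestrator decision for a lane.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Select,
    /// Selectable, but the listed failed checks should be inspected first.
    SelectAfterReview(Vec<String>),
    Reject,
}

/// Checks that every degraded check id is also a required check id.
fn degraded_ids_are_valid(lane: &Lane) -> bool {
    let required: HashSet<&str> =
        lane.required_check_ids.iter().map(String::as_str).collect();
    lane.degraded_check_ids
        .iter()
        .all(|id| required.contains(id.as_str()))
}

fn decide(lane: &Lane) -> Decision {
    // Assumption: a report that violates the subset rule is treated as unusable.
    if !degraded_ids_are_valid(lane) {
        return Decision::Reject;
    }
    match lane.status {
        LaneStatus::Ready => Decision::Select,
        LaneStatus::Degraded => {
            Decision::SelectAfterReview(lane.degraded_check_ids.clone())
        }
        LaneStatus::Unavailable => Decision::Reject,
    }
}
```

Returning the degraded check ids with the decision keeps the "inspect before selecting" step explicit for the caller.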
Compatibility and Support
The initial compatibility target is the existing implementation in
nettoolskit-copilot. Extraction must preserve the current command and runtime
behavior until downstream consumers are migrated.
The current crate starts from the existing nettoolskit-copilot knowledge
foundation and keeps the same deterministic behavior while the ecosystem moves
to explicit memory-owned APIs.
build_hyde_hypothesis remains backward-compatible for existing callers while
new integrations can attach a typed local HydeGenerator.
Downstream consumers should prefer nettoolskit_memory::engine for new adapter
work. The facade keeps grouped surfaces for contracts, persistence, retrieval,
orchestration, context packages, vault materialization, and authored knowledge
records without exposing runtime orchestration policy.
Embedding provider selection belongs to the calling product runtime. This
repository owns the deterministic provider contract and local execution wiring;
external provider implementations, credential resolution, and model routing
belong outside nettoolskit-memory.
The local-first SQLite query path is capability-gated. Materialization
initializes the repository-local store with bundled SQLite and registers
deterministic Memory-owned vector SQL functions on each opened knowledge-store
connection. Operators and orchestrators should call ntk-memory doctor --repo-root <path> before invoking semantic query workloads.
Vector retrieval is scoped by provider_id and provider_version, so multiple
same-dimension embeddings for the same chunk can coexist without cross-provider
or cross-version result contamination. Verified answer reuse is separately
scoped by answer_generation_scope.provider_id, .model_id, and
.model_version, so generated answers are never reused across provider/model
surfaces.
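To illustrate why provider scoping prevents cross-provider contamination, the sketch below filters rows by provider_id and provider_version before ranking. The StoredEmbedding struct, the scoped_search function, and the use of cosine similarity are assumptions made for the example; the README does not state which similarity metric the store uses.

```rust
/// Hypothetical stored embedding row; the scoping fields mirror the text above.
struct StoredEmbedding {
    chunk_id: String,
    provider_id: String,
    provider_version: String,
    vector: Vec<f32>,
}

/// Cosine similarity (an assumed metric for this sketch).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Ranks chunk ids by similarity, considering only rows that belong to the
/// requested provider scope and match the query dimension.
fn scoped_search<'a>(
    rows: &'a [StoredEmbedding],
    provider_id: &str,
    provider_version: &str,
    query: &[f32],
) -> Vec<&'a str> {
    let mut scored: Vec<(&'a str, f32)> = rows
        .iter()
        .filter(|row| {
            row.provider_id == provider_id
                && row.provider_version == provider_version
                && row.vector.len() == query.len()
        })
        .map(|row| (row.chunk_id.as_str(), cosine(&row.vector, query)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|left, right| right.1.total_cmp(&left.1));
    scored.into_iter().map(|(chunk_id, _)| chunk_id).collect()
}
```

With this shape, two same-dimension embeddings of one chunk from different provider versions never compete in the same ranking.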
SemanticRetriever::query_with_cache first checks the retrieval-result cache.
On retrieval-result misses, it resolves the query vector through the
query_embedding cache before calling the embedding provider. Query embedding
entries store only provider identity, dimension, query hash, vector, and a
stable embedding cache key.
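The lookup order can be sketched with two in-memory maps. This is not the SemanticRetriever implementation: the CachedRetriever type, its fields, and the function-pointer signatures are hypothetical, and the maps are keyed by the raw query string for brevity where the real caches use scoped, hashed keys.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the two cache tiers described above.
struct CachedRetriever {
    retrieval_results: HashMap<String, Vec<String>>,
    query_embeddings: HashMap<String, Vec<f32>>,
    embed: fn(&str) -> Vec<f32>,       // embedding provider call (assumed shape)
    search: fn(&[f32]) -> Vec<String>, // vector search (assumed shape)
    provider_calls: usize,
}

impl CachedRetriever {
    fn new(embed: fn(&str) -> Vec<f32>, search: fn(&[f32]) -> Vec<String>) -> Self {
        Self {
            retrieval_results: HashMap::new(),
            query_embeddings: HashMap::new(),
            embed,
            search,
            provider_calls: 0,
        }
    }

    fn query(&mut self, query: &str) -> Vec<String> {
        // Tier 1: a retrieval-result hit skips embedding and search entirely.
        if let Some(hit) = self.retrieval_results.get(query) {
            return hit.clone();
        }
        // Tier 2: reuse a cached query vector before calling the provider.
        let vector = match self.query_embeddings.get(query).cloned() {
            Some(cached) => cached,
            None => {
                self.provider_calls += 1;
                let fresh = (self.embed)(query);
                self.query_embeddings.insert(query.to_string(), fresh.clone());
                fresh
            }
        };
        let results = (self.search)(&vector);
        self.retrieval_results.insert(query.to_string(), results.clone());
        results
    }
}
```

The provider_calls counter exists only to make the behavior observable: after the retrieval-result entry is dropped, a repeated query is served from the query-embedding tier without another provider call.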
Automatic persistent cache reuse is selected through the repository resolver.
LocalFirst uses the repository-local SQLite backend. Shared PostgreSQL uses the
feature-gated PostgresPersistentCacheBackend when callers provide
secret-safe shared settings. Custom in-memory repository resolvers remain
SQLite-free unless they explicitly attach a persistent-cache backend.
Reusable Memory OS record persistence is selected through the same resolver.
LocalFirst uses SqliteMemoryRecordBackend; shared PostgreSQL uses
PostgresMemoryRecordBackend and stores memory_items plus
memory_promotions as validated JSONB payloads with indexed filter columns.
The memory-record-* command requests accept the same
persistence_profile/shared_postgresql shape and fail closed when shared
settings are incomplete or the Cargo feature is unavailable.
Operational-memory local-write and local-read contracts accept the same
profile/settings shape and route through SqliteOperationalMemoryBackend or
PostgresOperationalMemoryBackend. The LocalFirst form without profile fields
remains the compatibility baseline.
The Rust API and CLI expose an optional shared-postgresql feature for
explicit shared deployments through
PersistentKnowledgeRepositoryResolver::for_shared_postgresql and workspace
request profile selection. Bare PersistenceProfile::SharedPostgresql still
fails closed before SQLite bootstrap because it has no secret-safe settings.
CLI request JSON rejects unknown fields so misspelled profile fields such as
persistenceProfile cannot be silently ignored and executed as LocalFirst.
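The fail-closed rules above can be sketched in plain Rust. Everything here is illustrative: the Profile, SharedSettings, and Backend types, the two-entry KNOWN_FIELDS list, and both function names are hypothetical, and a real request schema has more fields than the two named in this README. In serde-based code the same effect is commonly achieved with the deny_unknown_fields container attribute; whether this crate does so is an assumption.

```rust
/// Persistence profiles named in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Profile {
    LocalFirst,
    SharedPostgresql,
}

/// Hypothetical secret-safe settings: names only, never a connection string.
struct SharedSettings {
    connection_env_var: String,
    schema: String,
}

#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Sqlite,
    Postgres,
}

/// Request fields taken from this README (illustrative subset).
const KNOWN_FIELDS: &[&str] = &["persistence_profile", "shared_postgresql"];

/// Rejects any key that is not a known field, so a misspelling cannot be
/// silently dropped and executed as LocalFirst.
fn reject_unknown_fields(keys: &[&str]) -> Result<(), String> {
    for key in keys {
        if !KNOWN_FIELDS.iter().any(|known| known == key) {
            return Err(format!("unknown field `{key}`"));
        }
    }
    Ok(())
}

/// Resolves a backend, failing closed when the shared profile is selected
/// without complete settings or without the Cargo feature.
fn resolve_backend(
    profile: Profile,
    shared: Option<&SharedSettings>,
    feature_enabled: bool,
) -> Result<Backend, String> {
    match (profile, shared) {
        (Profile::LocalFirst, _) => Ok(Backend::Sqlite),
        (Profile::SharedPostgresql, Some(settings))
            if feature_enabled
                && !settings.connection_env_var.is_empty()
                && !settings.schema.is_empty() =>
        {
            Ok(Backend::Postgres)
        }
        (Profile::SharedPostgresql, _) => {
            Err("shared_postgresql requires the feature and complete settings".to_string())
        }
    }
}
```

The key property is that no error path falls back to SQLite: a shared request that cannot be honored returns an error instead of bootstrapping the local store.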
Persistent-cache readiness and semantic vector readiness are separate gates.
memory.cache.persist, memory.knowledge.query, and memory.context.package
are reported independently so orchestrators can fail closed on the exact lane
they need.
Persistent cache payload hashes are built from canonical JSON so object order
and floating-point hydration differences do not invalidate durable cache rows
after they are read back from SQLite or PostgreSQL.
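A minimal sketch of the idea, assuming a tiny JSON-like value type: keys are sorted before serialization, and integers and integral floats are written the same way, so a payload that is written as 1 and read back as 1.0 hashes identically. The Value enum and function names are hypothetical, and FNV-1a is used only as a dependency-free stand-in; the README does not say which hash the crate uses.

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value for the sketch (arrays, booleans, null omitted).
enum Value {
    Int(i64),
    Float(f64),
    Text(String),
    /// Fields in arbitrary order, as a parser might deliver them.
    Object(Vec<(String, Value)>),
}

/// Writes a quoted string, escaping only quotes and backslashes.
fn write_text(text: &str, out: &mut String) {
    out.push('"');
    for c in text.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
}

fn write_canonical(value: &Value, out: &mut String) {
    match value {
        // Rust prints 1_i64 and 1.0_f64 both as "1", which unifies the forms.
        Value::Int(n) => out.push_str(&n.to_string()),
        Value::Float(n) => out.push_str(&n.to_string()),
        Value::Text(s) => write_text(s, out),
        Value::Object(fields) => {
            // BTreeMap iteration is sorted by key, removing order differences.
            let sorted: BTreeMap<&str, &Value> =
                fields.iter().map(|(k, v)| (k.as_str(), v)).collect();
            out.push('{');
            for (index, (key, field)) in sorted.into_iter().enumerate() {
                if index > 0 {
                    out.push(',');
                }
                write_text(key, out);
                out.push(':');
                write_canonical(field, out);
            }
            out.push('}');
        }
    }
}

fn canonical_string(value: &Value) -> String {
    let mut out = String::new();
    write_canonical(value, &mut out);
    out
}

/// 64-bit FNV-1a over the canonical text (stand-in hash for the sketch).
fn payload_hash(value: &Value) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in canonical_string(value).as_bytes() {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}
```

A production implementation would also need to handle arrays, control-character escaping, and non-finite floats, which this sketch deliberately leaves out.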
Operations
Generated build output belongs under .build/ and must not be committed.
Generated validation, packaging, publication, or release-candidate artifacts
belong under .deployment/artifacts/ and must not be committed unless the
repository-specific release process explicitly publishes sanitized outputs.
The CI workflow runs automatically only for pushes to main and can also be
started with workflow_dispatch. It uses changed-file routing so docs-only
changes do not run Rust build or test stages.
Planning
Use planning/ for active plans, workstream status, release checklists, and
implementation checkpoints when the repository owns planning state.
Planning files must stay checklist-backed and evidence-backed. Close plans only after the implementation, validation, review, and release or PR evidence are recorded.
Completed enterprise RAG quality work is tracked in
planning/completed/2026-05/202605210848-plan-memory-enterprise-rag-quality-pipeline.md.
That plan covers Hybrid Search V2, reranking, contextual compression,
groundedness primitives, enterprise filters, CAG key scoping, and RAG quality
observability while keeping orchestration in nettoolskit-copilot. Hybrid
lexical ranking is BM25-style deterministic scoring and fusion; full
storage-backed search engines and conditional model decisions remain caller
architecture choices.
Governance and Security
Use semantic branches such as feat/..., fix/..., docs/...,
refactor/..., test/..., chore/..., ci/..., and release/....
Do not publish directly to main unless the repository owner explicitly asks
for that operation. Prefer PRs for all repository changes.
Do not commit secrets, tokens, usernames, database hostnames, connection
strings, machine-local paths, raw logs, local SQLite stores, vector indexes,
graph stores, .env files, or unsanitized release artifacts. Shared PostgreSQL
examples must reference environment variable names only. Release assets must be
reviewed for metadata leaks before publication.
Build and Tests
Use targeted local validation for crate changes.
cargo metadata --locked --format-version 1 --no-deps
cargo fmt --all --check
cargo clippy --locked --all-targets
cargo test --locked --no-fail-fast
cargo test --locked --features shared-postgresql --all-targets --no-fail-fast
git diff --check
Contributing
Work from a semantic branch and keep changes scoped to this repository's
responsibility. Update CHANGELOG.md for user-visible, contract-level,
release, CI/CD, governance, or dependency changes.
Keep generated outputs under .build/ and local artifacts under
.deployment/artifacts/.
Dependencies
- Runtime: none in the foundation PR.
- Development: GitHub template baseline.
License
This project is licensed under the MIT License. See the LICENSE file at the repository root for details.
npm and npx
This repository exposes the ntk-memory command through the @nettoolskit/memory npm package for local installation and npx execution.
npm install -g @nettoolskit/memory
ntk-memory --help
npx @nettoolskit/memory --help
The npm wrapper executes a native ntk-memory binary from npm/native/ when release packaging stages one. For development or private release validation, set NTK_MEMORY_BINARY to an already built local binary. Use npm run stage:native -- <path-to-binary> before packaging a native tarball.
Package publication
Current preview package: @nettoolskit/memory@0.0.1-preview.2. Use npx @nettoolskit/memory for ephemeral execution or ntk-memory after global installation. GitRiver stage river/publish runs package dry-run validation on pull requests and publishes only from main or v* source refs through scripts/ci/river/npm-publish.sh.