
@nettoolskit/memory

Licence: MIT
Version: 0.0.1-preview.2
Dependencies: 0
Size: 7.0 MB
Vulnerabilities: 0
Weekly downloads: 101
Install scripts: this package runs scripts during installation (preinstall/install/postinstall).

NetToolsKit Memory

Knowledge ingestion, local memory, RAG, HyDE, retrieval, context packages, and vault materialization for NetToolsKit.


Introduction

nettoolskit-memory is the dedicated NetToolsKit repository for memory and knowledge workflows that are currently being extracted from the historical nettoolskit-copilot umbrella workspace.

This repository owns reusable memory capabilities and integrates with the rest of the NetToolsKit ecosystem through explicit contracts, manifests, APIs, and generated artifacts. It does not own agent instruction policy, machine control, code generation, DevOps deployment, assurance testing, or product orchestration.


Features

  • Knowledge record ingestion and normalized document contracts.
  • Chunking, deterministic embedding contracts, injectable local embedding provider boundary, provider-scoped vector retrieval, graph expansion, and bounded context package assembly.
  • Memory-owned LocalFirst SQLite vector functions for deterministic semantic query readiness without a native extension dependency in the default build.
  • Typed HyDE generator contract with deterministic default generation, stable generator id/version provenance, orchestrator-supplied hypothesis intake, cache-key isolation, and local-first query expansion.
  • Request-bounded reusable Memory OS recall for validated or published semantic/procedural records inside context packages and query answers.
  • Repository-local operational memory for sessions, events, artifacts, and replayable recall.
  • Typed local operational memory APIs for sessions, runs, tool calls, artifacts, validations, route decisions, and selected memory-record pruning.
  • Durable CAG/cache contracts, SQLite persistence, and automatic LocalFirst reuse for retrieval results, context packages, rerank results, and groundedness-linked verified answers across process boundaries.
  • Optional feature-gated shared PostgreSQL profile with pgvector-backed vector search, PostgreSQL-backed graph expansion, persistent CAG/cache reuse, reusable Memory OS records, and operational-memory persistence for enterprise deployments.
  • Vault materialization for grouped Markdown knowledge records and generated vault manifests.
  • Explicit memory-engine facade for downstream command adapters and product runtimes.
  • Contract-aligned nettoolskit.manifest.json for service discovery by NetToolsKit orchestrators and control surfaces.
  • Local ntk-memory CLI commands for manifest rendering, workspace materialization, vault materialization, and bounded knowledge queries.
  • Optional caller-gated verified answer cache reuse for ntk-memory query request files that provide a verification token and pass deterministic groundedness validation.
  • Dedicated ntk-memory context-package command for RAG/CAG/HyDE context assembly without answer text.
  • Hybrid Search V2 contracts for deterministic BM25-style lexical scoring, normalized vector plus lexical fusion, field-match explainability, and bounded context-package candidate windows.
  • Deterministic reranker contracts with score-component explainability, request-activated execution, and typed rerank-result cache keys.
  • Contextual compression contracts that retain source-backed spans, citations, source text hashes, matched query terms, and compression metrics without generating answer text.
  • Deterministic groundedness validation for caller-provided answer text against retained context-package evidence, with supported terms, unsupported terms, evidence matches, and machine-readable reason codes.
  • Content-free RAG quality observability contracts for stage latency, counts, cache hit/miss, scores, compression ratio, groundedness score, and bounded reason codes without raw prompts, answers, snippets, or evidence text.
  • Sanitized DET context evidence export contract, golden example, fixture, and runbook for local effectiveness evidence, including DET use_det, hold, and blocked package decisions plus sanitized metrics host/provider binding without raw prompts, secrets, provider payloads, local paths, or unredacted user content.
  • Local ntk-memory CLI commands for operational memory write, read, and prune operations.
  • Profile-aware ntk-memory memory-record-* CLI commands for reusable Memory OS record write, read, promotion, and prune operations.
  • Deterministic ntk-memory doctor readiness checks for LocalFirst and shared PostgreSQL stores, persistent CAG/cache table and indexes, round-trip cache writes, query lanes, and enterprise memory-lane selection.
  • Experimental Rust crate implementation extracted from the historical nettoolskit-copilot knowledge foundation.


Architecture

```mermaid
flowchart TD
    agent[nettoolskit-agent]
    memory[nettoolskit-memory]
    copilot[nettoolskit-copilot]
    control[nettoolskit-control]
    rust[nettoolskit-rust]
    docs[docs/knowledge-base]
    store[.build/knowledge/store]
    local[.temp/context-memory]
    vault[.deployment/artifacts/knowledge-vault]

    agent -->|policy and validation rules| copilot
    copilot -->|runtime decisions| memory
    copilot --> control
    memory --> rust
    memory --> docs
    memory --> store
    memory --> local
    memory --> vault
```

nettoolskit-memory is the memory execution boundary. It consumes policy and contracts from other repositories but keeps ingestion, retrieval, persistence, and context-package behavior behind memory-owned APIs.

The current Rust implementation is intentionally extracted without deleting the historical nettoolskit-copilot sources. A later compatibility-adapter PR will decide when nettoolskit-copilot starts consuming this repository directly.


Responsibility Boundary

| Repository | Owns |
| --- | --- |
| nettoolskit-agent | Instructions, prompts, governance, validation, model-routing policy, context-economy policy, and memory/RAG/HyDE usage rules. |
| nettoolskit-memory | Knowledge ingestion, chunking, embeddings, retrieval, graph expansion, HyDE, context packages, local memory, and vault materialization. |
| nettoolskit-copilot | Runtime orchestration, prompt middleware, model selection, approval coordination, and product workflow composition. |
| nettoolskit-control | Machine workers, leases, command execution, approvals, and evidence collection. |
| nettoolskit-codegen | Code generation, scaffolding, refactor planning, and architecture validation. |
| nettoolskit-assurance | Security, performance, benchmark, OpenAPI, frontend, and quality gate validation. |
| nettoolskit-devops | Docker, images, deployment, backup, observability, proxy, and operations. |
| nettoolskit-rust | Generic Rust core, contracts, CLI, observability, adapters, validation, and specification foundations. |

Source Extraction Map

Initial source material comes from nettoolskit-copilot:

| Current source | Target ownership |
| --- | --- |
| crates/knowledge | Move to nettoolskit-memory as the primary knowledge and retrieval crate. |
| crates/memory/engine | Replace with nettoolskit_memory::engine as the memory facade. |
| crates/control/runtime/src/continuity/local_context.rs | Split memory-owned persistence/retrieval from runtime command orchestration. |
| crates/control/runtime/src/maintenance/prune_local_memory.rs | Move local memory retention logic to nettoolskit-memory; keep runtime command wiring in the caller. |
| docs/knowledge-base/** | Keep durable authored knowledge records versioned in the owning product repository; memory provides validation and processing APIs. |

Do not delete or deprecate the existing nettoolskit-copilot sources until the new memory crate has mirrored tests, a compatibility adapter, and a confirmed consumer integration path.


Storage Model

Generated stores are rebuildable runtime artifacts and must not be committed.

| Store | Purpose |
| --- | --- |
| .build/knowledge/store/knowledge.db | Rebuildable knowledge store for ingestion, vector retrieval, durable graph state, persistent CAG/cache entries, and materialized workspace state. |
| Shared PostgreSQL schema | External shared store selected only through SharedPostgresqlSettings and a connection-string environment variable. |
| .temp/context-memory/context.db | Rebuildable local operational memory for sessions, events, artifacts, and recall. |
| .deployment/artifacts/knowledge-vault/** | Generated vault and validation artifacts for review or publication. |
| docs/knowledge-base/** | Durable authored knowledge records owned by the product or domain repository. |

The default Rust crate writes local operational-memory, graph, and persistent CAG/cache API data to the repository-local SQLite schema under .build/knowledge/store/knowledge.db. The optional shared-postgresql Cargo feature adds SharedPostgresqlSettings, PostgresVectorRepository, PostgresGraphRepository, PostgresPersistentCacheBackend, PostgresMemoryRecordBackend, and PostgresOperationalMemoryBackend for explicit shared deployments. Shared PostgreSQL configuration names the environment variable that contains the connection string; the value itself must never be committed, logged, or emitted in artifacts. Shared operational memory for sessions, runs, tool calls, artifacts, validations, and route decisions uses the same persistence_profile/shared_postgresql request shape in the contract pack. Legacy .temp/context-memory/context.db remains listed as extraction source material until downstream runtime adapters move to the memory-owned API.


Persistent CAG Cache API

The persistent cache API is the durable boundary for reusable CAG state. It stores typed cache entries as JSON payloads with deterministic keys, stable payload hashes, creation timestamps, optional expiration timestamps, and schema versions.

The API exposes:

| Contract | Purpose |
| --- | --- |
| PersistentCacheEntry | Validated durable cache row for retrieval results, context packages, or verified idempotent results. |
| PersistentCachePolicy | Materialized-workspace policy that enables/disables durable cache kinds and sets per-kind TTLs. |
| PersistentCacheReadRequest | Reads one cache entry and treats expired rows as misses unless include_expired is explicit. |
| PersistentCachePruneRequest | Selects entries by kind, expiration, schema version, or payload hash and supports dry-run mode. |
| persist_persistent_cache_entry | Upserts one validated cache entry into the repository-local SQLite store. |
| read_persistent_cache_entry | Reads one cache entry through the TTL-aware typed read path. |
| prune_persistent_cache_entries | Deletes or previews selected cache rows without exposing a shell-like deletion surface. |
| probe_persistent_cache_readiness | Validates the persistent-cache table, required indexes, TTL-aware reads, and doctor-owned write/read/prune round-trip. |

This API does not decide which model, prompt, or orchestration lane should use a cached value. That decision remains in nettoolskit-copilot; this repository owns the deterministic storage and validation surface. Within nettoolskit-memory, materialized workspaces automatically read and write durable cache entries when the selected repository resolver provides a PersistentCacheBackend. LocalFirst attaches the SQLite backend by default; the feature-gated shared PostgreSQL profile attaches PostgresPersistentCacheBackend when explicit safe settings are provided. The same resolver selects the reusable Memory OS backend, so materialization can publish and recall reusable memory records through SQLite or shared PostgreSQL without changing caller code. Both profiles use a workspace content fingerprint to avoid stale reuse after source changes.

Automatic LocalFirst cache writes use kind-specific TTLs by default:

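| Cache kind | Default TTL |
| --- | --- |
| query_embedding | 21,600 seconds (6 hours) |
| retrieval_result | 21,600 seconds (6 hours) |
| rerank_result | 21,600 seconds (6 hours) |
| context_package | 86,400 seconds (24 hours) |
| verified_idempotent_result | 604,800 seconds (7 days) |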

PersistentCachePolicy can disable individual durable cache kinds or set a custom positive TTL. A ttl_seconds value of null keeps that kind enabled without an automatic expiration timestamp. Manual ntk-memory cache-write requests remain explicit PersistentCacheEntry payloads and can still provide their own expires_at_unix_seconds per entry.
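The TTL rules above can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not the crate's implementation: `KindPolicy`, `expires_at`, and `is_readable` are hypothetical names, and treating a row as expired exactly at its deadline is an assumed boundary choice. Only the default TTL values come from the table.

```rust
use std::collections::HashMap;

/// Default TTLs in seconds, taken from the table above.
fn default_ttl_seconds(kind: &str) -> Option<u64> {
    match kind {
        "query_embedding" | "retrieval_result" | "rerank_result" => Some(21_600),
        "context_package" => Some(86_400),
        "verified_idempotent_result" => Some(604_800),
        _ => None,
    }
}

/// Hypothetical per-kind override; not the crate's PersistentCachePolicy type.
#[derive(Clone, Copy)]
enum KindPolicy {
    Disabled,
    /// `None` models `ttl_seconds: null`: enabled, no automatic expiration.
    Enabled { ttl_seconds: Option<u64> },
}

/// Outer `None`: do not write. Inner `None`: write without an expiration timestamp.
fn expires_at(kind: &str, overrides: &HashMap<&str, KindPolicy>, now: u64) -> Option<Option<u64>> {
    match overrides.get(kind) {
        Some(KindPolicy::Disabled) => None,
        Some(KindPolicy::Enabled { ttl_seconds }) => Some(ttl_seconds.map(|ttl| now + ttl)),
        None => default_ttl_seconds(kind).map(|ttl| Some(now + ttl)),
    }
}

/// TTL-aware read: an expired row is a miss unless `include_expired` is set.
fn is_readable(expires_at: Option<u64>, now: u64, include_expired: bool) -> bool {
    include_expired || expires_at.map_or(true, |deadline| now < deadline)
}
```

With no overrides, a `retrieval_result` written at Unix time 1,000 would expire at 22,600 under this sketch.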

Retrieval and context-package cache keys include an explicit enterprise filter-scope fingerprint. Tenant, access role, document version, domain, publication date windows, content root, source path, language, document kind, and lexical shortlist filters therefore cannot reuse cache entries across different retrieval scopes. Context-package cache keys also include the HyDE generator id and version when HyDE mode is active, the contextual compression profile scope when compression is enabled, and the rerank profile plus result budget when reranking is enabled, so generator, compression, or rerank changes do not reuse stale packages from a different evidence shape.

Query embedding cache keys are narrower by design. They include provider id, provider version, vector dimension, and a query text hash, but never the raw query text. This allows the same query vector to be reused across different tenant/filter/budget scopes while retrieval-result cache entries remain scoped to the full request.
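The difference between the narrow query-embedding key and the scoped retrieval key can be illustrated as follows. This is a sketch, not the crate's key format: `FilterScope`, both key functions, and the key layout are hypothetical, and `DefaultHasher` stands in for whatever stable hash the crate actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in fingerprint. `DefaultHasher` is not guaranteed stable across Rust
/// releases, so a durable cache would need a fixed hash function instead.
fn fingerprint<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical subset of the enterprise retrieval scope filters.
#[derive(Hash)]
struct FilterScope<'a> {
    tenant_id: Option<&'a str>,
    access_roles: Vec<&'a str>,
    domain: Option<&'a str>,
}

/// Narrow key: provider identity, dimension, and a query hash (never the raw text).
fn query_embedding_key(provider_id: &str, provider_version: &str, dimension: usize, query: &str) -> String {
    format!(
        "query_embedding:{provider_id}:{provider_version}:{dimension}:{:016x}",
        fingerprint(&query)
    )
}

/// Scoped key: the same query under a different filter scope yields a different key.
fn retrieval_result_key(query: &str, scope: &FilterScope) -> String {
    format!("retrieval_result:{:016x}:{:016x}", fingerprint(&query), fingerprint(scope))
}
```

Because the embedding key ignores the filter scope, two tenants issuing the same query would share one embedding entry while keeping separate retrieval entries.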

Rerank-result cache keys include reranker id, reranker version, profile id, query text, result budget, and a fingerprint of the candidate window. The deterministic reranker runs when KnowledgeQueryRequestInput sets enable_rerank: true with a positive max_rerank_hits; selecting a model-assisted reranker or fallback policy remains the responsibility of the calling orchestrator.

Verified answer reuse is always caller-gated. The memory service stores and reuses verified idempotent answer text only when the query request includes a non-empty verification_token and deterministic groundedness validation accepts the answer against the retained context package. Verified answer cache keys also include answer_generation_scope: the safe provider/model/model version surface that produced the answer candidate. This prevents verified answers generated by different providers, models, or model versions from reusing the same cache row. The durable payload is a VerifiedAnswerCacheEntry with answer hash, context-package hash, verification-token fingerprint, answer generation scope, and the accepted groundedness result.

Provider answers produced outside memory must be persisted through ntk-memory verified-answer-cache-write, which rebuilds the query context and computes Memory-owned cache keys and hashes instead of asking callers to submit low-level PersistentCacheEntry rows. The command output contains only cache metadata and hashes; it does not echo answer text or verification tokens. Deciding when to call providers, retry, degrade, or expose cached answers remains in the calling orchestrator.
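The two gates and the generation-scope component of the key might look roughly like this. The function name, the key layout, and the choice to treat a whitespace-only token as empty are assumptions made for illustration.

```rust
/// Returns the cache key to look up, or `None` when reuse is not permitted.
/// Hypothetical helper; the real key derivation is owned by the memory crate.
fn verified_answer_lookup_key(
    query_hash: &str,
    verification_token: Option<&str>,
    groundedness_accepted: bool,
    scope: (&str, &str, &str), // (provider, model, model version): stable, non-secret ids
) -> Option<String> {
    // Gate 1: the caller must supply a non-empty verification token.
    let token_present = verification_token.map_or(false, |t| !t.trim().is_empty());
    // Gate 2: deterministic groundedness validation must have accepted the answer.
    if !(token_present && groundedness_accepted) {
        return None;
    }
    let (provider, model, model_version) = scope;
    // The token itself never becomes part of the key.
    Some(format!(
        "verified_idempotent_result:{query_hash}:{provider}:{model}:{model_version}"
    ))
}
```

Changing any element of the scope tuple produces a different key, which is what keeps answers from different providers or model versions apart.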

Provider and model identifiers in answer_generation_scope are cache scope tokens, not configuration channels. They must be stable, non-secret identifiers and must not contain credentials, bearer tokens, prompts, answer text, transcripts, request headers, customer data, local filesystem paths, database URLs, or other environment-specific secrets.

ntk-memory doctor uses a controlled persistent-cache probe to verify the table, indexes, TTL miss behavior, and write/read/prune behavior. The probe removes its doctor-owned row before reporting success and does not prune runtime cache rows.


Local Operational Memory API

The local operational memory API is adapter-facing and deterministic. It writes sessions, runs, tool calls, artifacts, validations, and route decisions through OperationalMemoryWriteBatch. Batch writes are atomic: if one child row fails, the full write is rolled back.
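The all-or-nothing behavior can be shown with an in-memory stand-in. `Row`, `Store`, and the empty-payload rule are hypothetical; the real API writes through OperationalMemoryWriteBatch and, presumably, a database transaction.

```rust
/// Hypothetical row type; the real batch carries sessions, runs, tool calls, and more.
#[derive(Clone)]
struct Row {
    id: String,
    payload: String,
}

/// In-memory store used only to illustrate all-or-nothing semantics.
#[derive(Default)]
struct Store {
    rows: Vec<Row>,
}

impl Store {
    /// Validates every row against a staged copy and swaps it in only if all pass.
    fn write_batch(&mut self, batch: &[Row]) -> Result<usize, String> {
        let mut staged = self.rows.clone();
        for row in batch {
            if row.payload.is_empty() {
                // Returning early drops `staged`, so `self.rows` is untouched.
                return Err(format!("row {} has an empty payload", row.id));
            }
            staged.push(row.clone());
        }
        self.rows = staged; // commit point: reached only when no row failed
        Ok(batch.len())
    }
}
```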

OperationalMemoryReadRequest applies session and run filters as an intersection. Session filters constrain matching runs and child rows; run filters constrain parent sessions and child rows. Route decisions are filtered independently by route_kind.

Operational payloads must be sanitized before persistence. The crate rejects null payloads, artifact paths outside repository-relative form, common local path markers, and secret-like token markers. Runtime adapters must summarize large logs before calling this API.
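A fail-closed hygiene check of this kind could be sketched as below. The marker lists and reason strings are invented for illustration; the crate's actual markers are not documented here.

```rust
/// Illustrative markers only; assumptions, not the crate's real lists.
const LOCAL_PATH_MARKERS: [&str; 3] = ["C:\\", "/home/", "/Users/"];
const SECRET_MARKERS: [&str; 3] = ["Bearer ", "password=", "api_key="];

/// Rejects null payloads, local path markers, and secret-like markers.
fn check_payload(payload: Option<&str>) -> Result<(), &'static str> {
    let text = payload.ok_or("null_payload")?;
    if LOCAL_PATH_MARKERS.iter().any(|m| text.contains(*m)) {
        return Err("local_path_marker");
    }
    if SECRET_MARKERS.iter().any(|m| text.contains(*m)) {
        return Err("secret_like_marker");
    }
    Ok(())
}

/// Accepts only repository-relative artifact paths: no root, no drive, no `..`.
fn check_artifact_path(path: &str) -> Result<(), &'static str> {
    let absolute = path.starts_with('/') || path.starts_with('\\') || path.contains(':');
    let escapes = path
        .split(|c: char| c == '/' || c == '\\')
        .any(|part| part == "..");
    if absolute || escapes {
        Err("artifact_path_not_repo_relative")
    } else {
        Ok(())
    }
}
```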

MemoryRecordPruneRequest requires at least one explicit selection filter and supports dry-run mode. Non-dry-run prune deletes selected memory records and their source-memory promotion rows in one transaction.
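The "at least one filter" rule and dry-run mode can be sketched as follows. The `PruneRequest` fields are hypothetical and do not mirror the real MemoryRecordPruneRequest; the sketch also omits the promotion rows that the real prune deletes in the same transaction.

```rust
/// Hypothetical filter fields; the real request shape may differ.
#[derive(Default)]
struct PruneRequest {
    kind: Option<String>,
    older_than_unix_seconds: Option<u64>,
    dry_run: bool,
}

struct Record {
    kind: String,
    created_at: u64,
}

/// Returns how many records match; deletes them only when `dry_run` is false.
fn prune(records: &mut Vec<Record>, request: &PruneRequest) -> Result<usize, &'static str> {
    // An unfiltered prune would delete everything, so it is rejected up front.
    if request.kind.is_none() && request.older_than_unix_seconds.is_none() {
        return Err("at_least_one_filter_required");
    }
    let selected = |r: &Record| {
        request.kind.as_ref().map_or(true, |k| &r.kind == k)
            && request.older_than_unix_seconds.map_or(true, |t| r.created_at < t)
    };
    let count = records.iter().filter(|r| selected(*r)).count();
    if !request.dry_run {
        records.retain(|r| !selected(r));
    }
    Ok(count)
}
```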


Control Plane Model

nettoolskit-memory exposes deterministic memory capabilities through typed contracts and command-ready APIs. It does not decide which model to use, when to call an LLM, which worker executes a task, or whether an approval gate passes.

Runtime orchestration belongs to nettoolskit-copilot. Machine execution belongs to nettoolskit-control. Generic CLI building blocks belong to nettoolskit-rust.


Rust Crate

This repository uses the product-repository single-crate layout.

| Path | Package | Responsibility |
| --- | --- | --- |
| src/** | nettoolskit-memory | Knowledge ingestion, retrieval, context packages, local memory, and vault materialization. |
| src/engine/** | nettoolskit-memory | Grouped public facade for downstream adapters and product runtimes. |
| src/local/** | nettoolskit-memory | Local operational memory contracts for sessions, runs, artifacts, validations, route decisions, and prune requests. |
| tests/** | nettoolskit-memory | Mirrored integration tests for the source modules. |

Cargo build output is configured under .build/target through .cargo/config.toml.

The crate exposes the typed HyDE generator boundary through HydeGenerator, DeterministicHydeGenerator, SuppliedHydeGenerator, HydeHypothesis, and SuppliedHydeHypothesis, with generator id/version provenance retained in context package output. Supplied HyDE is a data-only contract for external orchestrators such as nettoolskit-copilot; memory validates the query match and provenance fields, then uses the hypothesis for retrieval and cache scope. The cache scope preserves the external generator id/version and adds a hypothesis-text fingerprint so different supplied hypotheses cannot reuse the same context package accidentally. Prompt templates, model routing, credentials, retries, and orchestration policy remain outside this repository.

The default local orchestrator uses HashBucketEmbeddingProvider for deterministic development and test runs. Downstream Rust consumers can inject a custom local EmbeddingProvider through LocalKnowledgeOrchestrator while keeping provider id/version provenance in embeddings, retrieval explanations, and cache keys. This crate does not implement network embedding providers or own provider secrets.
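To make the idea of a deterministic, injectable provider concrete, here is a toy hash-bucket embedder. Everything in it is an assumption: the trait shape is not the crate's real EmbeddingProvider signature, and the bucketing scheme is a generic hashing-trick illustration rather than what HashBucketEmbeddingProvider actually does.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical trait shape; the crate's real EmbeddingProvider may differ.
trait EmbeddingProvider {
    fn id(&self) -> &str;
    fn version(&self) -> &str;
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Toy embedder: each token increments one bucket, then the vector is L2-normalized.
/// `dimension` must be greater than zero.
struct ToyHashBucketProvider {
    dimension: usize,
}

impl EmbeddingProvider for ToyHashBucketProvider {
    fn id(&self) -> &str {
        "toy-hash-bucket"
    }
    fn version(&self) -> &str {
        "0"
    }
    fn embed(&self, text: &str) -> Vec<f32> {
        let mut buckets = vec![0.0f32; self.dimension];
        for token in text.split_whitespace() {
            let mut hasher = DefaultHasher::new();
            token.to_lowercase().hash(&mut hasher);
            buckets[(hasher.finish() % self.dimension as u64) as usize] += 1.0;
        }
        let norm = buckets.iter().map(|v| v * v).sum::<f32>().sqrt();
        if norm > 0.0 {
            buckets.iter_mut().for_each(|v| *v /= norm);
        }
        buckets
    }
}
```

The `id` and `version` accessors are what would let provenance flow into embeddings, retrieval explanations, and cache keys.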


Local CLI

The repository builds a local-process binary named ntk-memory. It is the preferred adapter boundary for product orchestrators that should call memory capabilities without linking to memory internals.

Supported commands:

| Command | Purpose |
| --- | --- |
| `ntk-memory manifest [--output <path>]` | Render the service manifest as deterministic JSON. |
| `ntk-memory materialize --request <path> [--output <path>]` | Ingest and index a bounded workspace through the request-selected memory persistence profile. |
| `ntk-memory query --request <path> [--output <path>]` | Run vector retrieval, graph expansion, HyDE, reusable memory recall, context package assembly, profile-backed persistence, and optional caller-gated verified answer reuse. |
| `ntk-memory context-package --request <path> [--output <path>]` | Assemble a bounded context package with retrieval evidence, reusable memory recall, and profile-backed cache metadata without answer text. |
| `ntk-memory groundedness-validate --request <path> [--output <path>]` | Validate caller-provided answer text against retained context-package evidence without model routing or answer generation. |
| `ntk-memory verified-answer-cache-write --request <path> [--output <path>]` | Rebuild the query context, validate a provider answer, and persist verified answer cache state with Memory-owned keys and hashes. |
| `ntk-memory vault-materialize --request <path> [--output <path>]` | Materialize grouped Markdown knowledge vault records and a vault manifest. |
| `ntk-memory doctor --repo-root <path> [--request <path>] [--output <path>]` | Report LocalFirst or request-selected shared PostgreSQL runtime readiness for enterprise RAG/CAG/HyDE/context/verified-cache/operational-memory lanes. |
| `ntk-memory cache-write --repo-root <path> --request <path> [--output <path>]` | Persist one durable CAG/cache entry. |
| `ntk-memory cache-read --repo-root <path> --request <path> [--output <path>]` | Read one durable CAG/cache entry with TTL-aware miss semantics. |
| `ntk-memory cache-prune --repo-root <path> --request <path> [--output <path>]` | Prune selected durable CAG/cache entries with explicit filters and dry-run support. |
| `ntk-memory local-write --repo-root <path> --request <path> [--output <path>]` | Persist a repository-local operational memory write batch. |
| `ntk-memory local-read --repo-root <path> --request <path> [--output <path>]` | Read repository-local operational memory rows with deterministic filters. |
| `scripts/operations/write-det-local-first-use-memory.ps1` | Write .deployment/artifacts/memory/det-local-first-use-write.json for DET runtime-wrapper readiness. |
| `ntk-memory local-prune --repo-root <path> --request <path> [--output <path>]` | Prune selected Memory OS records with explicit filters and dry-run support. |
| `ntk-memory memory-record-write --repo-root <path> --request <path> [--output <path>]` | Persist validated reusable Memory OS records through the request-selected profile. |
| `ntk-memory memory-record-read --repo-root <path> --request <path> [--output <path>]` | Read reusable Memory OS records and optional promotion metadata through the request-selected profile. |
| `ntk-memory memory-record-promote --repo-root <path> --request <path> [--output <path>]` | Persist an explicit Memory OS promotion and its target record through the request-selected profile. |
| `ntk-memory memory-record-prune --repo-root <path> --request <path> [--output <path>]` | Prune selected Memory OS records through the public profile-aware record adapter surface. |

Request files use the Rust API payloads KnowledgeWorkspaceRequestInput, KnowledgeQueryRequestInput, and KnowledgeVaultBuildRequestInput. The query command returns answer text plus the context package; context-package returns only retrieval package data and cache metadata. vault-materialize uses output_root from the request JSON as the Markdown vault destination; the CLI --output flag is only the machine-readable JSON report destination. Outputs are machine-readable JSON. When --output is omitted, the command writes JSON to stdout.

Workspace-backed commands default to local_first. Callers can select shared_postgresql by adding persistence_profile: "shared_postgresql" and a secret-safe shared_postgresql settings object to the workspace request. The settings object names the environment variable containing the connection string and never carries the secret value itself. If the optional shared-postgresql Cargo feature or required environment variable is missing, the command fails closed with a sanitized profile/readiness error and does not bootstrap the repository-local SQLite store.
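The fail-closed profile selection described above can be sketched like this. The `connection_string_env_var` field name, the error codes, and the `lookup` indirection are assumptions; in real use `lookup` would likely be `|name| std::env::var(name).ok()`.

```rust
/// Hypothetical mirror of the secret-safe settings: it names the env var, never the value.
struct SharedPostgresqlSettings {
    connection_string_env_var: String,
}

#[derive(Debug, PartialEq)]
enum ResolvedProfile {
    LocalFirst,
    /// The connection string is resolved but intentionally not stored here.
    SharedPostgresql,
}

/// `lookup` abstracts environment access so the sketch runs without real variables.
fn resolve_profile(
    persistence_profile: Option<&str>,
    settings: Option<&SharedPostgresqlSettings>,
    feature_enabled: bool,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<ResolvedProfile, &'static str> {
    match persistence_profile {
        None | Some("local_first") => Ok(ResolvedProfile::LocalFirst),
        Some("shared_postgresql") => {
            if !feature_enabled {
                return Err("shared_postgresql_feature_missing");
            }
            let settings = settings.ok_or("shared_postgresql_settings_missing")?;
            match lookup(&settings.connection_string_env_var) {
                Some(value) if !value.is_empty() => Ok(ResolvedProfile::SharedPostgresql),
                // Fail closed: no silent fallback to the local SQLite store.
                _ => Err("shared_postgresql_env_var_missing"),
            }
        }
        Some(_) => Err("unknown_persistence_profile"),
    }
}
```

Note that every error is a fixed code: the sketch never formats the variable's value into a message, in line with the sanitized-error requirement.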

KnowledgeQueryRequestInput can opt into reusable Memory OS recall with enable_reusable_memory_recall: true and a positive max_reusable_memory_hits budget. When enabled, local-first materialization loads validated or published semantic and procedural records from memory_items; recalled records appear as memory_record snippets, evidence, and the optional reusable_memory_recall output block. The cache key includes a reusable-memory scope fingerprint so changed memory records cannot reuse a stale context package.

KnowledgeQueryRequestInput can also apply enterprise retrieval scope filters: tenant_id, access_roles, document_version, domain, min_published_at_unix_seconds, and max_published_at_unix_seconds. These filters are optional for compatibility, but when a caller supplies them memory fails closed: a chunk without matching metadata is excluded from lexical prefiltering, vector retrieval, context packages, and cache reuse.
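Fail-closed scope filtering might be expressed as below. `ChunkMeta` and `ScopeFilter` are hypothetical types covering only a subset of the listed filters, and matching roles by any overlap is an assumed semantics.

```rust
/// Hypothetical chunk metadata; `None` means the chunk carries no value for that field.
struct ChunkMeta<'a> {
    tenant_id: Option<&'a str>,
    access_roles: &'a [&'a str],
    published_at: Option<u64>,
}

/// Hypothetical subset of the request filters; unset fields impose no constraint.
#[derive(Default)]
struct ScopeFilter<'a> {
    tenant_id: Option<&'a str>,
    access_roles: Vec<&'a str>,
    min_published_at_unix_seconds: Option<u64>,
    max_published_at_unix_seconds: Option<u64>,
}

/// A supplied filter must be matched by present metadata; missing metadata excludes the chunk.
fn in_scope(chunk: &ChunkMeta, filter: &ScopeFilter) -> bool {
    let tenant_ok = filter.tenant_id.map_or(true, |t| chunk.tenant_id == Some(t));
    let roles_ok = filter.access_roles.is_empty()
        || chunk.access_roles.iter().any(|r| filter.access_roles.contains(r));
    let dated = filter.min_published_at_unix_seconds.is_some()
        || filter.max_published_at_unix_seconds.is_some();
    let date_ok = !dated
        || chunk.published_at.map_or(false, |p| {
            filter.min_published_at_unix_seconds.map_or(true, |min| p >= min)
                && filter.max_published_at_unix_seconds.map_or(true, |max| p <= max)
        });
    tenant_ok && roles_ok && date_ok
}
```

A chunk with no tenant metadata passes an empty filter but is excluded as soon as the caller supplies `tenant_id`.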

When enable_lexical_prefilter is true with a positive max_lexical_hits, memory scores the lexical corpus with the LexicalScoringProfile contract and applies the same enterprise filters before scoring. Context-package assembly then uses HybridFusionProfile to normalize and fuse vector and lexical scores, emitting vector_score, lexical_score, fusion_score, ranking stages, matched lexical fields, and the fusion profile in retrieval explanations. Default behavior remains semantic-only unless the request enables lexical prefiltering.
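One common way to normalize and fuse two score lists is min-max scaling followed by a weighted sum. The sketch below uses that approach purely as an illustration; the actual HybridFusionProfile formula is not stated in this document, and `vector_weight` is a hypothetical parameter.

```rust
/// Min-max normalization to [0, 1]; a constant or empty list maps to zeros.
fn normalize(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().copied().fold(f32::INFINITY, f32::min);
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|s| if range > 0.0 { (s - min) / range } else { 0.0 })
        .collect()
}

#[derive(Debug)]
struct Fused {
    index: usize,
    vector_score: f32,
    lexical_score: f32,
    fusion_score: f32,
}

/// Weighted sum of normalized scores, sorted by descending fusion score.
/// Both slices are assumed to score the same candidates in the same order.
fn fuse(vector: &[f32], lexical: &[f32], vector_weight: f32) -> Vec<Fused> {
    let (v, l) = (normalize(vector), normalize(lexical));
    let mut fused: Vec<Fused> = v
        .iter()
        .zip(&l)
        .enumerate()
        .map(|(index, (&vs, &ls))| Fused {
            index,
            vector_score: vs,
            lexical_score: ls,
            fusion_score: vector_weight * vs + (1.0 - vector_weight) * ls,
        })
        .collect();
    fused.sort_by(|a, b| b.fusion_score.total_cmp(&a.fusion_score));
    fused
}
```

Keeping the per-component scores on each result is what makes the ranking explainable after fusion.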

Contextual compression is exposed as a memory-owned package assembly primitive. When a caller supplies context_compression_profile, memory replaces package snippets with retained source spans and emits compression.retained_spans plus compression.metrics. The retained text is copied from source snippets only; model routing, deciding whether compression is worth the latency/cost tradeoff, and answer generation remain caller responsibilities.

Deterministic reranking is request-activated. When enable_rerank is true, memory applies rerank_profile or the default deterministic profile after semantic or hybrid retrieval, caps output with max_rerank_hits, emits the optional context_package.rerank block, and updates retained snippets and evidence ordering before optional compression.

Groundedness validation is exposed as GroundednessValidationRequest, GroundednessValidationResult, and ntk-memory groundedness-validate. The validator compares caller-provided answer text against retained context-package snippets or compression spans, emits supported and unsupported terms plus reason codes, and never calls an LLM. Retry, degrade, or final-answer policy remains owned by the caller.
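A deterministic, model-free validator in this spirit can be approximated with plain term overlap. The tokenization rule, the score threshold, and the reason codes below are invented for illustration and are not the GroundednessValidationResult contract.

```rust
use std::collections::BTreeSet;

#[derive(Debug)]
struct GroundednessSketch {
    supported_terms: Vec<String>,
    unsupported_terms: Vec<String>,
    score: f32,
    reason_code: &'static str,
}

/// Lowercased alphanumeric terms longer than three characters (a crude stop-word proxy).
fn terms(text: &str) -> BTreeSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| t.chars().count() > 3)
        .map(str::to_lowercase)
        .collect()
}

/// Pure term overlap between the answer and retained evidence; no model is called.
fn validate(answer: &str, evidence: &[&str], min_score: f32) -> GroundednessSketch {
    let evidence_terms: BTreeSet<String> = evidence.iter().flat_map(|e| terms(e)).collect();
    let (supported, unsupported): (Vec<String>, Vec<String>) = terms(answer)
        .into_iter()
        .partition(|t| evidence_terms.contains(t));
    let total = supported.len() + unsupported.len();
    let score = if total == 0 { 0.0 } else { supported.len() as f32 / total as f32 };
    let reason_code = if total == 0 {
        "empty_answer"
    } else if score >= min_score {
        "grounded"
    } else {
        "unsupported_terms"
    };
    GroundednessSketch { supported_terms: supported, unsupported_terms: unsupported, score, reason_code }
}
```

The same inputs always produce the same result, which is the property that makes the outcome safe to link to a cache entry.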

RAG quality observability is exposed as RagQualityReport, RagQualityStageMetric, and the rag-quality-report contract. Reports are intentionally content-free: they use hashes for request, tenant, and query scope and only store stage status, latency, counts, cache hit/miss, normalized scores, compression ratio, token/cost counters, profile ids, and bounded reason codes. Prompt text, answer text, HyDE text, snippets, retained spans, evidence text, workstation paths, and secret values must stay out of these reports. query and context-package attach quality_report only when emit_quality_report is true.

DET context evidence export is represented by det-context-evidence-export. It is a pending contract and fixture for DET, Harness, and Analytics local effectiveness evidence derived from existing context-package, quality-report, groundedness, and cache metadata. The export is content-free by contract and uses only hashes, counts, booleans, bounded reason codes, safe identifiers, and normalized scores. It must not include prompts, query text, snippets, retained spans, provider payloads, answer text, secrets, local paths, or unredacted user content. The optional effectiveness.det_local_effectiveness_package object models DET's local effectiveness bridge with a use_det, hold, or blocked decision and sanitized metrics binding where the host is hashed and the metrics provider is represented only by safe identifiers.

For LocalFirst workspaces, query embeddings, retrieval results, context packages, rerank results, and verified answers can reuse durable cache rows from previous processes. Public response status fields currently expose cache_status.retrieval_result_cache_hit, cache_status.context_package_cache_hit, and cache_status.verified_idempotent_result_cache_hit, which may reflect durable hits from a previous process. Verified answer reuse still requires an explicit verification_token in the KnowledgeQueryRequestInput JSON plus a validated groundedness-linked cache payload, so unverified or unsupported query calls never reuse answer text across processes.

CLI query and context-package commands use the deterministic HyDE generator by default and include HyDE generator provenance in returned context metadata when HyDE mode is active. Request JSON can instead provide supplied_hyde_hypothesis with query_text, hypothesis_text, generator_id, generator_version, and provenance. Memory rejects blank fields, rejects query mismatches, and accepts supplied HyDE only for HyDE-assisted retrieval.
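The three rejection rules for supplied HyDE can be sketched as a small validator. The field names follow the request JSON listed above, but the struct, the string-typed `provenance`, the exact-match comparison, and the error codes are assumptions.

```rust
/// Sketch of the supplied hypothesis; `provenance` is modeled as a plain string here.
struct SuppliedHydeHypothesis<'a> {
    query_text: &'a str,
    hypothesis_text: &'a str,
    generator_id: &'a str,
    generator_version: &'a str,
    provenance: &'a str,
}

/// Rejects blank fields, query mismatches, and use outside HyDE-assisted retrieval.
fn validate_supplied_hyde(
    hypothesis: &SuppliedHydeHypothesis,
    request_query: &str,
    hyde_mode_active: bool,
) -> Result<(), &'static str> {
    let fields = [
        hypothesis.query_text,
        hypothesis.hypothesis_text,
        hypothesis.generator_id,
        hypothesis.generator_version,
        hypothesis.provenance,
    ];
    if fields.iter().any(|f| f.trim().is_empty()) {
        return Err("blank_field");
    }
    // The hypothesis must have been generated for this exact query.
    if hypothesis.query_text != request_query {
        return Err("query_mismatch");
    }
    if !hyde_mode_active {
        return Err("hyde_mode_required");
    }
    Ok(())
}
```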

Operational-memory commands keep payload hygiene fail-closed for local paths and secret-like values before any write reaches persistence. Their current LocalFirst request shape remains valid; the contract pack also publishes shared PostgreSQL examples for local-write and local-read using persistence_profile: "shared_postgresql" plus secret-safe shared_postgresql settings.

Persistent-cache command request files use the Rust API payloads PersistentCacheEntry, PersistentCacheReadRequest, and PersistentCachePruneRequest. cache-write returns the persisted entry identity. cache-read returns null when the entry does not exist, or when it is expired and include_expired is false.

Verified provider answer cache writes use VerifiedAnswerCacheWriteCommandRequest, which wraps the original KnowledgeQueryRequestInput, provider answer_text, and answer_generation_scope with provider/model/model version identifiers. The nested query must include verification_token; if it also includes query.answer_generation_scope, the two scopes must match. Memory rebuilds the context package, validates groundedness, writes the provider/model/model-version scoped verified cache row, and returns VerifiedAnswerCacheWriteResult with only cache metadata and hashes.

Operational command request files use the Rust API payloads OperationalMemoryWriteBatch, OperationalMemoryReadRequest, and MemoryRecordPruneRequest. The operational-memory schemas accept both the current LocalFirst shape and the profile-aware shared PostgreSQL shape so callers can prepare requests without changing command names.

Adapter Contract Pack

Versioned adapter contracts live under contracts. The pack publishes JSON schemas in contracts/schemas/v1 and sanitized golden examples in contracts/examples/v1 for query, context-package, supplied HyDE, doctor, RAG quality reports, DET context evidence exports, persistent cache, reusable memory records, local operational memory, and CLI error envelopes. nettoolskit.manifest.json references these files through its contracts array so orchestrators can discover request and response shapes without linking to memory internals.

Consumers should call ntk-memory doctor --repo-root <path> before expensive query paths, or ntk-memory doctor --repo-root <path> --request <path> when checking a request-selected shared PostgreSQL profile. They should then call ntk-memory query or ntk-memory context-package with request JSON that matches the published schema. Runtime outputs, stdout, stderr, and reports should be stored as harness evidence instead of reading the SQLite or PostgreSQL store directly.
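A consumer written in Rust could invoke doctor through `std::process::Command`. The sketch uses only the flags documented in the command table; the helper names are hypothetical, and it assumes `ntk-memory` is resolvable on `PATH`.

```rust
use std::path::Path;
use std::process::Command;

/// Builds `ntk-memory doctor --repo-root <path> [--request <path>]` without running it.
fn doctor_command(repo_root: &Path, request: Option<&Path>) -> Command {
    let mut command = Command::new("ntk-memory");
    command.arg("doctor").arg("--repo-root").arg(repo_root);
    if let Some(request) = request {
        command.arg("--request").arg(request);
    }
    command
}

/// Runs doctor and returns its stdout. `--output` is omitted, so the JSON
/// readiness report is written to stdout and can be stored as harness evidence.
fn run_doctor(repo_root: &Path, request: Option<&Path>) -> std::io::Result<String> {
    let output = doctor_command(repo_root, request).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Because shared PostgreSQL readiness failures are reported as degraded JSON with exit code zero, a caller would need to inspect the returned report rather than rely on the exit status alone.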

doctor is designed for orchestrators and operators. It requires an explicit repository root. Without a request file it initializes the real .build/knowledge/store/knowledge.db store, validates persistent-cache table/index/TTL/prune readiness, registers the Memory-owned SQLite vector function lane, and returns a stable LocalFirst readiness report. With a memory-doctor-request file that selects shared_postgresql, it validates only secret-safe settings, probes shared PostgreSQL pgvector, persistent-cache, and operational-memory readiness, and does not bootstrap the LocalFirst SQLite store. Shared PostgreSQL readiness failures are reported as degraded JSON with exit code zero so the caller can decide whether to fail, retry, or fall back before running an expensive query path.

Doctor Readiness JSON

The readiness report is intentionally repo-relative and safe for generated artifacts. It does not expose workstation absolute paths. The current schema includes:

| Field | Purpose |
| --- | --- |
| schemaVersion | Readiness report contract version. |
| service | Memory service name, kind, version, manifest schema, and binary name. |
| scope | Repo-relative scope and effective persistence profile. |
| store | Profile backend readiness. LocalFirst reports .build/knowledge/store; shared PostgreSQL reports the safe shared_postgresql backend token. |
| sqlite | SQLite engine version and LocalFirst vector semantic query capability. |
| sharedPostgresql | Optional shared PostgreSQL readiness metadata containing only env-var name, schema, application name, and readiness state. |
| capabilities | Capability readiness states for ingest, query, context package, RAG quality report, persistent cache, and local memory. |
| readiness | Enterprise lane matrix for rag, cag, hyde_assisted, context_package, verified_answer_cache, operational_memory, and quality_observability, including required and degraded check ids. |
| checks | Ordered deterministic checks with blocking or degrading severity, including persistent-cache table, index, TTL, and round-trip checks. |
| summary | Aggregated ready/degraded flags and failed check IDs. |

Readiness lane status uses a strict machine contract:

  • ready means every required check for that lane passed.
  • degraded means the lane can still run through a local-first fallback, but the caller should inspect degradedCheckIds before selecting it.
  • unavailable means the orchestrator must not select that lane.

requiredCheckIds lists the checks that control a lane. degradedCheckIds must be a subset of those required checks and must name only checks that failed in the same report.
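The lane contract above can be sketched as a small selection rule. This is an illustrative model, not the crate's types: the Lane struct, its field names, and the allow_degraded caller policy are assumptions, and the check ids in the usage are placeholders.

```rust
/// Lane status values named in the readiness contract.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LaneStatus {
    Ready,
    Degraded,
    Unavailable,
}

/// Hypothetical in-memory view of one readiness lane.
struct Lane {
    status: LaneStatus,
    required_check_ids: Vec<String>,
    degraded_check_ids: Vec<String>,
}

impl Lane {
    fn new(status: LaneStatus, required: &[&str], degraded: &[&str]) -> Self {
        Lane {
            status,
            required_check_ids: required.iter().map(|id| id.to_string()).collect(),
            degraded_check_ids: degraded.iter().map(|id| id.to_string()).collect(),
        }
    }
}

/// True when every degraded check id also appears among the required check ids.
fn degraded_is_subset(lane: &Lane) -> bool {
    lane.degraded_check_ids
        .iter()
        .all(|id| lane.required_check_ids.contains(id))
}

/// Decides whether an orchestrator may select the lane.
/// `allow_degraded` is a caller policy (an assumption of this sketch).
fn may_select(lane: &Lane, allow_degraded: bool) -> bool {
    if !degraded_is_subset(lane) {
        // The report violates the subset rule, so the sketch fails closed.
        return false;
    }
    match lane.status {
        // "ready" means every required check passed, so nothing may be degraded.
        LaneStatus::Ready => lane.degraded_check_ids.is_empty(),
        LaneStatus::Degraded => allow_degraded,
        LaneStatus::Unavailable => false,
    }
}
```

For example, may_select(&Lane::new(LaneStatus::Degraded, &["a", "b"], &["a"]), false) returns false, while the same lane with allow_degraded set to true is selectable.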


Compatibility and Support

The initial compatibility target is the existing implementation in nettoolskit-copilot. Extraction must preserve the current command and runtime behavior until downstream consumers are migrated.

The current crate starts from the existing nettoolskit-copilot knowledge foundation and keeps the same deterministic behavior while the ecosystem moves to explicit memory-owned APIs.

build_hyde_hypothesis remains backward-compatible for existing callers while new integrations can attach a typed local HydeGenerator.

Downstream consumers should prefer nettoolskit_memory::engine for new adapter work. The facade keeps grouped surfaces for contracts, persistence, retrieval, orchestration, context packages, vault materialization, and authored knowledge records without exposing runtime orchestration policy.

Embedding provider selection belongs to the calling product runtime. This repository owns the deterministic provider contract and local execution wiring; external provider implementations, credential resolution, and model routing belong outside nettoolskit-memory.

The local-first SQLite query path is capability-gated. Materialization initializes the repository-local store with bundled SQLite and registers deterministic Memory-owned vector SQL functions on each opened knowledge-store connection. Operators and orchestrators should call ntk-memory doctor --repo-root <path> before invoking semantic query workloads. Vector retrieval is scoped by provider_id and provider_version, so multiple same-dimension embeddings for the same chunk can coexist without cross-provider or cross-version result contamination. Verified answer reuse is separately scoped by answer_generation_scope.provider_id, .model_id, and .model_version, so generated answers are never reused across provider/model surfaces.
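To illustrate why scoping by provider_id and provider_version prevents cross-provider contamination, here is a minimal in-memory sketch. The StoredVector struct, the row and scoped_search helpers, and the dot-product ranking are assumptions made for the example; the actual store runs through SQLite vector functions whose signatures are not described here.

```rust
/// Hypothetical stored embedding row; the two provider fields mirror the scoping terms above.
#[derive(Debug, Clone, PartialEq)]
struct StoredVector {
    chunk_id: String,
    provider_id: String,
    provider_version: String,
    vector: Vec<f32>,
}

/// Convenience constructor for examples.
fn row(chunk_id: &str, provider_id: &str, provider_version: &str, vector: &[f32]) -> StoredVector {
    StoredVector {
        chunk_id: chunk_id.to_string(),
        provider_id: provider_id.to_string(),
        provider_version: provider_version.to_string(),
        vector: vector.to_vec(),
    }
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Ranks chunk ids by dot product, looking only at rows written by the
/// same provider id and version (and with a matching dimension).
fn scoped_search<'a>(
    rows: &'a [StoredVector],
    provider_id: &str,
    provider_version: &str,
    query: &[f32],
    limit: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = rows
        .iter()
        .filter(|r| r.provider_id == provider_id && r.provider_version == provider_version)
        .filter(|r| r.vector.len() == query.len())
        .map(|r| (dot(&r.vector, query), r.chunk_id.as_str()))
        .collect();
    // Highest score first; ties are broken by chunk id so the order is deterministic.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0).then_with(|| a.1.cmp(b.1)));
    scored.into_iter().take(limit).map(|(_, id)| id).collect()
}
```

Two rows for the same chunk written by different provider versions can coexist in rows, and a query scoped to one version never sees the other.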

SemanticRetriever::query_with_cache first checks the retrieval-result cache. On a retrieval-result miss, it resolves the query vector through the query_embedding cache before calling the embedding provider. Query embedding entries store only provider identity, dimension, query hash, vector, and a stable embedding cache key.
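The lookup order described above can be sketched with two in-memory maps. This is not the SemanticRetriever implementation: Caches, CountingProvider, and lookup_with_caches are hypothetical names, the caches are keyed by the raw query string for brevity rather than by the stable cache keys the crate uses, and the embedding is a toy function.

```rust
use std::collections::HashMap;

/// Stand-in for an embedding provider that counts how often it is called.
struct CountingProvider {
    calls: usize,
}

impl CountingProvider {
    fn embed(&mut self, query: &str) -> Vec<f32> {
        self.calls += 1;
        // Toy deterministic embedding: byte length and byte sum of the query.
        vec![query.len() as f32, query.bytes().map(|b| b as f32).sum()]
    }
}

/// Hypothetical two-level cache state.
#[derive(Default)]
struct Caches {
    retrieval_results: HashMap<String, Vec<String>>,
    query_embeddings: HashMap<String, Vec<f32>>,
}

fn lookup_with_caches(
    caches: &mut Caches,
    provider: &mut CountingProvider,
    query: &str,
    search: impl Fn(&[f32]) -> Vec<String>,
) -> Vec<String> {
    // 1. A retrieval-result hit skips embedding and search entirely.
    if let Some(hit) = caches.retrieval_results.get(query).cloned() {
        return hit;
    }
    // 2. On a miss, reuse a cached query vector before calling the provider.
    let vector = match caches.query_embeddings.get(query).cloned() {
        Some(vector) => vector,
        None => {
            let vector = provider.embed(query);
            caches.query_embeddings.insert(query.to_string(), vector.clone());
            vector
        }
    };
    // 3. Run the search and remember its result for the next call.
    let results = search(&vector);
    caches.retrieval_results.insert(query.to_string(), results.clone());
    results
}
```

With this ordering the provider is called at most once per distinct query, even if the retrieval-result entry is later pruned while the query embedding entry remains.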

Automatic persistent cache reuse is selected through the repository resolver. LocalFirst uses the repository-local SQLite backend. Shared PostgreSQL uses the feature-gated PostgresPersistentCacheBackend when callers provide secret-safe shared settings. Custom in-memory repository resolvers remain SQLite-free unless they explicitly attach a persistent-cache backend.

Reusable Memory OS record persistence is selected through the same resolver. LocalFirst uses SqliteMemoryRecordBackend; shared PostgreSQL uses PostgresMemoryRecordBackend and stores memory_items plus memory_promotions as validated JSONB payloads with indexed filter columns. The memory-record-* command requests accept the same persistence_profile/shared_postgresql shape and fail closed when shared settings are incomplete or the Cargo feature is unavailable.

Operational-memory local-write and local-read contracts accept the same profile/settings shape and route through SqliteOperationalMemoryBackend or PostgresOperationalMemoryBackend. The LocalFirst form without profile fields remains the compatibility baseline.

The Rust API and CLI expose an optional shared-postgresql feature for explicit shared deployments through PersistentKnowledgeRepositoryResolver::for_shared_postgresql and workspace request profile selection. Bare PersistenceProfile::SharedPostgresql still fails closed before SQLite bootstrap because it has no secret-safe settings. CLI request JSON rejects unknown fields so misspelled profile fields such as persistenceProfile cannot be silently ignored and executed as LocalFirst.
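The fail-closed handling of unknown request fields can be illustrated with a plain allow-list check. This is a std-only sketch: the ALLOWED_FIELDS list and the reject_unknown_fields helper are hypothetical, only persistence_profile and shared_postgresql are field names taken from this document, and "query" is a placeholder. In a serde-based Rust codebase the same effect is commonly obtained with the #[serde(deny_unknown_fields)] container attribute; whether this crate uses it is not stated here.

```rust
/// Allow-list for this sketch. Only `persistence_profile` and
/// `shared_postgresql` come from the document; `query` is a placeholder.
const ALLOWED_FIELDS: &[&str] = &["query", "persistence_profile", "shared_postgresql"];

/// Returns an error naming the first top-level field that is not allowed.
fn reject_unknown_fields(fields: &[&str]) -> Result<(), String> {
    for field in fields {
        if !ALLOWED_FIELDS.contains(field) {
            return Err(format!("unknown field `{field}`"));
        }
    }
    Ok(())
}
```

A request carrying the camelCase typo persistenceProfile is rejected instead of silently running as LocalFirst: reject_unknown_fields(&["query", "persistenceProfile"]) returns an error naming that field.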

Persistent-cache readiness and semantic vector readiness are separate gates. memory.cache.persist, memory.knowledge.query, and memory.context.package are reported independently so orchestrators can fail closed on the exact lane they need. Persistent cache payload hashes are built from canonical JSON so object order and floating-point hydration differences do not invalidate durable cache rows after they are read back from SQLite or PostgreSQL.
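A small sketch shows how canonical JSON makes a payload hash independent of object key order and of how a float was written before parsing. Everything here is illustrative: the Value enum and helper names are hypothetical, and FNV-1a is a stand-in because the document does not name the hash function the crate uses.

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value for this sketch (not the crate's type).
#[derive(Debug, Clone)]
enum Value {
    Number(f64),
    Text(String),
    Array(Vec<Value>),
    /// BTreeMap iterates keys in sorted order, so insertion order is irrelevant.
    Object(BTreeMap<String, Value>),
}

/// Builds an object from key/value pairs given in any order.
fn object(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Appends a quoted string, escaping only quotes and backslashes.
fn push_quoted(text: &str, out: &mut String) {
    out.push('"');
    for c in text.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
}

/// Writes the canonical form: sorted keys, no whitespace, shortest float text.
fn write_canonical(value: &Value, out: &mut String) {
    match value {
        // Display for f64 prints the shortest text that parses back to the same value.
        Value::Number(n) => out.push_str(&n.to_string()),
        Value::Text(s) => push_quoted(s, out),
        Value::Array(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_canonical(item, out);
            }
            out.push(']');
        }
        Value::Object(map) => {
            out.push('{');
            for (i, (key, item)) in map.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                push_quoted(key, out);
                out.push(':');
                write_canonical(item, out);
            }
            out.push('}');
        }
    }
}

fn canonical_json(value: &Value) -> String {
    let mut out = String::new();
    write_canonical(value, &mut out);
    out
}

/// 64-bit FNV-1a over the canonical bytes (stand-in hash for the sketch).
fn payload_hash(value: &Value) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in canonical_json(value).bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}
```

Two payloads that differ only in key order, or in whether a number was read back as 1.00 instead of 1, produce the same canonical text and therefore the same hash.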


Operations

Generated build output belongs under .build/ and must not be committed.

Generated validation, packaging, publication, or release-candidate artifacts belong under .deployment/artifacts/ and must not be committed unless the repository-specific release process explicitly publishes sanitized outputs.

The CI workflow runs automatically only for pushes to main and can also be started with workflow_dispatch. It uses changed-file routing so docs-only changes do not run Rust build or test stages.
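Changed-file routing of this kind can be sketched as a predicate over the changed paths. The classification rule is an assumption made for illustration (a path counts as documentation when it ends in .md or sits under docs/); the workflow's real path filters are not listed in this document, and both helper names are hypothetical.

```rust
/// Assumption: documentation paths end in `.md` or live under `docs/`.
fn is_docs_path(path: &str) -> bool {
    path.ends_with(".md") || path.starts_with("docs/")
}

/// Rust build and test stages run when at least one changed file is not documentation.
fn needs_rust_stages(changed: &[&str]) -> bool {
    changed.iter().any(|path| !is_docs_path(path))
}
```

Under this rule a change touching only README.md skips the Rust stages, while a change that also touches src/lib.rs runs them.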


Planning

Use planning/ for active plans, workstream status, release checklists, and implementation checkpoints when the repository owns planning state.

Planning files must stay checklist-backed and evidence-backed. Close plans only after the implementation, validation, review, and release or PR evidence are recorded.

Completed enterprise RAG quality work is tracked in planning/completed/2026-05/202605210848-plan-memory-enterprise-rag-quality-pipeline.md. That plan covers Hybrid Search V2, reranking, contextual compression, groundedness primitives, enterprise filters, CAG key scoping, and RAG quality observability while keeping orchestration in nettoolskit-copilot. Hybrid lexical ranking is BM25-style deterministic scoring and fusion; full storage-backed search engines and conditional model decisions remain caller architecture choices.


Governance and Security

Use semantic branches such as feat/..., fix/..., docs/..., refactor/..., test/..., chore/..., ci/..., and release/....

Do not publish directly to main unless the repository owner explicitly asks for that operation. Prefer PRs for all repository changes.

Do not commit secrets, tokens, usernames, database hostnames, connection strings, machine-local paths, raw logs, local SQLite stores, vector indexes, graph stores, .env files, or unsanitized release artifacts. Shared PostgreSQL examples must reference environment variable names only. Release assets must be reviewed for metadata leaks before publication.


Build and Tests

Use targeted local validation for crate changes.

cargo metadata --locked --format-version 1 --no-deps
cargo fmt --all --check
cargo clippy --locked --all-targets
cargo test --locked --no-fail-fast
cargo test --locked --features shared-postgresql --all-targets --no-fail-fast
git diff --check

Contributing

Work from a semantic branch and keep changes scoped to this repository's responsibility. Update CHANGELOG.md for user-visible, contract-level, release, CI/CD, governance, or dependency changes.

Keep generated outputs under .build/ and local artifacts under .deployment/artifacts/.


Dependencies

  • Runtime: none in the foundation PR.
  • Development: GitHub template baseline.

License

This project is licensed under the MIT License. See the LICENSE file at the repository root for details.

npm and npx

This repository exposes the ntk-memory command through the @nettoolskit/memory npm package for local installation and npx execution.

npm install -g @nettoolskit/memory
ntk-memory --help
npx @nettoolskit/memory --help

The npm wrapper executes a native ntk-memory binary from npm/native/ when release packaging stages one. For development or private release validation, set NTK_MEMORY_BINARY to an already built local binary. Use npm run stage:native -- <path-to-binary> before packaging a native tarball.

Package publication

Current preview package: @nettoolskit/memory@0.0.1-preview.2. Use npx @nettoolskit/memory for ephemeral execution or ntk-memory after global installation. GitRiver stage river/publish runs package dry-run validation on pull requests and publishes only from main or v* source refs through scripts/ci/river/npm-publish.sh.
