# Manuals Qdrant Readiness

## Purpose
- The long-term source of truth for this pipeline is now the shared `manuals-platform` package at the workspace root.
- The RMV (Rocky Mountain Vending) repo keeps this document as a consumer-side reference for the tenant-filtered artifacts that Rocky reads.

## Source inputs
- Shared package location: `../manuals-platform`
- Shared build outputs: `../manuals-platform/output/full/*`
- Rocky tenant outputs: `../manuals-platform/output/tenants/rocky-mountain-vending/*`

## What the corpus builder does
- The shared package scans the full portfolio manual set, classifies every PDF, assigns tenant entitlements, and publishes tenant-filtered Qdrant-ready artifacts.
- It maintains two retrieval profiles, `public_safe` and `internal_tech`, on top of one central corpus.
- Rocky consumes the prebuilt Rocky tenant export instead of rebuilding from raw manuals data inside the app.
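
As a rough illustration of how one central corpus can serve several tenants and both retrieval profiles, the sketch below filters an in-memory chunk list. The `ManualChunk` shape, its `tenants` and `profiles` fields, and the `selectChunks` helper are all hypothetical; the real schema in the shared package may look different.

```typescript
// Hypothetical chunk shape; the real fields in the shared package may differ.
type RetrievalProfile = "public_safe" | "internal_tech";

interface ManualChunk {
  id: string;
  text: string;
  tenants: string[]; // assumed: tenants entitled to this chunk
  profiles: RetrievalProfile[]; // assumed: profiles allowed to retrieve it
}

// Keep only chunks that the tenant is entitled to and the profile may retrieve.
function selectChunks(
  chunks: ManualChunk[],
  tenant: string,
  profile: RetrievalProfile,
): ManualChunk[] {
  return chunks.filter(
    (chunk) =>
      chunk.tenants.indexOf(tenant) !== -1 &&
      chunk.profiles.indexOf(profile) !== -1,
  );
}

// Made-up sample data, for illustration only.
const sampleCorpus: ManualChunk[] = [
  {
    id: "c1",
    text: "How to restock the machine.",
    tenants: ["rocky-mountain-vending"],
    profiles: ["public_safe", "internal_tech"],
  },
  {
    id: "c2",
    text: "Control board service procedure.",
    tenants: ["rocky-mountain-vending"],
    profiles: ["internal_tech"],
  },
  {
    id: "c3",
    text: "Cleaning instructions.",
    tenants: ["other-tenant"],
    profiles: ["public_safe"],
  },
];

const rockyPublic = selectChunks(sampleCorpus, "rocky-mountain-vending", "public_safe");
console.log(rockyPublic.map((chunk) => chunk.id));
```

In this sketch the tenant check and the profile check are independent, so a tenant export and a profile-specific chunk file are just two different slices of the same corpus.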

## Build and evaluation commands
- Build artifacts:
  - `pnpm manuals:qdrant:build`
- Build artifacts into a custom directory:
  - `pnpm manuals:qdrant:build -- --output-dir /absolute/path`
- Run the evaluation set:
  - `pnpm manuals:qdrant:eval`

## Artifact output
- Default output directory: `output/manuals-qdrant`
- Important files:
  - `summary.json`
  - `manuals.json`
  - `chunks.json`
  - `chunks-high-confidence.json`
  - `chunks-public-safe.json`
  - `chunks-internal-tech.json`
  - `evaluation-cases.json`
  - `evaluation-report.json`
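
Before pointing a consumer at an output directory, it can help to confirm that every file listed above is present. The following is a minimal sketch of such a check; `EXPECTED_ARTIFACTS` simply mirrors the list above, and `findMissingArtifacts` is a hypothetical helper, not part of the shared package.

```typescript
// File names copied from the "Important files" list above.
const EXPECTED_ARTIFACTS: readonly string[] = [
  "summary.json",
  "manuals.json",
  "chunks.json",
  "chunks-high-confidence.json",
  "chunks-public-safe.json",
  "chunks-internal-tech.json",
  "evaluation-cases.json",
  "evaluation-report.json",
];

// Return the expected artifact names that are absent from a directory listing.
function findMissingArtifacts(presentFiles: readonly string[]): string[] {
  return EXPECTED_ARTIFACTS.filter((name) => presentFiles.indexOf(name) === -1);
}

// Example with a partial listing: only two of the eight files exist.
const missing = findMissingArtifacts(["summary.json", "manuals.json"]);
console.log(missing.length === 0 ? "all artifacts present" : `missing: ${missing.join(", ")}`);
```

The helper takes a plain list of file names so it stays independent of the filesystem; in Node, the listing could come from `fs.readdirSync` on the output directory.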

## Operational notes
- The first Qdrant prototype should ingest `chunks-high-confidence.json` or `chunks-internal-tech.json`, not the full raw corpus.
- Public-facing experiences should stay on `public_safe` filters even after Qdrant is introduced.
- After any change to the manuals data, rebuild the artifacts so that the normalized corpus and the evaluation report stay in sync.
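
The rule that public-facing experiences stay on `public_safe` could be enforced at query time with a payload filter. The sketch below builds a filter object in the `must` / `key` / `match` shape that Qdrant filters use. It assumes each point stores its profile as a single string under a payload key named `retrieval_profile`; that key name, the `Audience` type, and `buildProfileFilter` are hypothetical.

```typescript
type Audience = "public" | "internal";

// Minimal local types for the filter shape; not imported from any client library.
interface MatchCondition {
  key: string;
  match: { value: string };
}

interface ProfileFilter {
  must: MatchCondition[];
}

// Assumed payload key; the real ingest may store the profile differently.
const PROFILE_PAYLOAD_KEY = "retrieval_profile";

// Only an explicitly internal audience gets internal_tech;
// every other caller falls back to public_safe.
function buildProfileFilter(audience: Audience): ProfileFilter {
  const profile = audience === "internal" ? "internal_tech" : "public_safe";
  return { must: [{ key: PROFILE_PAYLOAD_KEY, match: { value: profile } }] };
}

console.log(JSON.stringify(buildProfileFilter("public")));
```

Defaulting to `public_safe` means a caller has to opt in to internal content, which keeps a misconfigured public surface on the safer profile.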