# Manuals Qdrant Readiness

## Purpose

- The long-term source of truth for this pipeline is now the shared `manuals-platform` package at the workspace root.
- The RMV repo keeps this document as a consumer-side reference for the tenant-filtered artifacts Rocky reads.

## Source inputs

- Shared package location: `../manuals-platform`
- Shared build outputs: `../manuals-platform/output/full/*`
- Rocky tenant outputs: `../manuals-platform/output/tenants/rocky-mountain-vending/*`

## What the corpus builder does

- The shared package scans the full portfolio manual set, classifies every PDF, assigns tenant entitlements, and publishes tenant-filtered Qdrant-ready artifacts.
- It keeps `public_safe` and `internal_tech` retrieval profiles on top of one central corpus.
- Rocky consumes the prebuilt Rocky tenant export instead of rebuilding from raw manuals data inside the app.

## Build and evaluation commands

- Build artifacts:
  - `pnpm manuals:qdrant:build`
- Build artifacts into a custom directory:
  - `pnpm manuals:qdrant:build -- --output-dir /absolute/path`
- Run the evaluation set:
  - `pnpm manuals:qdrant:eval`

## Artifact output

- Default output directory: `output/manuals-qdrant`
- Important files:
  - `summary.json`
  - `manuals.json`
  - `chunks.json`
  - `chunks-high-confidence.json`
  - `chunks-public-safe.json`
  - `chunks-internal-tech.json`
  - `evaluation-cases.json`
  - `evaluation-report.json`

## Operational notes

- The first Qdrant prototype should ingest `chunks-high-confidence.json` or `chunks-internal-tech.json`, not the full raw corpus.
- Public-facing experiences should stay on `public_safe` filters even after Qdrant is introduced.
- After manuals-data changes, rebuild the artifacts so the new normalized corpus and evaluation report stay in sync.
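
As a quick sanity check after a rebuild, a small script along these lines could confirm that the files listed under Artifact output are present before anything is ingested. This is a minimal sketch, not part of the shared package: the helper name `missingArtifacts` and its injectable `exists` parameter are hypothetical, and only the file names come from this document.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// File names copied from the "Important files" list in this document.
const EXPECTED_ARTIFACTS = [
  "summary.json",
  "manuals.json",
  "chunks.json",
  "chunks-high-confidence.json",
  "chunks-public-safe.json",
  "chunks-internal-tech.json",
  "evaluation-cases.json",
  "evaluation-report.json",
];

// Hypothetical helper: returns the expected artifact names that are absent
// from outputDir. `exists` defaults to fs.existsSync and can be swapped out
// so the check can be exercised without touching the disk.
function missingArtifacts(
  outputDir: string,
  exists: (path: string) => boolean = existsSync,
): string[] {
  return EXPECTED_ARTIFACTS.filter((name) => !exists(join(outputDir, name)));
}
```

Calling `missingArtifacts("output/manuals-qdrant")` against the default output directory returns an empty array when every expected file is present; a non-empty result suggests the build should be rerun before relying on the evaluation report.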