Manuals Qdrant Readiness
Purpose
- The long-term source of truth for this pipeline is now the shared manuals-platform package at the workspace root.
- The RMV repo keeps this document as a consumer-side reference for the tenant-filtered artifacts Rocky reads.
Source inputs
- Shared package location:
../manuals-platform
- Shared build outputs:
../manuals-platform/output/full/*
- Rocky tenant outputs:
../manuals-platform/output/tenants/rocky-mountain-vending/*
What the corpus builder does
- The shared package scans the full portfolio manual set, classifies every PDF, assigns tenant entitlements, and publishes tenant-filtered Qdrant-ready artifacts.
- It keeps public_safe and internal_tech retrieval profiles on top of one central corpus.
- Rocky consumes the prebuilt Rocky tenant export instead of rebuilding from raw manuals data inside the app.
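The tenant and profile filtering described above can be pictured roughly as follows. This is a minimal sketch, not the shared package's code: the `ManualChunk` shape, the `tenants` and `profiles` fields, and the `exportForTenant` helper are hypothetical names chosen for illustration, and the sample records are placeholders.

```typescript
// Hypothetical chunk record; the real artifact schema may differ.
type RetrievalProfile = "public_safe" | "internal_tech";

interface ManualChunk {
  id: string;
  text: string;
  tenants: string[]; // assumed: tenant ids entitled to this chunk
  profiles: RetrievalProfile[]; // assumed: profiles allowed to retrieve it
}

// Keep only the chunks a tenant is entitled to, optionally narrowed to one profile.
function exportForTenant(
  corpus: ManualChunk[],
  tenantId: string,
  profile?: RetrievalProfile
): ManualChunk[] {
  return corpus.filter(
    (chunk) =>
      chunk.tenants.indexOf(tenantId) !== -1 &&
      (profile === undefined || chunk.profiles.indexOf(profile) !== -1)
  );
}

// Placeholder corpus: two chunks entitled to Rocky, one to another tenant.
const sampleCorpus: ManualChunk[] = [
  {
    id: "c1",
    text: "placeholder chunk A",
    tenants: ["rocky-mountain-vending"],
    profiles: ["public_safe", "internal_tech"],
  },
  {
    id: "c2",
    text: "placeholder chunk B",
    tenants: ["rocky-mountain-vending"],
    profiles: ["internal_tech"],
  },
  {
    id: "c3",
    text: "placeholder chunk C",
    tenants: ["other-tenant"],
    profiles: ["public_safe"],
  },
];

const rockyAll = exportForTenant(sampleCorpus, "rocky-mountain-vending");
const rockyPublic = exportForTenant(
  sampleCorpus,
  "rocky-mountain-vending",
  "public_safe"
);
console.log(rockyAll.map((c) => c.id), rockyPublic.map((c) => c.id));
```

The point of the sketch is that both profiles are views over the same records, so a tenant export never needs a second copy of the corpus.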
Build and evaluation commands
- Build artifacts:
pnpm manuals:qdrant:build
- Build artifacts into a custom directory:
pnpm manuals:qdrant:build -- --output-dir /absolute/path
- Run the evaluation set:
pnpm manuals:qdrant:eval
Artifact output
- Default output directory:
output/manuals-qdrant
- Important files:
  - summary.json
  - manuals.json
  - chunks.json
  - chunks-high-confidence.json
  - chunks-public-safe.json
  - chunks-internal-tech.json
  - evaluation-cases.json
  - evaluation-report.json
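A quick completeness check on the output directory might look like the sketch below. The file names come from the list above; the `missingArtifacts` helper is a hypothetical name, and the function takes a plain directory listing so it stays independent of how the listing is obtained.

```typescript
// Artifact names as listed in this document.
const EXPECTED_ARTIFACTS: string[] = [
  "summary.json",
  "manuals.json",
  "chunks.json",
  "chunks-high-confidence.json",
  "chunks-public-safe.json",
  "chunks-internal-tech.json",
  "evaluation-cases.json",
  "evaluation-report.json",
];

// Return the expected artifact names that are absent from a directory listing.
function missingArtifacts(presentFiles: string[]): string[] {
  return EXPECTED_ARTIFACTS.filter(
    (name) => presentFiles.indexOf(name) === -1
  );
}

// Example: a listing produced by a build that has not yet been evaluated.
const missing = missingArtifacts([
  "summary.json",
  "manuals.json",
  "chunks.json",
  "chunks-high-confidence.json",
  "chunks-public-safe.json",
  "chunks-internal-tech.json",
]);
console.log(missing); // ["evaluation-cases.json", "evaluation-report.json"]
```

In a Node script the listing could come from `fs.readdirSync` on the output directory; a non-empty result suggests the build or evaluation step has not been run yet.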
Operational notes
- The first Qdrant prototype should ingest chunks-high-confidence.json or chunks-internal-tech.json, not the full raw corpus.
- Public-facing experiences should stay on public_safe filters even after Qdrant is introduced.
- After manuals-data changes, rebuild the artifacts so the new normalized corpus and evaluation report stay in sync.
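One way to keep public-facing queries on the public_safe profile is to attach a payload filter to every search. The sketch below builds a filter object in Qdrant's `must` / `key` / `match` form. The payload key `retrieval_profile`, the `Audience` type, and the mapping of internal callers to internal_tech are assumptions for illustration; the real payload field in the chunk artifacts may be named differently.

```typescript
type Audience = "public" | "internal";

// Assumed payload key; check the chunk artifacts for the actual field name.
const PROFILE_KEY = "retrieval_profile";

// Minimal shape of a Qdrant payload filter with "must" match conditions.
interface ProfileFilter {
  must: { key: string; match: { value: string } }[];
}

// Public callers are always pinned to public_safe; internal callers are
// assumed here to use internal_tech.
function profileFilter(audience: Audience): ProfileFilter {
  const profile = audience === "public" ? "public_safe" : "internal_tech";
  return { must: [{ key: PROFILE_KEY, match: { value: profile } }] };
}

const publicFilter = profileFilter("public");
console.log(JSON.stringify(publicFilter));
// {"must":[{"key":"retrieval_profile","match":{"value":"public_safe"}}]}
```

Building the filter in one place makes it harder for a public endpoint to issue an unfiltered search by accident.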