Manuals Qdrant Readiness

Purpose

  • The long-term source of truth for this pipeline is now the shared manuals-platform package at the workspace root.
  • The RMV repo keeps this document as a consumer-side reference for the tenant-filtered artifacts Rocky reads.

Source inputs

  • Shared package location: ../manuals-platform
  • Shared build outputs: ../manuals-platform/output/full/*
  • Rocky tenant outputs: ../manuals-platform/output/tenants/rocky-mountain-vending/*

What the corpus builder does

  • The shared package scans the full portfolio manual set, classifies every PDF, assigns tenant entitlements, and publishes tenant-filtered Qdrant-ready artifacts.
  • It maintains two retrieval profiles, public_safe and internal_tech, on top of one central corpus.
  • Rocky consumes the prebuilt Rocky tenant export instead of rebuilding from raw manuals data inside the app.
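
The tenant and profile filtering described above can be pictured with a minimal sketch. The real chunk schema is not documented here, so the Chunk record, its tenants and profiles fields, and the tenant_export helper are all hypothetical names chosen for illustration, not the shared package's API.

```python
from dataclasses import dataclass, field


# Hypothetical chunk record; the real artifact schema may differ.
@dataclass
class Chunk:
    text: str
    tenants: set = field(default_factory=set)   # tenants entitled to this chunk
    profiles: set = field(default_factory=set)  # e.g. {"public_safe", "internal_tech"}


def tenant_export(chunks, tenant, profile):
    """Return the chunks a tenant may read under one retrieval profile."""
    return [c for c in chunks if tenant in c.tenants and profile in c.profiles]


# One central corpus, illustrative data only.
corpus = [
    Chunk("Coin mechanism cleaning steps",
          {"rocky-mountain-vending"}, {"public_safe", "internal_tech"}),
    Chunk("Control board service codes",
          {"rocky-mountain-vending"}, {"internal_tech"}),
    Chunk("Another tenant's manual",
          {"other-tenant"}, {"public_safe"}),
]

public = tenant_export(corpus, "rocky-mountain-vending", "public_safe")
internal = tenant_export(corpus, "rocky-mountain-vending", "internal_tech")
```

In this sketch both profiles are views over the same list, which mirrors the idea of one central corpus with filters layered on top rather than separate per-profile corpora.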

Build and evaluation commands

  • Build artifacts:
    • pnpm manuals:qdrant:build
  • Build artifacts into a custom directory:
    • pnpm manuals:qdrant:build -- --output-dir /absolute/path
  • Run the evaluation set:
    • pnpm manuals:qdrant:eval

Artifact output

  • Default output directory: output/manuals-qdrant
  • Important files:
    • summary.json
    • manuals.json
    • chunks.json
    • chunks-high-confidence.json
    • chunks-public-safe.json
    • chunks-internal-tech.json
    • evaluation-cases.json
    • evaluation-report.json
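
A quick way to confirm a build produced everything listed above is to check the output directory for each file. The sketch below is an assumption-level helper, not part of the shared package: missing_artifacts is a hypothetical name, and the file list is copied from this document.

```python
from pathlib import Path

# File names taken from the "Important files" list in this document.
EXPECTED_FILES = [
    "summary.json",
    "manuals.json",
    "chunks.json",
    "chunks-high-confidence.json",
    "chunks-public-safe.json",
    "chunks-internal-tech.json",
    "evaluation-cases.json",
    "evaluation-report.json",
]


def missing_artifacts(output_dir):
    """Return the expected artifact names that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling missing_artifacts("output/manuals-qdrant") after a build should return an empty list when every artifact is present.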

Operational notes

  • The first Qdrant prototype should ingest chunks-high-confidence.json or chunks-internal-tech.json, not the full raw corpus.
  • Public-facing experiences should stay on public_safe filters even after Qdrant is introduced.
  • Whenever the manuals data changes, rebuild the artifacts so that the normalized corpus and the evaluation report stay in sync.
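
The first two notes amount to a rule about which chunk file each consumer should load. A minimal sketch of that rule follows. The audience labels and the artifact_for helper are hypothetical, and mapping public-facing use to chunks-public-safe.json is an assumption based on the file name.

```python
# Hypothetical audience labels mapped to the chunk artifacts named in this document.
ARTIFACT_BY_AUDIENCE = {
    "prototype": "chunks-high-confidence.json",
    "internal": "chunks-internal-tech.json",
    "public": "chunks-public-safe.json",  # assumed to carry the public_safe profile
}


def artifact_for(audience):
    """Pick the chunk file to ingest; never fall back to the full raw corpus."""
    if audience not in ARTIFACT_BY_AUDIENCE:
        raise ValueError(f"unknown audience: {audience!r}")
    return ARTIFACT_BY_AUDIENCE[audience]
```

Raising on an unknown audience, instead of defaulting to chunks.json, keeps an unrecognized caller from silently reading the unfiltered corpus.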