Each product has its own document library. Uploaded documents go through an ingest pipeline: text extraction → chunking → embedding → indexing. The autoreply bot and AI generate both use RAG search over the indexed chunks to cite relevant content.

Access

Products → :id → Documents tab (the product-doc table).

Supported file formats

| Format | Notes |
| --- | --- |
| .pdf | Page-by-page text extraction |
| .docx | Word document |
| .md / .txt | Markdown / plain text |
| .html | Tags stripped, text kept |
| .csv | One chunk per row |
| .png / .jpg | LLM vision describes the image as text |
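
To make the per-format rules above concrete, here is a minimal sketch of how extraction might dispatch on the file extension. It covers only the formats that the Python standard library can handle (.md, .txt, .html, .csv). The function name `extract_units` and the `_TextOnly` helper are hypothetical and not part of the product, and whether a CSV header row becomes its own chunk is an assumption.

```python
import csv
import io
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and discards tags (hypothetical helper).

    Note: <script>/<style> contents are not special-cased in this sketch.
    """

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.parts.append(data.strip())


def extract_units(filename: str, raw: str) -> list[str]:
    """Return the text units extracted from one file, by extension."""
    ext = filename.lower().rsplit(".", 1)[-1]
    if ext in ("md", "txt"):
        # Markdown / plain text: kept as-is.
        return [raw]
    if ext == "html":
        # Tags stripped, text kept.
        parser = _TextOnly()
        parser.feed(raw)
        parser.close()
        return [" ".join(parser.parts)]
    if ext == "csv":
        # One unit per row; here the header row is treated like any other row.
        rows = csv.reader(io.StringIO(raw))
        return [", ".join(row) for row in rows if row]
    # .pdf, .docx and images need third-party libraries or an LLM call,
    # which this sketch does not attempt.
    raise ValueError(f"no text extractor sketched for .{ext}")
```

For example, `extract_units("prices.csv", "sku,price\nA1,10\n")` yields one string per row, which matches the "one chunk per row" rule in the table.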

Upload

1. Click + Upload (add-source modal)

Drag and drop a file, or pick one.

2. Metadata

  • Title (auto-filled from the filename, editable)
  • Description

3. Submit

The file uploads to storage and the document is marked with status pending.

4. Background ingest

  1. Download the file
  2. Extract text per format
  3. Chunk (300–500 tokens, 50-token overlap)
  4. Embed each chunk via the Embed slot
  5. Save into the search index
  6. Status → indexed
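
The chunking step in the list above can be sketched as a sliding window. This is an illustration under stated assumptions, not the product's implementation: tokens are passed in as a plain list (the real tokenizer is not specified here), the window size of 400 is simply a value inside the documented 300–500 range, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(
    tokens: list[str], size: int = 400, overlap: int = 50
) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each new window starts this many tokens later
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            # This window already reached the end of the document.
            break
    return chunks


# Usage: whitespace splitting stands in for a real tokenizer (assumption).
words = "some extracted document text".split()
pieces = chunk_tokens(words, size=3, overlap=1)
```

With an overlap, the last 50 tokens of one chunk reappear as the first 50 of the next, so a sentence that straddles a boundary is still fully present in at least one chunk. In this sketch the final chunk can be shorter than the lower bound of the range.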

Document status

| Status | Meaning |
| --- | --- |
| pending | Just uploaded, awaiting ingest |
| processing | Worker running |
| indexed | Ready for RAG |
| failed | Ingest failed |
| archived | Soft hidden, excluded from RAG |
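
One way to read the statuses above is as a small state machine. The sketch below is an assumption about how they relate: the status values come from the table, but the transition map (for example, that only an indexed document can be archived, and that failed is terminal) is a guess for illustration, and `DocStatus`, `can_transition`, and `is_searchable` are hypothetical names.

```python
from enum import Enum


class DocStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    INDEXED = "indexed"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumed transitions; the documentation does not spell these out.
ALLOWED: dict[DocStatus, set[DocStatus]] = {
    DocStatus.PENDING: {DocStatus.PROCESSING},
    DocStatus.PROCESSING: {DocStatus.INDEXED, DocStatus.FAILED},
    DocStatus.INDEXED: {DocStatus.ARCHIVED},
    DocStatus.FAILED: set(),
    DocStatus.ARCHIVED: set(),
}


def can_transition(current: DocStatus, target: DocStatus) -> bool:
    """True if the assumed transition map allows current -> target."""
    return target in ALLOWED[current]


def is_searchable(status: DocStatus) -> bool:
    """Only indexed documents are ready for RAG; archived ones are excluded."""
    return status is DocStatus.INDEXED
```

The `is_searchable` check is the part grounded in the table: indexed means ready for RAG, and archived documents are excluded from it.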

RAG search for the bot

When the autoreply bot receives an inbound message, the system runs a RAG search on the relevant product’s knowledge base (KB) and injects the top-K chunks into the system prompt, so the bot can cite them. See Autoreply.

Coming soon: document detail tabs (Preview / Chunks / Usage / Settings), a per-product manual search box, a per-document or bulk re-index button, and document visibility (internal-only vs customer-facing).
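
The retrieval flow for the bot (search the product's KB, take the top-K chunks, inject them into the system prompt) can be sketched as follows. This is a minimal illustration that assumes cosine similarity over embedding vectors held in memory; the actual similarity measure, index, value of K, and prompt wording are not specified in this page, and `cosine`, `top_k`, and `build_system_prompt` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(
    query_vec: list[float],
    index: list[tuple[str, list[float]]],
    k: int = 3,
) -> list[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        index, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )
    return [text for text, _ in ranked[:k]]


def build_system_prompt(base: str, chunks: list[str]) -> str:
    """Append numbered chunks to the base prompt so the bot can cite them."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return f"{base}\n\nProduct knowledge base excerpts:\n{numbered}"


# Usage with toy 2-dimensional vectors standing in for real embeddings.
kb = [
    ("refund policy", [1.0, 0.0]),
    ("shipping times", [0.0, 1.0]),
    ("returns window", [0.9, 0.1]),
]
hits = top_k([1.0, 0.0], kb, k=2)
prompt = build_system_prompt("You are a support bot.", hits)
```

Numbering the injected chunks gives the bot a stable handle such as `[1]` to cite, which is one simple way to make citations traceable back to a document.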

Best practices

Keep documents short, with one topic per file: this gives higher chunk precision.
Stale documents lead the bot to cite wrong information. Upload the new version and archive the old one.