What can you do with it?

Transform your document collections into an intelligent, searchable knowledge base that understands context and meaning, not just keywords. Ask natural language questions and get precise AI-generated answers sourced from your documents, or dive deeper with raw search results that show exactly which documents contain relevant information with confidence scores. Beyond search, you can build and maintain your knowledge base by uploading documents, creating text files directly, and organizing content with metadata. Use smart filtering to narrow results by document names, custom attributes, or date ranges to find exactly what you need.

Appropriate use cases

Knowledge bases use vector stores designed for finding the most relevant context from large document collections, not for exhaustive searches. They excel at semantic retrieval: locating the most contextually relevant passages to answer natural-language questions. Put another way, vector stores solve the “needle in a haystack” problem but aren’t designed for “taking inventory of the entire warehouse.”
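
To make the distinction concrete, here is a toy sketch in Python of top-k retrieval. The two-dimensional vectors and chunk names are invented for illustration; real embeddings have far more dimensions, and the knowledge base's actual ranking may work differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k):
    """Rank every chunk by similarity to the query and keep only the best k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in chunks.items()]
    scored.sort(reverse=True)
    return scored[:k]

# Invented toy "embeddings": the first axis loosely means "transport expenses".
chunks = {
    "travel-policy": [0.9, 0.1],
    "rideshare-faq": [0.8, 0.3],
    "parking-rules": [0.7, 0.6],
    "holiday-schedule": [0.1, 0.9],
}

results = top_k([1.0, 0.0], chunks, k=2)
names = [name for _, name in results]
print(names)
```

With k=2, the related "parking-rules" chunk is dropped even though it is somewhat relevant. That cutoff is what makes this style of retrieval a poor fit for "list everything" questions.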

Best for (finding the needle):

  • Conversational AI & Agents: Answering specific, context-rich queries like “What are our expense policies for rideshares?” or “How much vacation time do new employees get?”
  • Semantic Search: Finding information by meaning rather than exact keywords, even when specific terms aren’t present
  • Contextual Grounding: Getting intelligent, ranked summaries of the most relevant evidence from your documents

Not ideal for (cataloging everything):

  • Exhaustive searches: Batch queries like “List every policy document that mentions transportation expenses” that need comprehensive results across all documents
  • Complete data extraction: Vector stores return only the 20-50 most relevant chunks, not every possible match
  • Audit-style operations: When you need to verify that every document has been checked or processed

The system is optimized for smart, targeted retrieval that provides the best answers quickly, rather than comprehensive document listing. For exhaustive searches, combine with traditional filtering or use complementary search approaches.

⚠️ Important: After uploading files to a knowledge base, it may take a few minutes (occasionally hours) for the content to be processed and become available for search. During this time the system extracts text, analyzes content, and indexes it for semantic search.
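
Because of this processing delay, an automation that uploads a file and then queries it may want to poll until results appear. Below is a minimal Python sketch. It assumes you supply a `search` callable (hypothetical; it stands in for however you invoke the command with action: search) that returns a list of result chunks. The delays and timeout are illustrative defaults, not documented values.

```python
import time

def wait_until_searchable(search, query, timeout_s=600.0,
                          initial_delay_s=5.0, max_delay_s=60.0,
                          sleep=time.sleep):
    """Call search(query) repeatedly, backing off, until it returns results.

    `search` is a caller-supplied function (an assumption of this sketch).
    Raises TimeoutError if nothing is found before the timeout.
    """
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        results = search(query)
        if results:
            return results
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"no results for {query!r} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

An empty result can also mean the query simply has no match, so use a query you know the new document should answer.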

Search & Query Operations

How to search your knowledge base

Basic Command Structure

/knowledgebase-collection
search-query: [your question]

Parameters

Required:
  • search-query - Your question or search terms in natural language
Optional:
  • action - “ask” (default - AI answer) or “search” (raw document chunks)
  • filters - Filter results by document attributes
  • max-results - Number of results to return (1-50, default: 10)
  • score-threshold - Minimum relevance score (0.0-1.0). Higher = more relevant but fewer results
  • rewrite-query - Auto-rewrite query for better retrieval (true/false, default: false)
  • system-prompts - Instructions for AI response style (ask only). Examples: “Explain like I’m a 10 year old”, “Summarize entirely with bullet points”
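
If you generate these commands from code, it can help to validate the documented ranges first. The following Python sketch is one way to do that. The function name `build_search_command` is hypothetical, and the sketch assumes the order of the optional lines does not matter.

```python
def build_search_command(search_query, action="ask", filters=None,
                         max_results=10, score_threshold=None,
                         rewrite_query=False, system_prompts=None):
    """Render the command text, enforcing the limits listed above."""
    if not search_query.strip():
        raise ValueError("search-query is required")
    if action not in ("ask", "search"):
        raise ValueError("action must be 'ask' or 'search'")
    if not 1 <= max_results <= 50:
        raise ValueError("max-results must be between 1 and 50")
    if score_threshold is not None and not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score-threshold must be between 0.0 and 1.0")
    if system_prompts is not None and action != "ask":
        raise ValueError("system-prompts applies to action 'ask' only")

    lines = ["/knowledgebase-collection", f"search-query: {search_query}"]
    # Only emit optional parameters that differ from their defaults.
    if action != "ask":
        lines.append(f"action: {action}")
    if filters:
        lines.append(f"filters: {filters}")
    if max_results != 10:
        lines.append(f"max-results: {max_results}")
    if score_threshold is not None:
        lines.append(f"score-threshold: {score_threshold}")
    if rewrite_query:
        lines.append("rewrite-query: true")
    if system_prompts:
        lines.append(f'system-prompts: "{system_prompts}"')
    return "\n".join(lines)

print(build_search_command("API authentication methods",
                           action="search", max_results=20,
                           score_threshold=0.6))
```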

Filter Format

Use simple expressions to filter search results:
filters: itemName = "TravelPolicy.pdf"
filters: area = "hr" AND content_type = "text/plain"
filters: date > "2024-01-01" OR filename = "handbook.pdf"
Operators: =, !=, >, >=, <, <=, AND, OR
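
Filter expressions can also be assembled programmatically. The Python sketch below uses two hypothetical helpers, `condition` and `combine`. It rejects values containing double quotes because no escaping rule is documented here, and it joins with a single AND or OR because the precedence of mixed AND/OR expressions is not specified.

```python
COMPARISONS = {"=", "!=", ">", ">=", "<", "<="}

def condition(field, op, value):
    """Render one comparison, e.g. area = "hr"."""
    if op not in COMPARISONS:
        raise ValueError(f"unsupported operator: {op}")
    text = str(value)
    if '"' in text:
        # Escaping rules are not documented, so refuse rather than guess.
        raise ValueError("values containing double quotes are not supported")
    return f'{field} {op} "{text}"'

def combine(joiner, *conditions):
    """Join conditions with a single logical operator (AND or OR)."""
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be AND or OR")
    if not conditions:
        raise ValueError("at least one condition is required")
    return f" {joiner} ".join(conditions)

expr = combine("AND",
               condition("area", "=", "hr"),
               condition("content_type", "=", "text/plain"))
print("filters: " + expr)
```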

Query Rewriting

The rewrite-query parameter controls whether your natural language query will be automatically rewritten before executing the vector search:
  • rewrite-query: true - Automatically rewrites your query to improve retrieval performance, which can restructure or clarify ambiguous questions for better relevance
  • rewrite-query: false - Your query is used exactly as provided (default)
Use query rewriting to improve recall for suboptimal or ambiguous queries. Disable it when you need full control and predictability over input, or for debugging and precise benchmarking.

Search Examples

Basic Question

/knowledgebase-collection
search-query: what are expense policies
Get an AI-generated answer about expense policies.

Raw Search Results (document chunks)

/knowledgebase-collection
search-query: customer data formats
action: search
Get raw search results with relevance scores.

Filter by Document

/knowledgebase-collection
search-query: what are expense policies
filters: itemName = "TravelPolicy.pdf"
Search within a specific document only.

/knowledgebase-collection
search-query: customer data formats
action: search
filters: area = "crm" AND content_type = "text/plain"
Get raw results filtered to CRM text documents.

Tuned Search Results

/knowledgebase-collection
search-query: API authentication methods
action: search
max-results: 20
score-threshold: 0.6
Get up to 20 results, each with a relevance score of at least 0.6.

Query Rewriting for Better Results

/knowledgebase-collection
search-query: how do I auth
action: search
rewrite-query: true
Automatically improve ambiguous or incomplete queries for better retrieval.

Custom Response Style

/knowledgebase-collection
search-query: what are our expense policies
system-prompts: "Explain like I'm a 10 year old"
Get an AI answer with specific formatting or explanation style.

Search Response Format

The search commands return:
{
  "vectorStoreId": "vector store identifier",
  "response": "AI-generated answer (ask only)",
  "data": [{
    "file_id": "file identifier",
    "filename": "document name",
    "score": "relevance score",
    "content": [{"type": "text", "text": "content snippet"}]
  }]
}
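
A response in this shape can be reduced to a ranked list of snippets. The Python sketch below is a minimal example. The sample payload is made up, and it assumes `score` arrives as a number (or numeric string) between 0 and 1, which the format above does not state explicitly.

```python
import json

def ranked_snippets(response, min_score=0.0):
    """Return (score, filename, text) tuples, best match first."""
    rows = []
    for item in response.get("data", []):
        score = float(item["score"])  # assumption: numeric or numeric string
        if score < min_score:
            continue
        # Join all text parts of the chunk into one snippet.
        text = " ".join(part["text"] for part in item.get("content", [])
                        if part.get("type") == "text")
        rows.append((score, item["filename"], text))
    return sorted(rows, key=lambda row: row[0], reverse=True)

# Made-up sample payload following the documented structure.
sample = json.loads("""
{
  "vectorStoreId": "vs_example",
  "data": [
    {"file_id": "f1", "filename": "handbook.pdf", "score": 0.41,
     "content": [{"type": "text", "text": "General conduct guidelines."}]},
    {"file_id": "f2", "filename": "TravelPolicy.pdf", "score": 0.82,
     "content": [{"type": "text", "text": "Rideshares are reimbursable for business travel."}]}
  ]
}
""")

for score, filename, text in ranked_snippets(sample, min_score=0.5):
    print(f"{score:.2f}  {filename}: {text}")
```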

File Management Operations

Managing documents in your knowledge base

File management in knowledge bases works exactly the same as filestorage - same upload methods, same parameters, same operations. The key difference is that files uploaded to knowledge bases are automatically processed and indexed for semantic search.

Quick Examples

Upload a file:
/knowledgebase-collection upload file with:
file: [file buffer or file input]
metadata: "Policy document"
Create a text document:
/knowledgebase-collection create document with:
filename: "policy-summary.txt"
content: "Company policies summary..."
List all files:
/knowledgebase-collection list files:
format: light
For complete file management documentation including all upload methods, parameters, and operations, see the Filestorage Guide. All those operations work identically with knowledge bases.

Response Format

File Operations

File creation/update:
{
  "message": "File created successfully",
  "id": "file-identifier",
  "file_url": "https://...",
  "file_size": 1024,
  "mime_type": "text/plain",
  "metadata": {
    "attr1": "value1",
    "batchId": "batch_2025-07-28_15-43-47",
    "originalFilename": "example.txt"
  },
  "collectionId": "collection-id"
}
File metadata (full details):
{
  "filename": "policy.txt",
  "filepath": "file-storage/collection-id/policy.txt",
  "type": "text/plain",
  "createdAt": "2024-11-17T18:04:22.197Z",
  "updatedAt": "2024-11-17T18:09:46.455Z",
  "metadata": {"description": "Policy document"},
  "isPublic": false,
  "signedUrl": "https://...",
  "accessList": [
    {
      "action": "create",
      "providerId": "user-id",
      "date": "2024-11-17T18:04:22.197Z"
    }
  ]
}
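
The timestamps in the metadata are ISO 8601 strings in UTC (trailing "Z"). As a small Python sketch, the hypothetical helpers below parse them and report how long after creation a file was last updated, using the values from the example above.

```python
from datetime import datetime

def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2024-11-17T18:04:22.197Z."""
    # Replace the trailing Z so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def seconds_between_create_and_update(meta):
    """Seconds elapsed between createdAt and updatedAt."""
    created = parse_timestamp(meta["createdAt"])
    updated = parse_timestamp(meta["updatedAt"])
    return (updated - created).total_seconds()

meta = {
    "filename": "policy.txt",
    "createdAt": "2024-11-17T18:04:22.197Z",
    "updatedAt": "2024-11-17T18:09:46.455Z",
}

print(seconds_between_create_and_update(meta))
```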

File Types and Content Extraction

Supported File Types

Knowledge bases support the following file types for upload and content extraction:
  • Document files: PDF, MD, HTML, DOC/DOCX, PPT/PPTX, TXT
  • Data files: JSON, YML/YAML, CSV, XLS/XLSX
  • Code files: C/CPP, CS, GO, JAVA, JS/TS, PHP, PY, RB, SH, TEX

Content Extraction

All supported file types are automatically processed and their content extracted for semantic search in the knowledge base. This includes:
  • Document files: PDF, Word documents, PowerPoint presentations, Markdown, HTML, plain text
  • Data files: JSON, YAML, CSV, Excel spreadsheets
  • Code files: Source files in the supported languages listed above are indexed with syntax awareness
The system extracts text content and makes it searchable through the RAG (Retrieval-Augmented Generation) system.

File Type Handling

Text-based files (TXT, MD, HTML, JSON, YAML, CSV, code files):
  • Content is directly indexed and searchable
  • Can be updated with create/append operations
  • Typically available for semantic search shortly after creation, subject to the indexing delay noted earlier
Document files (PDF, DOC/DOCX, PPT/PPTX, XLS/XLSX):
  • Content is extracted and processed for search indexing
  • May take a few minutes (occasionally longer) to become fully searchable after upload
  • Use signedUrl from metadata to access original file
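
Before uploading, a script can check which handling category a file falls into. The Python sketch below assumes file extensions match the type names listed above in lowercase (for example .yml and .yaml); the function name `handling_for` is hypothetical.

```python
from pathlib import Path

# Extensions derived from the supported types above (an assumption of this sketch).
TEXT_BASED = {"txt", "md", "html", "json", "yml", "yaml", "csv",
              "c", "cpp", "cs", "go", "java", "js", "ts",
              "php", "py", "rb", "sh", "tex"}
EXTRACTED_DOCUMENTS = {"pdf", "doc", "docx", "ppt", "pptx", "xls", "xlsx"}

def handling_for(filename):
    """Return 'text-based', 'document', or 'unsupported' for a filename."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in TEXT_BASED:
        return "text-based"
    if extension in EXTRACTED_DOCUMENTS:
        return "document"
    return "unsupported"

for name in ("TravelPolicy.pdf", "notes.MD", "photo.png"):
    print(name, "->", handling_for(name))
```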

File Upload Methods

  1. Direct File Upload: Use when you have a file buffer or file input from user uploads
  2. Artifact Upload: Use when referencing files created in previous automation steps
  3. Content Creation: Use when you want to create documents programmatically with text content