MCP Tools Reference

The MCP server provides 9 tools for document ingestion, search, and AI Q&A.

Available Tools

| Tool | What it does |
| --- | --- |
| get_system_status() | Verify setup and database connections |
| ingest_documents(data_source, skip_graph, ...) | Process documents from any of the 13 data sources |
| ingest_text(content, source_name, skip_graph) | Ingest and analyze specific text content; skip_graph=True skips KG extraction |
| search_documents(query, top_k) | Hybrid search: find relevant document excerpts |
| query_documents(query, top_k) | AI-powered Q&A over your document corpus |
| test_with_sample(skip_graph) | Quick system verification with built-in sample content; skip_graph=True for vector-only |
| check_processing_status(id) | Track long-running async ingestion tasks |
| get_python_info() | Python environment diagnostics |
| health_check() | Verify backend API connectivity |

ingest_documents() — Arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| data_source | str | "filesystem" | Source type; see the Data Source JSON Config Strings table below |
| paths | str | None | Filesystem only: file or directory paths, given as a JSON array ["p1","p2"], a comma-separated list, or a single path |
| skip_graph | bool | false | Skip KG extraction and graph store writes; chunk + embed + vector/search only |
| enable_sync | bool | false | Enable automatic change detection and incremental updates for the source |
| <source>_config | str | None | JSON config string for non-filesystem sources (e.g. alfresco_config, s3_config); see table below |

Note

filesystem uses the paths argument, not a JSON config string. All other sources pass their connection details as a JSON string in the corresponding <source>_config argument.
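
The split between paths and <source>_config can be sketched in a few lines of Python. This is a minimal illustration, not part of the MCP server: build_ingest_args is a hypothetical helper, and it assumes that a list of paths may be serialized to the JSON-array string form described for the paths argument.

```python
import json


def build_ingest_args(data_source="filesystem", paths=None, config=None,
                      skip_graph=False, enable_sync=False):
    """Assemble keyword arguments for ingest_documents (hypothetical helper)."""
    args = {
        "data_source": data_source,
        "skip_graph": skip_graph,
        "enable_sync": enable_sync,
    }
    if data_source == "filesystem":
        if config is not None:
            raise ValueError("filesystem takes paths, not a JSON config string")
        if paths is not None:
            # A list becomes a JSON array string; a plain string
            # (single path or comma-separated) is passed through unchanged.
            if isinstance(paths, (list, tuple)):
                args["paths"] = json.dumps(list(paths))
            else:
                args["paths"] = paths
    else:
        if config is None:
            raise ValueError(f"{data_source} needs a {data_source}_config dict")
        # Non-filesystem sources pass connection details as a JSON string
        # in the argument named <source>_config.
        args[f"{data_source}_config"] = json.dumps(config)
    return args


fs_args = build_ingest_args(paths=["/docs/a", "/docs/b"])
web_args = build_ingest_args("web", config={"urls": ["https://example.com"]},
                             skip_graph=True)
```

Here fs_args carries a paths string and no config argument, while web_args carries a web_config JSON string and no paths.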

Data Source JSON Config Strings

| data_source | Config Argument | JSON Fields |
| --- | --- | --- |
| filesystem | paths (not JSON) | File or directory path(s); no config string needed |
| alfresco | alfresco_config | {"base_url": "...", "username": "...", "password": "...", "paths": [...], "nodeDetails": {...}} |
| cmis | cmis_config | {"cmis_url": "...", "username": "...", "password": "...", "paths": [...]} |
| s3 | s3_config | {"bucket": "...", "aws_access_key_id": "...", "aws_secret_access_key": "...", "region": "..."} |
| azure_blob | azure_blob_config | {"connection_string": "...", "container_name": "..."} |
| gcs | gcs_config | {"bucket_name": "...", "credentials_path": "..."} |
| onedrive | onedrive_config | {"client_id": "...", "client_secret": "...", "tenant_id": "..."} |
| google_drive | google_drive_config | {"credentials_path": "...", "folder_id": "..."} |
| sharepoint | sharepoint_config | {"client_id": "...", "client_secret": "...", "tenant_id": "...", "site_url": "..."} |
| box | box_config | {"client_id": "...", "client_secret": "...", "folder_id": "..."} |
| web | web_config | {"urls": ["https://...", "https://..."]} |
| wikipedia | wikipedia_config | {"titles": ["Article Title", "..."]} |
| youtube | youtube_config | {"urls": ["https://youtube.com/watch?v=...", "..."]} |
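
Because each config argument is a string rather than a nested object, it can help to build the value as a dictionary and serialize it with json.dumps, which avoids hand-written quoting mistakes. The sketch below uses the field names from the table above. The bucket name, region, URL, and article title are placeholder values, and reading credentials from environment variables is an assumption, not a documented requirement.

```python
import json
import os

# Field names follow the s3 row of the table; values are placeholders.
s3_config = json.dumps({
    "bucket": "example-bucket",
    "aws_access_key_id": os.environ.get("AWS_ACCESS_KEY_ID", ""),
    "aws_secret_access_key": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
    "region": "us-east-1",
})

# Sources without credentials only need their list field.
web_config = json.dumps({"urls": ["https://example.com/docs"]})
wikipedia_config = json.dumps({"titles": ["Knowledge graph"]})
```

Each resulting string would then be passed in the matching argument, for example ingest_documents(data_source="s3", s3_config=s3_config).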

skip_graph — All Ingest Tools

skip_graph=True is available on all three ingest tools:

ingest_documents(data_source="filesystem", paths="/docs", skip_graph=True)
ingest_text(content="...", source_name="doc.txt", skip_graph=True)
test_with_sample(skip_graph=True)

When skip_graph=True is set, each document is chunked, embedded, and stored in the vector and search indexes, but knowledge graph (KG) extraction and writes to the property graph and RDF stores are skipped. This is useful for fast bulk ingestion when graph queries are not needed.
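
The effect of the flag can be pictured as a toy model of which stages run. This is only an illustration of the behavior described above: planned_stages is a hypothetical function, and the stage names and their order are assumptions, not the server's actual internals.

```python
def planned_stages(skip_graph=False):
    """Return the ingestion stages that would run (illustrative model only)."""
    # These stages run regardless of the flag.
    stages = ["chunk", "embed", "vector index write", "search index write"]
    if not skip_graph:
        # Graph work is added only when skip_graph is left at its default.
        stages += ["KG extraction", "graph store write"]
    return stages


full_run = planned_stages()
vector_only = planned_stages(skip_graph=True)
```

In this model, vector_only omits the two graph stages.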