Source Path Examples¶
This document provides detailed examples for configuring SOURCE_PATHS in Flexible GraphRAG across different operating systems and scenarios.
📁 Basic Syntax¶
The SOURCE_PATHS configuration accepts a JSON array of strings, where each string is one of:
- A single file path
- A directory path (processes ALL files in the directory)
A single array can mix files and directories.
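As a rough illustration of the syntax above, the sketch below parses a SOURCE_PATHS value as a JSON array and labels each entry as a file or a directory. It assumes the value is strict JSON. The helper name `classify_source_paths` is hypothetical and is not part of Flexible GraphRAG.

```python
import json
from pathlib import Path


def classify_source_paths(raw: str) -> list[tuple[str, str]]:
    """Parse a SOURCE_PATHS value and label each entry.

    Hypothetical helper for illustration; not Flexible GraphRAG's own parser.
    """
    entries = json.loads(raw)
    if not isinstance(entries, list) or not all(isinstance(e, str) for e in entries):
        raise ValueError("SOURCE_PATHS must be a JSON array of strings")

    labelled = []
    for entry in entries:
        path = Path(entry)
        if path.is_dir():
            kind = "directory"
        elif path.is_file():
            kind = "file"
        else:
            kind = "missing"
        labelled.append((entry, kind))
    return labelled
```

Running it on a candidate value before starting ingestion shows at a glance which entries would be treated as whole directories.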
🪟 Windows Examples¶
Single File¶
# Using forward slashes (recommended)
SOURCE_PATHS=["C:/Documents/report.pdf"]
# Using backslashes (escape required)
SOURCE_PATHS=["C:\\Documents\\report.pdf"]
# Windows "Copy as path" format (paste directly!)
SOURCE_PATHS=["C:\Users\John Doe\Documents\My Report.pdf"]
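To see why the forward-slash and double-backslash forms are equivalent, the sketch below decodes both with Python's standard `json` module and compares them as Windows paths. It also shows that a strict JSON parser rejects single backslashes, so accepting the "Copy as path" form needs extra handling beyond strict JSON. That handling is not reproduced here, and `is_strict_json` is a hypothetical helper.

```python
import json
from pathlib import PureWindowsPath

forward = json.loads('["C:/Documents/report.pdf"]')[0]
# In JSON text each backslash is written twice; r'' keeps them literal here.
escaped = json.loads(r'["C:\\Documents\\report.pdf"]')[0]

# PureWindowsPath treats / and \ as the same separator on any OS.
same = PureWindowsPath(forward) == PureWindowsPath(escaped)


def is_strict_json(raw: str) -> bool:
    """Return True if raw parses as JSON (illustrative helper)."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False


# Single backslashes such as \U are not valid JSON escapes.
copy_as_path_ok = is_strict_json(r'["C:\Users\John Doe\Documents\My Report.pdf"]')
```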
Multiple Files¶
SOURCE_PATHS=["C:/file1.pdf", "D:/folder/file2.docx", "E:/data/file3.txt"]
# With spaces in paths
SOURCE_PATHS=["C:\1 sample files\cmispress.pdf", "D:\My Files\reports\annual.docx"]
Whole Directory¶
# Processes ALL files in the directory
SOURCE_PATHS=["C:/Documents/reports"]
# Multiple directories
SOURCE_PATHS=["C:/Documents/reports", "D:/Projects/data", "E:/Archive"]
Mixed Files and Directories¶
Relative Paths (Windows)¶
# Relative to the project root
SOURCE_PATHS=["./sample-docs", "./data/reports"]
SOURCE_PATHS=["..\\parent-folder\\documents"]
UNC Network Paths¶
# Using forward slashes
SOURCE_PATHS=["//server/share/folder"]
# Using backslashes
SOURCE_PATHS=["\\\\server\\share\\folder"]
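The two UNC spellings above name the same location. A small check with Python's `pathlib`, offered as an illustration and not as the application's own path handling:

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("//server/share/folder")
backslash = PureWindowsPath(r"\\server\share\folder")

# Both spellings compare equal once parsed as Windows paths.
same_location = forward == backslash

# For a UNC path, the "drive" is the \\server\share prefix.
unc_root = forward.drive
```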
🍎 macOS Examples¶
Single File¶
SOURCE_PATHS=["/Users/username/Documents/report.pdf"]
SOURCE_PATHS=["/Applications/MyApp/data/file.txt"]
Multiple Files¶
Whole Directory¶
Relative Paths (macOS)¶
SOURCE_PATHS=["./sample-docs", "../shared-data"]
SOURCE_PATHS=["~/Documents/my-files"] # Home directory shortcut
🐧 Linux Examples¶
Single File¶
Multiple Files¶
Whole Directory¶
Relative Paths (Linux)¶
SOURCE_PATHS=["./local-data", "../shared-folder"]
SOURCE_PATHS=["~/documents/work"] # Home directory shortcut
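The `~` shortcut is normally expanded by a shell, not by the file system, so a program that reads SOURCE_PATHS from configuration has to expand it itself. A minimal sketch of that step, assuming Python's `pathlib` is used. The helper name `resolve_source_path` is hypothetical.

```python
from pathlib import Path


def resolve_source_path(entry: str) -> Path:
    """Expand a leading ~ and make the path absolute (illustrative helper)."""
    return Path(entry).expanduser().resolve()
```

Note that `resolve()` interprets relative entries such as `./local-data` against the current working directory, so this sketch matches "relative to the project root" only when the process is started from that root.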
⚠️ Important Notes¶
Directory Processing Warning¶
When you specify a directory path, the system processes ALL files in that directory. For example, C:/Documents/reports picks up every file stored there.
If you only want specific files, list them individually:
# This processes only these specific files
SOURCE_PATHS=["C:/Documents/reports/q1.pdf", "C:/Documents/reports/q2.pdf"]
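Before pointing SOURCE_PATHS at a directory, it can help to preview what would be picked up. The sketch below expands each entry into concrete files. It assumes a non-recursive listing, which may differ from how Flexible GraphRAG actually walks directories, and the helper name `expand_entries` is hypothetical.

```python
from pathlib import Path


def expand_entries(entries: list[str]) -> list[Path]:
    """Expand files and directories into a flat, sorted list of files.

    Assumption: directories are listed one level deep only.
    """
    files: set[Path] = set()
    for entry in entries:
        path = Path(entry)
        if path.is_dir():
            files.update(child for child in path.iterdir() if child.is_file())
        elif path.is_file():
            files.add(path)
    return sorted(files)
```

If the preview contains files you did not intend to ingest, switch to listing the wanted files individually.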
File Types Supported¶
The system supports these file formats:
- Documents: PDF, DOCX, PPTX, TXT, MD
- Spreadsheets: XLSX, CSV
- Web: HTML
- Images: PNG, JPG (with OCR)
- Markup: AsciiDoc
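A simple way to check files against the list above is an extension lookup. The extension spellings below (for example `.jpeg`, `.htm`, and `.adoc`) are assumptions for illustration, and `is_supported` is a hypothetical helper. The formats the application actually accepts are determined by its document parser.

```python
from pathlib import Path

# Assumed extension spellings for the formats listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".txt", ".md",  # documents
    ".xlsx", ".csv",                          # spreadsheets
    ".html", ".htm",                          # web
    ".png", ".jpg", ".jpeg",                  # images (OCR)
    ".adoc", ".asciidoc",                     # AsciiDoc
}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the assumed supported set."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```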
Path Encoding¶
- Windows: Use forward slashes (/) or double backslashes (\\)
- Spaces: Paths with spaces are supported; no escaping is needed in the JSON array
- Special characters: Most Unicode characters are supported
🖥️ UI Client Differences¶
Different UI clients use different environment variable names:
Backend (FastAPI)¶
Angular Frontend¶
React/Vue Frontends¶
Note: Frontend environment variables typically expect a single directory path, while the backend SOURCE_PATHS accepts multiple files and directories.
🗄️ Repository Path Examples (CMIS/Alfresco)¶
CMIS Repository Paths¶
CMIS uses the standard CMIS path format:
# CMIS paths start with /
CMIS_FOLDER_PATH=/Shared/Documents
CMIS_FOLDER_PATH=/Sites/my-site/documentLibrary/folder
Alfresco Repository Paths¶
NEW (python-alfresco-api 1.1.5+): Alfresco paths now use the native Alfresco format, in either short or full form:
# Short format (recommended - matches what you see in Alfresco Share)
ALFRESCO_PATH=/Shared/GraphRAG
ALFRESCO_PATH=/Sites/my-site/documentLibrary/Reports
ALFRESCO_PATH=/User Homes/admin/My Files
# Full format (also works - system automatically strips /Company Home prefix)
ALFRESCO_PATH=/Company Home/Shared/GraphRAG
ALFRESCO_PATH=/Company Home/Sites/my-site/documentLibrary/Reports
Both formats work! The system automatically strips the /Company Home prefix if present, since the root node already represents Company Home.
Benefits of Native Alfresco Paths:
- ✅ More intuitive - matches what you see in Alfresco Share UI
- ✅ Consistent with Alfresco Content Services API
- ✅ Works with relative_path feature for better performance
- ✅ Flexible - use short format (/Shared) or full format (/Company Home/Shared)
- ✅ Backward compatible - both formats supported
Path Examples:
# Shared folder (short format - recommended)
/Shared/GraphRAG/documents
# Sites (short format - recommended)
/Sites/engineering/documentLibrary/specs
# User Homes (short format - recommended)
/User Homes/admin/My Files
# Data Dictionary (short format - recommended)
/Data Dictionary/Scripts
# Full format also works (optional)
/Company Home/Shared/GraphRAG/documents
/Company Home/Sites/engineering/documentLibrary/specs
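The prefix handling described above can be pictured with a small normalizer. This is an illustrative sketch only, not the code used by Flexible GraphRAG or python-alfresco-api; the function name `normalize_alfresco_path` is hypothetical.

```python
COMPANY_HOME = "/Company Home"


def normalize_alfresco_path(path: str) -> str:
    """Strip a leading /Company Home so both formats compare equal.

    Illustrative only; the real logic lives in the application.
    """
    if path == COMPANY_HOME:
        return "/"
    # Match the prefix only at a folder boundary, so that a folder such as
    # "/Company Homework" is left untouched.
    if path.startswith(COMPANY_HOME + "/"):
        return path[len(COMPANY_HOME):]
    return path
```

With this rule, `/Company Home/Shared/GraphRAG/documents` and `/Shared/GraphRAG/documents` normalize to the same string.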
💡 Best Practices¶
- Use relative paths when possible for portability
- Start small - test with one file before processing large directories
- Use forward slashes on Windows for consistency
- Check file permissions - ensure the application can read the specified paths
- Avoid system directories - stick to user documents and data folders
- Test paths - verify paths exist and are accessible before running
- Alfresco paths - Use native /Company Home/... format for clarity (python-alfresco-api 1.1.5+)
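The permission and existence checks from the list above can be scripted before a run. The sketch below is one possible pre-flight check for local paths. The helper name `check_paths` is hypothetical and the status strings are arbitrary.

```python
import os
from pathlib import Path


def check_paths(entries: list[str]) -> dict[str, str]:
    """Report, per entry, whether it exists and is readable."""
    report = {}
    for entry in entries:
        path = Path(entry).expanduser()
        if not path.exists():
            report[entry] = "missing"
        elif not os.access(path, os.R_OK):
            report[entry] = "not readable"
        else:
            report[entry] = "ok"
    return report
```

A run can then be skipped or narrowed whenever any entry reports something other than "ok".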