Google Cloud Spanner Graph Setup
Prerequisites
- Google Cloud project with Cloud Spanner API enabled
- Spanner instance created in the GCP Console
- Spanner database created inside the instance
- Python packages:
  - llama-index-spanner (included in .[spanner-extras])
  - google-cloud-spanner (included in .[spanner-extras])
Note: The Spanner emulator only supports SQL — it does not support Spanner Graph (property graph queries). Use a real Spanner instance.
Install
uv pip install --python venv-3.13/Scripts/python.exe \
-e ".[langchain,langchain-extras,spanner-extras]" \
--override extras-overrides.txt
uv pip uninstall --python venv-3.13/Scripts/python.exe llama-index
The llama-index meta-package is pulled in by llama-index-spanner and downgrades other
packages. Always uninstall it after installing spanner-extras.
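As a quick sanity check after the install and uninstall steps, a small script can confirm that the meta-package is gone and the two Spanner packages are present. This is a minimal sketch using only the standard library; the helper names `installed_version` and `check_spanner_env` are illustrative and not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def check_spanner_env() -> List[str]:
    """Collect human-readable problems with the current environment."""
    problems = []
    # The meta-package should have been uninstalled after installing spanner-extras.
    if installed_version("llama-index") is not None:
        problems.append("llama-index meta-package is installed; uninstall it")
    # Both Spanner packages are listed as prerequisites.
    for required in ("llama-index-spanner", "google-cloud-spanner"):
        if installed_version(required) is None:
            problems.append(f"{required} is missing")
    return problems


if __name__ == "__main__":
    for problem in check_spanner_env():
        print(problem)
```

Run it with the same interpreter that `uv pip` installed into; an empty output means none of the three checks failed.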
Create a Spanner Instance and Database
In the GCP Console:
- Cloud Spanner → Create Instance
- Choose instance ID, configuration (regional or multi-region), and compute capacity
- Create Database inside the instance
- Database dialect: Google Standard SQL (not PostgreSQL dialect)
- Leave DDL statements blank — the adapter creates the schema automatically on first ingest
Schema Auto-Creation
SpannerPropertyGraphStore (from llama-index-spanner) creates all Spanner tables and the
property graph definition automatically on the first upsert_nodes call during ingest.
Do not create the tables or run the CREATE PROPERTY GRAPH DDL manually. The library manages
the creation order:
- CREATE TABLE {graph_name}_NODE (id STRING, label STRING, properties JSON, ...) PRIMARY KEY (id)
- CREATE TABLE {graph_name}_EDGE (id STRING, dest_id STRING, ...) REFERENCES {graph_name}_NODE
- CREATE PROPERTY GRAPH {graph_name} NODE TABLES ({graph_name}_NODE ...) EDGE TABLES ({graph_name}_EDGE ...)
For the default graph_name=knowledge_graph these are knowledge_graph_NODE and
knowledge_graph_EDGE. The DYNAMIC LABEL / DYNAMIC PROPERTIES clauses allow all entity and
relation types to share the two base tables (schemaless mode).
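To confirm that the first ingest created the two base tables, one option is to list the tables Spanner reports and compare them with the expected names. The sketch below assumes a `database` object from the google-cloud-spanner client (anything with a compatible `snapshot()` works) and queries `INFORMATION_SCHEMA.TABLES`; the helper names are illustrative and not part of llama-index-spanner.

```python
from typing import List


def graph_table_names(graph_name: str = "knowledge_graph") -> List[str]:
    """Base table names used in flexible-schema mode, in creation order."""
    return [f"{graph_name}_NODE", f"{graph_name}_EDGE"]


def missing_graph_tables(database, graph_name: str = "knowledge_graph") -> List[str]:
    """Return the expected base tables that do not exist in the database yet.

    `database` is assumed to behave like
    google.cloud.spanner_v1.database.Database (read-only snapshot queries).
    """
    # In the GoogleSQL dialect, user tables live in the unnamed default schema.
    sql = "SELECT table_name FROM information_schema.tables WHERE table_schema = ''"
    with database.snapshot() as snapshot:
        existing = {row[0] for row in snapshot.execute_sql(sql)}
    return [name for name in graph_table_names(graph_name) if name not in existing]


# Usage sketch (requires real credentials and a real instance):
# from google.cloud import spanner
# client = spanner.Client(project="my-gcp-project")
# database = client.instance("my-instance").database("my-database")
# print(missing_graph_tables(database))
```

Before the first ingest both names should come back as missing; after it the list should be empty.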
IAM Permissions
The service account or user needs:
| Role | Purpose |
|---|---|
| roles/spanner.databaseUser | Read/write data (sessions, queries, mutations) |
| roles/spanner.databaseAdmin | Create tables and property graph DDL on first ingest |
Grant via GCP Console (IAM & Admin → IAM → your service account → Edit → Add role)
or via gcloud:
# Data user role (read/write):
gcloud spanner databases add-iam-policy-binding <database-id> \
--instance=<instance-id> \
--project=<project-id> \
--member="serviceAccount:<sa-email>" \
--role="roles/spanner.databaseUser"
# Admin role (DDL — needed only on first ingest):
gcloud spanner databases add-iam-policy-binding <database-id> \
--instance=<instance-id> \
--project=<project-id> \
--member="serviceAccount:<sa-email>" \
--role="roles/spanner.databaseAdmin"
The service account email is the client_email field in your service account JSON key file.
Authentication
Priority order:
- credentials_file in SPANNER_GRAPH_DB_CONFIG: path to a service account JSON key file
- GOOGLE_APPLICATION_CREDENTIALS environment variable
- flexible-graphrag/gcs.json: auto-detected if the file exists next to the package root
- Application Default Credentials (gcloud auth application-default login or GCE metadata)
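The priority order can be read as a first-match-wins lookup. The following is a minimal sketch of that logic, not the project's actual implementation: the function name `resolve_credentials`, its return labels, and the way the auto-detected path is passed in are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    config: Mapping[str, object],
    env: Mapping[str, str] = os.environ,
    auto_detect_path: Path = Path("flexible-graphrag/gcs.json"),
) -> Tuple[str, Optional[str]]:
    """Return (source, key_file_path) following the documented priority order.

    A path of None means "no key file": fall back to Application Default
    Credentials (gcloud login or GCE metadata).
    """
    # 1. Explicit credentials_file inside SPANNER_GRAPH_DB_CONFIG.
    explicit = config.get("credentials_file")
    if explicit:
        return "config", str(explicit)
    # 2. Standard Google environment variable.
    from_env = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if from_env:
        return "env", from_env
    # 3. Auto-detected key file, only if it actually exists on disk.
    if Path(auto_detect_path).is_file():
        return "auto-detected", str(auto_detect_path)
    # 4. Nothing found: let the Google client libraries use ADC.
    return "adc", None


if __name__ == "__main__":
    print(resolve_credentials({"credentials_file": "./gcs.json"}))
```

Passing `env` and `auto_detect_path` as parameters keeps the lookup easy to test without touching the real environment.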
Configuration
PG_GRAPH_DB=spanner
# Service account JSON key file:
SPANNER_GRAPH_DB_CONFIG={"project_id": "my-gcp-project", "instance_id": "my-instance", "database_id": "my-database", "graph_name": "knowledge_graph", "credentials_file": "./gcs.json"}
# Application Default Credentials (gcloud auth):
SPANNER_GRAPH_DB_CONFIG={"project_id": "my-gcp-project", "instance_id": "my-instance", "database_id": "my-database", "graph_name": "knowledge_graph"}
Config Keys
| Key | Required | Description |
|---|---|---|
| project_id | Yes | GCP project ID |
| instance_id | Yes | Spanner instance ID |
| database_id | Yes | Spanner database ID |
| graph_name | No | Property graph name (default: knowledge_graph) |
| credentials_file | No | Path to service account JSON key file |
| use_flexible_schema | No | true (default): {graph_name}_NODE / {graph_name}_EDGE tables with JSON properties (schemaless); false: one table per entity type |
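Because SPANNER_GRAPH_DB_CONFIG is a single JSON string, a typo in it is easy to miss. The sketch below parses the value, checks the three required keys, and fills in the documented defaults. The function name `load_spanner_config` is illustrative; the project's own loader may behave differently.

```python
import json
import os
from typing import Any, Dict, Optional

REQUIRED_KEYS = ("project_id", "instance_id", "database_id")
# Defaults taken from the Config Keys table.
DEFAULTS = {"graph_name": "knowledge_graph", "use_flexible_schema": True}


def load_spanner_config(raw: Optional[str] = None) -> Dict[str, Any]:
    """Parse SPANNER_GRAPH_DB_CONFIG and apply the documented defaults."""
    if raw is None:
        raw = os.environ.get("SPANNER_GRAPH_DB_CONFIG", "")
    if not raw:
        raise ValueError("SPANNER_GRAPH_DB_CONFIG is not set")
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("SPANNER_GRAPH_DB_CONFIG must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    # Explicit values override the defaults.
    return {**DEFAULTS, **config}


if __name__ == "__main__":
    example = '{"project_id": "my-gcp-project", "instance_id": "my-instance", "database_id": "my-database"}'
    print(load_spanner_config(example))
```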
Framework Support
Spanner is supported through LlamaIndex (LI) only, via llama-index-spanner (SpannerPropertyGraphStore).
GRAPH_BACKEND=llamaindex is the only supported backend.
The langchain-google-spanner package requires langchain-core<1.0, which is incompatible
with the langchain>=1.0 used by this project. LangChain (LC) support will be added if a
compatible version is released.
Cleanup
Cleanup deletes all rows from {graph_name}_EDGE, then from {graph_name}_NODE. Edges go first
to respect the foreign key from the edge table to the node table. With the default graph name
these are knowledge_graph_EDGE and knowledge_graph_NODE. If the tables do not exist yet (no
ingest has run), cleanup skips them silently. Cleanup requires the roles/spanner.databaseUser
IAM role on the database.
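The delete order described above can be sketched as follows. This is an illustration under stated assumptions, not the project's cleanup code: it assumes a `database` object from google-cloud-spanner and uses its `execute_partitioned_dml` method, and it takes the set of existing table names as a hypothetical `existing_tables` argument gathered beforehand (for example from INFORMATION_SCHEMA.TABLES).

```python
import re
from typing import Iterable, List


def cleanup_graph(
    database,
    existing_tables: Iterable[str],
    graph_name: str = "knowledge_graph",
) -> List[str]:
    """Delete all rows from the edge table, then the node table.

    Tables that are not in `existing_tables` are skipped silently.
    Returns the names of the tables that were actually cleared.
    """
    # Table names cannot be bound as query parameters, so validate the identifier.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", graph_name):
        raise ValueError(f"invalid graph name: {graph_name!r}")
    existing = set(existing_tables)
    cleared = []
    # Edges first: the edge table references the node table.
    for suffix in ("EDGE", "NODE"):
        table = f"{graph_name}_{suffix}"
        if table not in existing:
            continue
        # Spanner DML requires a WHERE clause, even when deleting every row.
        database.execute_partitioned_dml(f"DELETE FROM {table} WHERE true")
        cleared.append(table)
    return cleared
```

Partitioned DML is chosen here because it avoids per-transaction mutation limits on large tables; a regular read-write transaction would also work for small graphs.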