NebulaGraph LangChain Schema Guide
This guide covers the named tags and edge types used when running NebulaGraph with the LangChain backend (GRAPH_BACKEND=langchain).
Note: As of the v3.8.0 upgrade, all setup steps in this guide are automated by _ensure_space_and_schema() in the adapter code (space creation, the Props__ tag, the Relation__ edge type, and all named entity tags and relationship edge types). You only need to run the steps below for manual verification or to use a custom schema.
See NEBULA-SETUP.md for the base infrastructure setup (Docker, nebula-console ADD HOSTS, etc.).
What Is Automatic vs Manual
| Step | Automatic | Manual |
|---|---|---|
| Nebula space creation | Yes, via _ensure_space_and_schema() | Optional verification only |
| Props__ tag, Relation__ edge type | Yes, via _ensure_space_and_schema() | Optional verification only |
| Missing columns on Props__ / Relation__ | Yes, dynamic schema patch adds columns on write | None |
| Named entity tags (Person, Organization, …) | Yes, via _ensure_space_and_schema() | Optional: steps below for custom schema |
| Named edge types (WORKS_FOR, …) | Yes, via _ensure_space_and_schema() | Optional: steps below for custom schema |
Pre-creating named tags and edge types is optional. Without them, all vertices fall back to Props__ and all edges fall back to Relation__ with a label property, and ingestion still works. The steps below are useful for verifying the schema or for defining a custom one beyond what the adapter creates automatically.
How the LangChain Adapter Uses the Schema
Every ingested entity vertex receives two tags:
| Tag | Purpose |
|---|---|
| Props__ | Universal document metadata (doc_id, ref_doc_id, source, file_name, …) on every vertex; required for incremental updates |
| Named entity tag (Person, Organization, …) | Semantic properties extracted by the LLM (name, hire_date, salary, …) |
Every relationship edge uses a named edge type (WORKS_FOR, HAS_DEPARTMENT, …) when one was pre-created. If no named type matches the extracted relationship, the edge falls back to Relation__ with a label property.
The adapter discovers what exists at ingest time via SHOW TAGS / SHOW EDGES — no DDL is issued during ingestion, avoiding NebulaGraph's async schema-propagation delay.
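The routing rule described above can be sketched in a few lines of Python. This is an illustration only, not the adapter's actual code: the function names route_vertex and route_edge are hypothetical, and the sets stand in for whatever SHOW TAGS / SHOW EDGES returned.

```python
from typing import Dict, Set, Tuple

FALLBACK_TAG = "Props__"      # universal metadata tag, present on every vertex
FALLBACK_EDGE = "Relation__"  # generic edge type used when no named type exists


def route_vertex(entity_type: str, existing_tags: Set[str]) -> Tuple[str, ...]:
    """Return the tags a vertex receives (hypothetical helper).

    Props__ is always attached; the named tag is added only if it was
    pre-created in the space.
    """
    if entity_type != FALLBACK_TAG and entity_type in existing_tags:
        return (FALLBACK_TAG, entity_type)
    return (FALLBACK_TAG,)


def route_edge(relation: str, existing_edges: Set[str]) -> Tuple[str, Dict[str, str]]:
    """Return (edge_type, extra_properties) for a relationship (hypothetical helper).

    A relationship without a matching named edge type falls back to
    Relation__ and keeps its original name in the label property.
    """
    if relation != FALLBACK_EDGE and relation in existing_edges:
        return relation, {}
    return FALLBACK_EDGE, {"label": relation}


# Example schema as it might be discovered at ingest time (assumed values).
tags = {"Props__", "Person"}
edges = {"Relation__", "WORKS_FOR"}

person_tags = route_vertex("Person", tags)     # named tag exists
vehicle_tags = route_vertex("Vehicle", tags)   # no named tag: Props__ only
named_edge = route_edge("WORKS_FOR", edges)    # named edge type exists
generic_edge = route_edge("OWNS", edges)       # falls back to Relation__
```

Because the decision depends only on the discovered names, no schema statement has to run while data is being written.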
Step 1: Create Entity-Type Tags
Run these in Nebula Studio after selecting the flexible_graphrag space from the dropdown.
CREATE TAG IF NOT EXISTS `Person`(
`id` STRING,
`name` STRING,
`hire_date` STRING,
`date_of_birth` STRING,
`salary` STRING,
`title` STRING
);
CREATE TAG IF NOT EXISTS `Organization`(
`id` STRING,
`name` STRING,
`industry` STRING,
`founded` STRING,
`description` STRING
);
CREATE TAG IF NOT EXISTS `Company`(
`id` STRING,
`name` STRING,
`industry` STRING,
`founded` STRING,
`description` STRING
);
CREATE TAG IF NOT EXISTS `Location`(
`id` STRING,
`name` STRING,
`address` STRING,
`city` STRING,
`country` STRING,
`latitude` STRING,
`longitude` STRING,
`capacity` STRING
);
CREATE TAG IF NOT EXISTS `Place`(
`id` STRING,
`name` STRING,
`address` STRING,
`city` STRING,
`country` STRING,
`latitude` STRING,
`longitude` STRING,
`capacity` STRING
);
CREATE TAG IF NOT EXISTS `Event`(
`id` STRING,
`name` STRING,
`start_date` STRING,
`end_date` STRING,
`date` STRING,
`description` STRING
);
CREATE TAG IF NOT EXISTS `Product`(
`id` STRING,
`name` STRING,
`description` STRING,
`category` STRING
);
CREATE TAG IF NOT EXISTS `Department`(
`id` STRING,
`name` STRING
);
CREATE TAG IF NOT EXISTS `Project`(
`id` STRING,
`name` STRING,
`status` STRING,
`description` STRING
);
CREATE TAG IF NOT EXISTS `Technology`(
`id` STRING,
`name` STRING,
`description` STRING
);
CREATE TAG IF NOT EXISTS `Skill`(
`id` STRING,
`name` STRING
);
CREATE TAG IF NOT EXISTS `Topic`(
`id` STRING,
`name` STRING,
`description` STRING
);
Verify by running SHOW TAGS in the same space; every tag created above should appear in the result.
Step 2: Create Named Relationship Edge Types
Named edge types allow the LLM to write clean nGQL queries:
MATCH (p:Person)-[:WORKS_FOR]->(c:Organization)
WHERE toLower(id(c)) CONTAINS toLower("acme")
RETURN id(p) AS name
Each edge type carries the four most useful document-tracking columns. Add more columns if needed (see Relation__ in NEBULA-SETUP.md for the full list).
CREATE EDGE IF NOT EXISTS `WORKS_FOR`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `HAS_DEPARTMENT`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `HAS_LOCATION`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `PART_OF`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `AFFILIATED_WITH`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `MANAGES`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `WORKS_IN_DEPARTMENT`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `ASSIGNED_TO`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `LED_BY`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `ATTENDED_BY`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `HOSTED_BY`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `HELD_AT`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `LOCATED_IN`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `BASED_IN`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `RELATED_TO`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `PRODUCED_BY`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `USES_TECHNOLOGY`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `HAS_SKILL`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `SPONSORS`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
CREATE EDGE IF NOT EXISTS `MENTIONED_IN`(`doc_id` STRING, `ref_doc_id` STRING, `source` STRING, `file_name` STRING);
Verify by running SHOW EDGES in the same space; every edge type created above should appear in the result.
Note: If your ontology or LLM extraction introduces additional relationship types, add the corresponding CREATE EDGE IF NOT EXISTS statement before ingesting. Any relationship with no matching named edge type falls back to Relation__ automatically.
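If an ontology adds many relationship types, the extra statements can be generated rather than typed by hand. The following is a minimal sketch, assuming the same four tracking columns as above; create_edge_statement is a hypothetical helper and REPORTS_TO is an invented example type, neither is part of the adapter.

```python
import re
from typing import Iterable

# The four document-tracking columns used by the edge types in this guide.
TRACKING_COLUMNS = ("doc_id", "ref_doc_id", "source", "file_name")

# Conservative identifier check so a name cannot break out of the backticks.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_edge_statement(
    edge_type: str, columns: Iterable[str] = TRACKING_COLUMNS
) -> str:
    """Build one CREATE EDGE IF NOT EXISTS statement with STRING columns."""
    column_names = list(columns)
    for name in [edge_type, *column_names]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    column_ddl = ", ".join(f"`{name}` STRING" for name in column_names)
    return f"CREATE EDGE IF NOT EXISTS `{edge_type}`({column_ddl});"


# REPORTS_TO is a made-up relationship type used only for illustration.
statement = create_edge_statement("REPORTS_TO")
print(statement)
```

The output has the same shape as the statements listed in this step, so it can be pasted into Nebula Studio before ingesting.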
Step 3: Ingest
Run ingestion normally. The adapter calls SHOW TAGS and SHOW EDGES at the start of each ingest to discover what is available and routes each vertex/edge accordingly.
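The discovery step can be pictured as taking one snapshot of the schema per ingest run. The sketch below is an assumption about the shape of that step, not the adapter's implementation: discover_schema and SchemaSnapshot are hypothetical names, and the query runner is injected as a plain callable so the example runs without a NebulaGraph connection.

```python
from typing import Callable, FrozenSet, Iterable, List, NamedTuple


class SchemaSnapshot(NamedTuple):
    """Names of the tags and edge types that existed when the ingest started."""
    tags: FrozenSet[str]
    edges: FrozenSet[str]


def discover_schema(run_show: Callable[[str], Iterable[str]]) -> SchemaSnapshot:
    """Run SHOW TAGS and SHOW EDGES once and keep the results for the whole ingest.

    run_show is a hypothetical callable that executes a statement and
    yields the names from its first result column.
    """
    return SchemaSnapshot(
        tags=frozenset(run_show("SHOW TAGS")),
        edges=frozenset(run_show("SHOW EDGES")),
    )


def fake_run_show(statement: str) -> List[str]:
    """Stand-in for a real session; returns canned names (assumed values)."""
    canned = {
        "SHOW TAGS": ["Props__", "Person", "Organization"],
        "SHOW EDGES": ["Relation__", "WORKS_FOR"],
    }
    return canned[statement]


snapshot = discover_schema(fake_run_show)
has_person_tag = "Person" in snapshot.tags
has_manages_edge = "MANAGES" in snapshot.edges
```

In a real deployment the callable would wrap a database session; taking the snapshot once keeps routing decisions consistent for the duration of the run.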
Querying
Once data is loaded, example nGQL queries for Nebula Studio:
# All employees of Acme
MATCH (p:Person)-[:WORKS_FOR]->(c:Organization)
WHERE toLower(id(c)) CONTAINS toLower("acme")
RETURN id(p) AS name;
# All departments at Acme
MATCH (d:Department)-[:PART_OF]->(c:Organization)
WHERE toLower(id(c)) CONTAINS toLower("acme")
RETURN id(d) AS department;
# Events attended by a person
MATCH (e:Event)-[:ATTENDED_BY]->(p:Person)
WHERE toLower(id(p)) CONTAINS toLower("sarah")
RETURN id(e) AS event;
# All edges (relationships)
MATCH ()-[e]->() RETURN * LIMIT 50;
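When the search term comes from user input, the query text can be assembled by a small helper instead of by hand. This is a sketch under stated assumptions: ngql_string and employees_of are hypothetical helpers, and the escaping assumes that double-quoted nGQL string literals accept backslash escapes.

```python
def ngql_string(value: str) -> str:
    """Quote a Python string as a double-quoted literal.

    Assumption: backslash escapes are honored inside double-quoted
    nGQL strings, so backslashes and double quotes are escaped.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def employees_of(org_fragment: str) -> str:
    """Build the case-insensitive employees query used in this section."""
    return (
        "MATCH (p:Person)-[:WORKS_FOR]->(c:Organization)\n"
        f"WHERE toLower(id(c)) CONTAINS toLower({ngql_string(org_fragment)})\n"
        "RETURN id(p) AS name;"
    )


query = employees_of("acme")
print(query)
```

The generated text matches the first example query above and can be run in Nebula Studio as-is.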