The Problem: When RAG Alone Isn't Enough
I've deployed dozens of RAG systems for utilities over the past five years. The pattern is predictable: ingest operational documents, embed them in a vector store, retrieve context for LLM queries. Works beautifully for semantic search over maintenance manuals or incident reports.
Then reality hits. A grid operator asks: "Show me all outages in the northwest sector caused by equipment manufactured before 2015 that also affected customers with medical baseline status." Your pure vector system chokes. Why? Because this query requires traversing relationships—equipment to location to outage to customer—while also doing semantic matching on incident descriptions. Vector similarity doesn't model connections between entities. Graph traversal doesn't understand semantic meaning.
In energy operations, you constantly need both. Equipment genealogy trees. Causal chains from sensor anomaly to protection relay trip to line fault. Regulatory lineage connecting a NERC standard to your compliance procedures to specific protection schemes. These are graph problems. But you also need semantic search over unstructured operator logs, engineering studies, and vendor documentation. That's vector territory.
The solution isn't choosing one over the other. It's architectural integration. If you're evaluating whether your organization is ready for this level of infrastructure complexity, the AI Readiness Assessment can help clarify your maturity baseline.
The Hybrid Vector-Graph Pattern
Core Architecture
The pattern uses two specialized databases working in concert:
- Vector store (Qdrant): Handles semantic similarity, embedding-based retrieval, and unstructured content
- Graph database (Neo4j): Models entities, relationships, hierarchies, and traversable knowledge structures
- Orchestration layer: Coordinates queries across both systems and merges results
At EthosPower, we standardized on Qdrant for vectors and Neo4j for graphs after extensive benchmarking. Qdrant's HNSW implementation consistently delivers sub-10ms p99 latency on 10M+ vector collections in our air-gapped OT deployments. Neo4j's Cypher query language maps naturally to how energy engineers think about system topology.
Data Flow
When you ingest an operational document—say, a root cause analysis report for a transformer failure:
- Parse and extract: Pull out entities (equipment IDs, timestamps, personnel, locations), relationships (caused_by, preceded_by, affected), and unstructured narrative text
- Dual write: embed the narrative text and upsert it to Qdrant; write the extracted entities and relationships to Neo4j
- Cross-reference: Store Qdrant point IDs as properties on Neo4j nodes; store Neo4j node IDs in Qdrant metadata
This bidirectional linkage is critical. It lets you start with either database and pivot to the other.
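The dual write plus cross-reference step can be sketched as a pure function that builds both payloads before anything touches a database. The function name, the property names `qdrant_point_id` and `neo4j_node_ids`, and the omission of the actual embedding call are all illustrative, not part of any client API:

```python
import uuid


def build_dual_write(doc_text: str, entities: list[dict]) -> tuple[dict, list[dict]]:
    """Build cross-referenced payloads for one ingested document.

    Returns a Qdrant point (embedding omitted for brevity) and a list of
    Neo4j node property maps. Each side carries the other store's ID so
    a query can start in either database and pivot to the other.
    """
    point_id = str(uuid.uuid4())
    nodes = []
    for ent in entities:
        nodes.append({
            "id": str(uuid.uuid4()),
            "name": ent["name"],
            "label": ent["label"],
            "qdrant_point_id": point_id,  # graph -> vector pivot
        })
    point = {
        "id": point_id,
        "payload": {
            "text": doc_text,
            "neo4j_node_ids": [n["id"] for n in nodes],  # vector -> graph pivot
        },
    }
    return point, nodes
```

In a real pipeline these payloads feed the Qdrant upsert and the Neo4j `MERGE` respectively; keeping the ID wiring in one function makes the cross-reference invariant easy to test.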
Query Patterns
Three primary patterns emerge:
Pattern 1: Graph-first with vector enrichment
User asks: "What do we know about failures on equipment downstream of substation XYZ?"
- Query Neo4j to traverse topology from XYZ outward
- Extract equipment IDs from graph results
- Query Qdrant with equipment IDs as filters plus semantic search on "failure" concepts
- Merge: graph provides structure, vectors provide semantic detail
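The steps above reduce to two query builders, one per store. The node labels, relationship types, and the `equipment_id` payload field are assumptions about your schema, and the Qdrant filter is shown in its REST-style dict form rather than as a client call:

```python
def downstream_equipment_query(max_hops: int = 4) -> str:
    """Cypher to walk topology outward from a substation.

    Parameterized ($substation_id) rather than string-interpolated,
    so user input never lands inside the query text.
    """
    return (
        "MATCH (s:Substation {id: $substation_id})"
        f"-[:CONTAINS|FEEDS*1..{max_hops}]->(e) "
        "RETURN DISTINCT e.id AS equipment_id"
    )


def failure_search_request(equipment_ids: list[str]) -> dict:
    """Qdrant search body: semantic query on failure concepts,
    restricted to the equipment the graph step returned."""
    return {
        "query_text": "equipment failure fault trip",  # embedded upstream
        "filter": {
            "must": [
                {"key": "equipment_id", "match": {"any": equipment_ids}}
            ]
        },
        "limit": 20,
    }
```

The merge step then attaches each Qdrant hit back to its position in the topology returned by the Cypher query.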
Pattern 2: Vector-first with graph context
User asks: "Find incidents similar to this description: 'voltage sag preceded by harmonic distortion'"
- Embed query, search Qdrant for semantically similar incident reports
- Extract entity IDs from top results
- Query Neo4j to expand context: what equipment was involved, what else failed around the same time, who responded
- Merge: vectors find relevant content, graph expands situational context
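Pattern 2's pivot works because the Qdrant payload carries the `neo4j_node_ids` cross-reference from ingestion. A sketch of the expansion step, with hypothetical relationship types in the Cypher:

```python
def expand_incident_context(vector_hits: list[dict]) -> tuple[str, dict]:
    """From top Qdrant hits, collect the linked Neo4j node IDs and
    build a parameterized Cypher query that pulls surrounding context:
    involved equipment, causally linked events, responders."""
    node_ids = sorted({
        nid
        for hit in vector_hits
        for nid in hit["payload"].get("neo4j_node_ids", [])
    })
    cypher = (
        "MATCH (n) WHERE n.id IN $node_ids "
        "OPTIONAL MATCH (n)-[:INVOLVED|CAUSED|PRECEDED*1..2]-(ctx) "
        "RETURN n, collect(DISTINCT ctx) AS context"
    )
    return cypher, {"node_ids": node_ids}
```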
Pattern 3: Hybrid traversal
User asks: "Show protection coordination for all relays protecting equipment similar to this failed transformer"
- Embed transformer failure characteristics, query Qdrant for similar equipment
- Feed equipment IDs into Neo4j to traverse protection relay relationships
- Return relay settings documents from Qdrant based on relay IDs
- This is the most complex pattern—requires tight orchestration
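Because Pattern 3 chains three store round-trips, the orchestration is easiest to keep honest if each store sits behind an injected callable. A minimal sketch of the pipeline shape (the callables stand in for the Qdrant and Neo4j clients):

```python
from typing import Callable


def protection_coordination(
    failure_text: str,
    vector_search: Callable[[str], list[str]],       # text -> similar equipment IDs
    relay_lookup: Callable[[list[str]], list[str]],  # equipment IDs -> relay IDs (graph)
    doc_fetch: Callable[[list[str]], list[dict]],    # relay IDs -> settings docs (vector)
) -> list[dict]:
    """Pattern 3 pipeline: vector -> graph -> vector. Each hop depends
    on the previous one's output, so the calls are strictly sequential."""
    similar_equipment = vector_search(failure_text)
    relay_ids = relay_lookup(similar_equipment)
    return doc_fetch(relay_ids)
```

Dependency injection here is a testability choice: you can exercise the chaining logic with stubs before wiring in live clients.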
Implementation Considerations
Entity Resolution
Your biggest headache will be entity disambiguation. Is "T-1401" the same as "TR1401" or "Transformer 1401"? Energy utilities have decades of inconsistent naming conventions across SCADA, GIS, EAM, and document management systems.
I solve this with a canonical entity registry in Neo4j. Every equipment asset gets a UUID. All name variants become properties or aliases. When ingesting documents, run entity extraction through the registry to normalize before writing to either database. This adds 50-100ms per document but prevents query chaos later.
Embedding Strategy
For technical energy documents, I use domain-specific embedding models over general-purpose ones. Specifically:
- Operational logs and incident reports: Fine-tune `BAAI/bge-large-en-v1.5` on your utility's incident corpus
- Engineering standards and specs: Use `sentence-transformers/all-mpnet-base-v2` with no fine-tuning; standards language is already formal and consistent
- Conversational queries: Embed with the same model you're using for documents to ensure embedding space alignment
In Qdrant, I configure separate collections per document type with different distance metrics. Operational logs use Cosine (handles varying text lengths). Equipment specs use Dot Product (faster, works well with normalized embeddings).
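Declaring the per-type settings in one place keeps collection creation consistent. A sketch using plain dicts rather than the Qdrant client (the collection names are illustrative; the vector sizes match the models above, 1024 for bge-large and 768 for mpnet, and "Cosine"/"Dot" are Qdrant's distance names):

```python
# Per-document-type collection settings; names are illustrative.
COLLECTIONS = {
    "operational_logs": {"vector_size": 1024, "distance": "Cosine"},
    "equipment_specs":  {"vector_size": 768,  "distance": "Dot"},
}


def collection_config(doc_type: str) -> dict:
    """Look up the collection settings for a document type,
    defaulting to cosine distance for anything unclassified."""
    return COLLECTIONS.get(doc_type, {"vector_size": 1024, "distance": "Cosine"})
```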
Graph Schema Design
Keep your Neo4j schema shallow—three to four levels of hierarchy maximum. Energy topology is already complex; don't make the graph mirror every nuance. Focus on relationships that answer real questions:
- Equipment hierarchy: `(:Substation)-[:CONTAINS]->(:Transformer)-[:FEEDS]->(:Feeder)`
- Temporal causality: `(:Event)-[:CAUSED]->(:Event)-[:PRECEDED]->(:Event)`
- Organizational ownership: `(:Team)-[:MAINTAINS]->(:Equipment)`
- Document lineage: `(:Standard)-[:REQUIRES]->(:Procedure)-[:IMPLEMENTS]->(:Configuration)`
Avoid the temptation to model everything. I've seen graphs with 47 relationship types. Nobody can write queries against that. Aim for 8-12 core relationships.
Orchestration Layer
Use AnythingLLM for the orchestration if you're building a conversational interface. It handles the LLM interaction and can call custom functions to query both databases. For workflow automation, n8n provides better conditional branching when you need complex query decision trees.
The orchestrator's job:
- Parse user intent to determine query strategy (graph-first, vector-first, hybrid)
- Execute queries in sequence or parallel depending on dependencies
- Merge results with awareness of cross-references
- Format context for LLM synthesis or return structured data to UI
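The first job in that list, strategy selection, can be sketched as a naive keyword router. A deployed orchestrator would use an LLM classifier for this, and the cue words below are invented for illustration, but the contract is the same:

```python
def pick_strategy(query: str) -> str:
    """Route a user query to graph-first, vector-first, or hybrid execution."""
    q = query.lower()
    graph_cues = any(w in q for w in ("downstream", "upstream", "connected", "topology"))
    vector_cues = any(w in q for w in ("similar", "like this", "describe"))
    if graph_cues and vector_cues:
        return "hybrid"
    if graph_cues:
        return "graph_first"
    if vector_cues:
        return "vector_first"
    return "vector_first"  # semantic search is the safe default
```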
Real-World Trade-offs
Latency
Pure vector search: 5-15ms p99 on our typical datasets. Pure graph traversal: 10-30ms for queries three hops deep. Hybrid queries: 40-120ms depending on orchestration strategy. Sequential queries (graph then vector) are slower but simpler to implement. Parallel queries with merge logic are faster but harder to debug.
In SCADA environments where operators expect sub-second response, this matters. I typically pre-compute and cache common hybrid queries ("equipment health by substation") and only run dynamic hybrid queries for ad-hoc investigations.
Data Consistency
Dual writes mean dual failure modes. If Qdrant succeeds but Neo4j fails, you have orphaned vectors. If Neo4j succeeds but Qdrant fails, you have entities with no semantic context.
I handle this with a reconciliation worker that runs every 30 minutes, scanning for mismatches and retrying failed writes. In OT environments, I also maintain a write-ahead log in PostgreSQL so we can rebuild either database from source if corruption occurs. This adds operational complexity but is non-negotiable for NERC CIP environments.
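The core of that reconciliation worker is a set comparison between the document IDs each store knows about. A minimal sketch, assuming both stores can enumerate their IDs cheaply:

```python
def reconcile(qdrant_ids: set[str], neo4j_ids: set[str]) -> dict[str, set[str]]:
    """Report what the reconciliation worker must re-write.

    retry_neo4j: orphaned vectors (in Qdrant, missing from Neo4j).
    retry_qdrant: entities with no semantic context (in Neo4j, missing from Qdrant).
    """
    return {
        "retry_neo4j": qdrant_ids - neo4j_ids,
        "retry_qdrant": neo4j_ids - qdrant_ids,
    }
```

Each retry then replays the original payload from the PostgreSQL write-ahead log rather than re-deriving it from the source document.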
Storage Overhead
You're storing data twice—once as vectors, once as graph structures. For a utility with 500GB of technical documents, expect:
- Qdrant: ~180GB (embeddings + metadata)
- Neo4j: ~50GB (entities and relationships only, not full text)
- Source documents: 500GB
Total: 730GB vs 500GB for documents alone. That's a 46% overhead. In practice, the query performance and analytical capability justify the cost, but budget accordingly. If you're calculating total infrastructure cost including hardware, the AI Implementation Cost Calculator accounts for storage multiplication.
Operational Complexity
You're now running two database systems instead of one. Each needs:
- Backup and restore procedures
- Monitoring and alerting
- Capacity planning
- Version upgrades
- Security hardening
In air-gapped OT networks, this means twice the deployment coordination. I mitigate by containerizing both with Docker Compose on a single compute node for smaller deployments (under 5M vectors, under 10M graph nodes). Larger deployments need dedicated infrastructure, which means dedicated ops knowledge.
When to Use This Pattern
Deploy hybrid vector-graph when:
- Your domain has rich relationship structures (equipment topology, process workflows, regulatory hierarchies)
- You need to answer questions that combine semantic search with graph traversal
- You're building conversational AI that must understand both concepts and connections
- You have engineers who can invest in entity resolution and schema design
Don't deploy hybrid vector-graph when:
- Your use case is pure document search with no entity relationships
- Your data model is flat (just attributes, no meaningful connections)
- You lack the operational maturity to run multiple database systems
- Latency requirements are under 20ms p99 (stick with single-database queries)
The Verdict
The hybrid vector-graph pattern is the right architecture for energy AI systems where operational context matters. Pure vector search can't answer topology questions. Pure graph databases can't do semantic matching. You need both, and the integration complexity is worth it.
At EthosPower, 70% of our utility AI deployments now use this pattern. The exceptions are pure document search use cases (vector only) or pure network analysis (graph only). For anything involving incident investigation, compliance mapping, or equipment intelligence, hybrid is the baseline.
Start with entity resolution. If you can't normalize your equipment identifiers across systems, the pattern won't deliver value. Get that right, then deploy Qdrant and Neo4j in parallel with a simple orchestration layer. Build query patterns incrementally—graph-first queries are easiest to implement and debug.
Try the EthosAI Chat to discuss how this pattern applies to your specific operational environment and data topology.