VPC Private Deployment
Reference architecture for deploying AI systems within a private cloud VPC with no public internet exposure
Last updated: 5 February 2026
A reference architecture for deploying AI infrastructure inside a cloud provider's Virtual Private Cloud (VPC), with no public internet exposure for any AI service.
Overview
This architecture is designed for organizations that:
- Operate in regulated industries with strict data handling requirements
- Require network isolation for sensitive AI workloads
- Need audit trail compliance for all data access
- Want to leverage cloud infrastructure while maintaining data sovereignty
The key principle: AI services exist entirely within a private network segment, accessible only through approved internal routes.
Architecture Diagram
┌─────────────────────────────────────────────────────────────────┐
│                       Cloud Provider VPC                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Public Subnet (Optional)                                       │
│  ┌─────────────────┐                                            │
│  │  Bastion Host   │ ← SSH only, no AI services                 │
│  └────────┬────────┘                                            │
│           │ VPN/Direct Connect                                  │
│  ─────────┴───────────────────────────────────────────────────  │
│                                                                 │
│  Private Subnet (AI Workloads)                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │    Ollama    │  │    Qdrant    │  │     Application      │   │
│  │ (Inference)  │  │  (Vectors)   │  │     (RAG + API)      │   │
│  └──────────────┘  └──────────────┘  └──────────────────────┘   │
│         ↑                 ↑                     ↑               │
│         └─────────────────┼─────────────────────┘               │
│                           │                                     │
│  ┌────────────────────────┴──────────────────────────────────┐  │
│  │            Internal Application Load Balancer             │  │
│  └────────────────────────┬──────────────────────────────────┘  │
│                           │                                     │
└───────────────────────────┼─────────────────────────────────────┘
                            │ VPN / Direct Connect
                     ┌──────┴──────┐
                     │  Corporate  │
                     │   Network   │
                     └─────────────┘
Security Layers
Network Security
- Private subnets with no internet gateway
- Security groups restricting traffic to known internal sources
- Network ACLs as secondary defense layer
- VPC Flow Logs for network traffic auditing
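One way to audit the "known internal sources" rule is to check every ingress CIDR against an allow-list. The sketch below is illustrative only: the rule dictionaries are a simplified, hypothetical shape (not an AWS API response), the corporate range 172.16.0.0/12 is an assumed placeholder, and ports 11434 and 6333 are the default Ollama and Qdrant ports.

```python
import ipaddress

# Assumption: the VPC CIDR used in this document plus a placeholder corporate range.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.0.0.0/16"),    # VPC CIDR
    ipaddress.ip_network("172.16.0.0/12"),  # hypothetical corporate network
]


def is_internal_source(cidr: str) -> bool:
    """Return True if the CIDR lies entirely inside an allowed internal range."""
    source = ipaddress.ip_network(cidr)
    return any(source.subnet_of(allowed) for allowed in ALLOWED_SOURCES)


def find_violations(ingress_rules: list[dict]) -> list[dict]:
    """Return the rules whose source is not a known internal range."""
    return [rule for rule in ingress_rules if not is_internal_source(rule["cidr"])]


# Simplified, hypothetical rule set for the private subnet.
rules = [
    {"port": 11434, "cidr": "10.0.1.0/24"},  # Ollama, from AI compute subnet
    {"port": 6333, "cidr": "10.0.0.0/16"},   # Qdrant, from anywhere in the VPC
    {"port": 22, "cidr": "0.0.0.0/0"},       # open to the world: should be flagged
]

violations = find_violations(rules)
print(violations)
```

A check like this could run in CI against exported security group definitions, so that a rule opened to 0.0.0.0/0 is caught before it is applied.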
Access Control
- IAM roles for service-to-service authentication
- VPN or Direct Connect for user access (no public endpoints)
- Bastion host for administrative access only
- MFA required for all human access
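The MFA requirement for human access can be expressed as an IAM policy that denies requests made without MFA. The following is a minimal sketch that builds such a policy document as a Python dictionary; it relies on the `aws:MultiFactorAuthPresent` condition key, and the `Sid` and function name are illustrative. It is assumed to be attached only to human users or groups, since service roles do not carry MFA context and would be denied by it.

```python
import json


def deny_without_mfa_policy() -> dict:
    """Build an IAM policy document that denies all actions when MFA is absent."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllWithoutMFA",  # illustrative statement ID
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                # BoolIfExists also matches requests where the key is missing entirely.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


policy = deny_without_mfa_policy()
print(json.dumps(policy, indent=2))
```

In practice a policy like this usually needs exceptions for the actions a user must call to enroll an MFA device in the first place; those are omitted here for brevity.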
Data Protection
- Encryption at rest (AES-256) for all storage
- Encryption in transit (TLS 1.3) for all internal traffic
- No external API calls from AI services
- Data never leaves VPC boundary
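On the client side, the TLS 1.3 requirement can be enforced in application code rather than left to defaults. A minimal sketch using Python's standard `ssl` module follows; the function name and the optional `ca_file` parameter (a path to an internal CA bundle) are assumptions for illustration.

```python
import ssl
from typing import Optional


def make_internal_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a client TLS context that refuses anything older than TLS 1.3.

    ca_file is an optional path to an internal CA bundle (hypothetical);
    when omitted, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # reject TLS 1.2 and below
    return ctx


ctx = make_internal_client_context()
print(ctx.minimum_version.name, ctx.verify_mode.name)
```

The context keeps certificate verification and hostname checking enabled, which `ssl.create_default_context` turns on by default; it can be passed to any client that accepts an `SSLContext` for calls between the application, Ollama, and Qdrant.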
Components
| Component | Purpose | Deployment |
|---|---|---|
| Ollama | LLM inference | GPU instances in private subnet |
| Qdrant | Vector database | Private instances with EBS storage |
| Application | RAG pipeline | Container service (ECS/EKS/GKE) |
| ALB (Internal) | Load balancing | Internal-only, no public listeners |
| Secrets Manager | Credential storage | VPC endpoint access |
| CloudWatch/Monitoring | Observability | VPC endpoint access |
Networking Requirements
VPC Configuration
VPC CIDR: 10.0.0.0/16
Private Subnet A: 10.0.1.0/24 (AI compute)
Private Subnet B: 10.0.2.0/24 (AI compute - AZ redundancy)
Private Subnet C: 10.0.3.0/24 (Database)
Route Table: No internet gateway
NAT Gateway: None (fully isolated) or restricted outbound if needed
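The address plan above can be sanity-checked before any infrastructure is created. The sketch below uses Python's `ipaddress` module to confirm that each subnet sits inside the VPC CIDR and that no two subnets overlap, and it flags any route whose target looks like an internet gateway. The subnet names and the simplified route dictionaries are hypothetical; the `igw-` prefix is the AWS identifier prefix for internet gateways.

```python
import ipaddress
from itertools import combinations

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "private-a": ipaddress.ip_network("10.0.1.0/24"),  # AI compute
    "private-b": ipaddress.ip_network("10.0.2.0/24"),  # AI compute, second AZ
    "private-c": ipaddress.ip_network("10.0.3.0/24"),  # Database
}


def validate_plan(vpc, subnets) -> list[str]:
    """Return a list of human-readable problems; empty means the plan is consistent."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


def has_internet_route(routes: list[dict]) -> bool:
    """True if any route targets an internet gateway (IDs start with 'igw-')."""
    return any(route["target"].startswith("igw-") for route in routes)


problems = validate_plan(VPC_CIDR, SUBNETS)
# Simplified route table for a fully isolated subnet: only the local route.
routes = [{"destination": "10.0.0.0/16", "target": "local"}]
print(problems, has_internet_route(routes))
```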
Required VPC Endpoints
If the workloads use cloud provider services without internet access, provision these endpoints:
- S3 Gateway Endpoint - Model storage
- Secrets Manager Interface Endpoint - Credentials
- CloudWatch Interface Endpoint - Logging
- ECR Interface Endpoints (ecr.api and ecr.dkr) - Container images (if using containers)
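AWS endpoint service names follow the pattern `com.amazonaws.<region>.<service>`, so the endpoint list can be generated per region. The sketch below only builds the plan as data and calls no AWS API; the region is an assumption, `logs` is the CloudWatch Logs service, and ECR appears twice because image pulls use both the `ecr.api` and `ecr.dkr` endpoints. The dictionary keys mirror the `ServiceName` and `VpcEndpointType` parameters of the EC2 CreateVpcEndpoint call.

```python
REGION = "us-east-1"  # assumption: same region as the cost example

# (service suffix, endpoint type) pairs for the services listed above.
ENDPOINTS = [
    ("s3", "Gateway"),                # model storage
    ("secretsmanager", "Interface"),  # credentials
    ("logs", "Interface"),            # CloudWatch Logs
    ("ecr.api", "Interface"),         # ECR API calls
    ("ecr.dkr", "Interface"),         # ECR image pulls
]


def service_name(region: str, service: str) -> str:
    """Build the AWS endpoint service name for a region and service suffix."""
    return f"com.amazonaws.{region}.{service}"


plan = [
    {"ServiceName": service_name(REGION, service), "VpcEndpointType": kind}
    for service, kind in ENDPOINTS
]

for entry in plan:
    print(entry["VpcEndpointType"], entry["ServiceName"])
```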
Compliance Mapping
| Requirement | How This Architecture Addresses It |
|---|---|
| GDPR Article 32 | Encryption at rest and in transit |
| SOC 2 CC6.1 | Network segmentation, access controls |
| HIPAA § 164.312 | Audit controls, encryption, access management |
| ISO 27001:2013 A.13 | Network security controls, segregation |
| PCI DSS 1.3 | Network isolation, no direct internet access |
Cost Considerations
AWS Example (us-east-1)
| Resource | Specification | Monthly Cost |
|---|---|---|
| GPU Instance | g5.2xlarge (1x A10G) | ~$1,200 |
| Database Instance | r6g.large | ~$150 |
| Application | ECS Fargate (2 vCPU) | ~$100 |
| Storage | 500GB EBS | ~$50 |
| Data Transfer | Internal only | $0 |
| VPC Endpoints | 4 endpoints | ~$30 |
| Total | | ~$1,530/month |
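The total is a straight sum of the line items, which makes it easy to re-run the estimate with different sizing. The sketch below reproduces the arithmetic using the approximate figures from the table; the `estimate` helper and its `gpu_count` parameter are hypothetical additions for exploring a multi-GPU setup.

```python
# Approximate monthly costs in USD, copied from the table above.
MONTHLY_COSTS_USD = {
    "GPU Instance (g5.2xlarge)": 1200,
    "Database Instance (r6g.large)": 150,
    "Application (ECS Fargate)": 100,
    "Storage (500GB EBS)": 50,
    "Data Transfer (internal only)": 0,
    "VPC Endpoints": 30,
}


def estimate(gpu_count: int = 1) -> int:
    """Sum the line items, scaling only the GPU instance by gpu_count."""
    gpu = MONTHLY_COSTS_USD["GPU Instance (g5.2xlarge)"]
    other = sum(MONTHLY_COSTS_USD.values()) - gpu
    return gpu * gpu_count + other


total = estimate()
print(f"~${total:,}/month")  # prints "~$1,530/month"
```

Because the GPU instance is roughly 78% of the total, instance count and type dominate any sizing change.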
Azure / GCP
Pricing is comparable for equivalent instance types. Contact us for cloud-specific sizing.
Implementation Checklist
- Design VPC with private subnets across multiple AZs
- Configure security groups and NACLs
- Set up VPN or Direct Connect to corporate network
- Deploy required VPC endpoints
- Provision GPU instances in private subnet
- Deploy Qdrant with encrypted EBS volumes
- Deploy application containers
- Configure internal load balancer
- Set up CloudWatch logging and alerts
- Enable VPC Flow Logs
- Configure backup and disaster recovery
- Conduct security review and penetration testing
- Document runbook and incident response procedures
Operational Considerations
Patching and Updates
Without internet access, updates take one of two routes:
- Stage images and packages in S3 through a separate, internet-connected process, then pull them from the private subnet via the S3 VPC endpoint
- Use scheduled maintenance windows with a temporary NAT gateway
Monitoring
All observability flows through VPC endpoints:
- CloudWatch for metrics and logs
- X-Ray for distributed tracing
- Custom dashboards for AI-specific metrics (latency, token usage)
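As an illustration of the AI-specific metrics, the sketch below times a call and shapes latency and token counts as a list of metric entries. The entry keys (`MetricName`, `Dimensions`, `Value`, `Unit`) follow the structure CloudWatch's PutMetricData accepts; the namespace, metric names, model name, and the `fake_generate` stand-in are all hypothetical, and no AWS call is made.

```python
import time

NAMESPACE = "AI/Inference"  # hypothetical custom namespace


def timed_call(fn, *args):
    """Run fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def build_metric_data(model: str, latency_ms: float, tokens: int) -> list[dict]:
    """Shape latency and token usage as CloudWatch-style metric entries."""
    dimensions = [{"Name": "Model", "Value": model}]
    return [
        {"MetricName": "InferenceLatency", "Dimensions": dimensions,
         "Value": latency_ms, "Unit": "Milliseconds"},
        {"MetricName": "TokensGenerated", "Dimensions": dimensions,
         "Value": tokens, "Unit": "Count"},
    ]


def fake_generate(prompt: str) -> str:
    """Stand-in for a real inference call to the model server."""
    return "private networks keep data inside the VPC"


text, latency_ms = timed_call(fake_generate, "Why use a VPC?")
metric_data = build_metric_data("example-model", latency_ms, len(text.split()))
# With boto3, this list could be sent through the monitoring VPC endpoint, e.g.:
# boto3.client("cloudwatch").put_metric_data(Namespace=NAMESPACE, MetricData=metric_data)
print(metric_data)
```

Whitespace-separated word count is used here only as a rough proxy for tokens; a real deployment would report the token counts returned by the inference server.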
Backup and Recovery
- Automated EBS snapshots for vector database
- Cross-region replication for disaster recovery (within compliance boundaries)
- Tested restore procedures
Next Steps
Ready to deploy AI infrastructure in your VPC?
Book a Technical Scoping Call to discuss your cloud provider, security requirements, and compliance constraints.