AWS VPC Networking — Deep Dive for CloudOps
VPC networking deep dive for operations engineers. Covers subnets, NAT, NACLs, VPC endpoints, VPN, Direct Connect, Transit Gateway, and DNS — all heavy SOA-C03 exam topics.
What you'll learn
- Design VPC CIDR blocks and subnet layouts
- Configure internet and NAT gateways for public/private access
- Differentiate NACLs from Security Groups
- Use VPC endpoints to access AWS services without internet traffic
- Connect on-premises networks via VPN and Direct Connect
- Configure Route 53 routing policies and health checks
VPC Fundamentals
A Virtual Private Cloud (VPC) is your logically isolated network within AWS. You define the IP address range, subnets, routing, and security controls.
Region
└── VPC (10.0.0.0/16)
    ├── Public Subnet (10.0.1.0/24): has route to Internet Gateway
    │   ├── EC2 with public IP
    │   └── NAT Gateway
    └── Private Subnet (10.0.2.0/24): no direct internet route
        ├── EC2 instances (databases, apps)
        └── Route to NAT Gateway for outbound internet
CIDR notation
10.0.0.0/16 → 65,536 IPs (10.0.0.0 – 10.0.255.255)
10.0.0.0/24 → 256 IPs (10.0.0.0 – 10.0.0.255)
10.0.0.0/28 → 16 IPs (10.0.0.0 – 10.0.0.15)
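AWS reserves 5 IPs per subnet: the first four addresses and the last one. In a /24 these are .0 (network), .1 (VPC router), .2 (DNS), .3 (reserved for future use), and .255 (broadcast, which a VPC does not support).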
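The arithmetic above can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the helper names `usable_ips` and `reserved_ips` are illustrative and not part of any AWS SDK.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def usable_ips(cidr: str) -> int:
    """Return how many addresses remain assignable in an AWS subnet."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - AWS_RESERVED_PER_SUBNET


def reserved_ips(cidr: str) -> list[str]:
    """List the five addresses AWS reserves in the given subnet."""
    network = ipaddress.ip_network(cidr)
    first = network.network_address
    return [str(first + i) for i in range(4)] + [str(network.broadcast_address)]


for cidr in ("10.0.0.0/24", "10.0.0.0/28"):
    print(cidr, usable_ips(cidr), reserved_ips(cidr))
```

A /28 leaves only 11 assignable addresses, which is why very small subnets fill up quickly once load balancers and endpoints start consuming ENIs.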
Default VPC
Every AWS account has a default VPC in each Region (172.31.0.0/16) with:
- An Internet Gateway attached
- A public subnet in each AZ
- Default route table pointing to the IGW
Warning
Never use the default VPC for production workloads. Create a dedicated VPC with proper private/public subnet layout.
Subnets and Routing
Public vs Private subnets
| | Public Subnet | Private Subnet |
|---|---|---|
| Route to IGW | Yes (0.0.0.0/0 → IGW) | No |
| Instance gets public IP | Yes (if auto-assign enabled) | No |
| Internet accessible | Yes | No (outbound via NAT only) |
Route tables
Each subnet has an associated route table:
Public subnet route table:
10.0.0.0/16 → local (VPC traffic stays inside)
0.0.0.0/0 → igw-xxx (internet traffic → Internet Gateway)
Private subnet route table:
10.0.0.0/16 → local
0.0.0.0/0 → nat-xxx (outbound internet → NAT Gateway)
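A route table picks the most specific route that covers the destination (longest prefix match), which is why the `local` route wins over `0.0.0.0/0` for in-VPC traffic. The sketch below models that lookup in plain Python; `pick_route` and the dictionary layout are illustrative assumptions, not an AWS API.

```python
import ipaddress


def pick_route(route_table: dict[str, str], destination_ip: str) -> str:
    """Return the target of the most specific route covering destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination_ip}")
    # The longest prefix (largest prefix length) wins.
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


# Mirrors the private subnet route table shown above.
private_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-xxx"}
print(pick_route(private_routes, "10.0.1.25"))     # local
print(pick_route(private_routes, "198.51.100.7"))  # nat-xxx
```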
Internet Gateway & NAT Gateway
Internet Gateway (IGW)
- Horizontally scaled, redundant — no bandwidth limit
- Enables two-way internet communication for instances with public IPs
- One IGW per VPC
NAT Gateway
NAT Gateway allows private subnet instances to initiate outbound connections to the internet (software updates, API calls) while blocking inbound connections.
Private instance → NAT Gateway (in public subnet) → IGW → Internet
| | NAT Gateway | NAT Instance |
|---|---|---|
| Managed by | AWS | You |
| Availability | Highly available within AZ | Single instance |
| Bandwidth | Up to 100 Gbps | Instance type limited |
| Security groups | Not applicable | Yes |
| Exam recommendation | Always prefer | Legacy option |
# Create NAT Gateway in public subnet (the allocation ID is an Elastic IP)
aws ec2 create-nat-gateway \
  --subnet-id subnet-public-12345 \
  --allocation-id eipalloc-12345
# Add NAT Gateway to private subnet route table
aws ec2 create-route \
  --route-table-id rtb-private-12345 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-12345
AZ resilience for NAT
NAT Gateways are AZ-specific. For HA, create one NAT Gateway per AZ and update each AZ's private subnet route table to use its local NAT Gateway.
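One way to audit this layout is to compare the AZ of each private subnet with the AZ of the NAT Gateway its default route uses. The sketch below works on plain dictionaries; every ID and the function name `cross_az_default_routes` are hypothetical, and in practice the mappings would come from your own inventory of subnets, NAT Gateways, and route tables.

```python
def cross_az_default_routes(
    subnet_az: dict[str, str],
    nat_az: dict[str, str],
    subnet_default_nat: dict[str, str],
) -> list[str]:
    """Return subnets whose default route targets a NAT Gateway in another AZ."""
    return sorted(
        subnet
        for subnet, nat in subnet_default_nat.items()
        if subnet_az[subnet] != nat_az[nat]
    )


# Hypothetical inventory: two private subnets, one NAT Gateway per AZ.
subnet_az = {"subnet-priv-a": "us-east-1a", "subnet-priv-b": "us-east-1b"}
nat_az = {"nat-a": "us-east-1a", "nat-b": "us-east-1b"}

# subnet-priv-b wrongly shares the NAT Gateway in us-east-1a.
subnet_default_nat = {"subnet-priv-a": "nat-a", "subnet-priv-b": "nat-a"}
print(cross_az_default_routes(subnet_az, nat_az, subnet_default_nat))
```

A subnet flagged here loses outbound internet access if the other AZ fails, and its traffic crosses AZ boundaries in the meantime.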
Security Groups vs NACLs
| | Security Group | NACL |
|---|---|---|
| Level | Instance / ENI | Subnet |
| Type | Stateful | Stateless |
| Rules | Allow only | Allow + Deny |
| Evaluation | All rules evaluated | Rules evaluated in number order (lowest first); first match wins |
| Default | Deny all inbound, allow all outbound | Default NACL allows all; a custom NACL denies all until you add rules |
NACL rule evaluation (stateless = key exam point)
Because NACLs are stateless, you must explicitly allow both inbound AND outbound for each connection:
Allow HTTPS inbound (443):
Inbound rule: ALLOW TCP 443 from 0.0.0.0/0
Outbound rule: ALLOW TCP 1024-65535 to 0.0.0.0/0
(ephemeral ports for response traffic)
NACL best practices
Rule 90: Deny specific IPs (e.g., known bad actors); number it below the allow rules, because the first matching rule wins
Rule 100: Allow HTTPS (443) inbound from anywhere
Rule 110: Allow HTTP (80) inbound from anywhere
Rule 120: Allow ephemeral ports (1024-65535) inbound (return traffic)
Rule *: Deny all (implicit)
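Because NACL rules are evaluated in ascending number order and evaluation stops at the first match, the position of a deny rule relative to the allow rules decides whether it has any effect. The sketch below is a simplified model (TCP only, inbound only); `NaclRule` and `evaluate` are illustrative names, not AWS constructs.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    action: str      # "allow" or "deny"
    cidr: str
    port_from: int
    port_to: int


def evaluate(rules: list[NaclRule], source_ip: str, port: int) -> str:
    """Return the action of the lowest-numbered matching rule."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= port <= rule.port_to
        if in_cidr and in_ports:
            return rule.action
    return "deny"  # the implicit "*" rule


allows = [
    NaclRule(100, "allow", "0.0.0.0/0", 443, 443),
    NaclRule(110, "allow", "0.0.0.0/0", 80, 80),
    NaclRule(120, "allow", "0.0.0.0/0", 1024, 65535),
]
early_deny = [NaclRule(90, "deny", "198.51.100.0/24", 0, 65535)] + allows
late_deny = allows + [NaclRule(130, "deny", "198.51.100.0/24", 0, 65535)]

print(evaluate(early_deny, "198.51.100.9", 443))  # deny
print(evaluate(late_deny, "198.51.100.9", 443))   # allow: rule 100 matched first
```

In the second list the deny rule never fires for port 443, because rule 100 already matched the packet.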
VPC Peering
VPC Peering creates a private network route between two VPCs (same or different accounts/regions).
VPC A (10.0.0.0/16) ↔ VPC B (10.1.0.0/16)
peering connection
Route table in VPC A: 10.1.0.0/16 → pcx-xxx
Route table in VPC B: 10.0.0.0/16 → pcx-xxx
Constraints
- No transitive peering — if A↔B and B↔C, A cannot reach C through B
- No overlapping CIDR blocks allowed
- Route tables must be manually updated in both VPCs
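The first two constraints are easy to check before requesting a peering connection. The following sketch uses the standard `ipaddress` module for the overlap test and models peering as direct pairs only; `can_peer` and `reachable_via_peering` are illustrative helpers, not AWS calls.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can be peered only if their CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def reachable_via_peering(peerings: set[frozenset[str]], src: str, dst: str) -> bool:
    """Peering is not transitive: only a direct connection counts."""
    return frozenset((src, dst)) in peerings


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False

peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}
print(reachable_via_peering(peerings, "A", "B"))  # True
print(reachable_via_peering(peerings, "A", "C"))  # False
```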
VPC Endpoints
VPC Endpoints allow you to connect to AWS services (S3, DynamoDB, SSM, etc.) without routing traffic through the internet.
Types
| Type | Services | Cost |
|---|---|---|
| Gateway endpoint | S3, DynamoDB | Free |
| Interface endpoint (PrivateLink) | Most AWS services (SSM, ECR, CloudWatch, etc.) | About $0.01/hour per AZ + data processing (varies by Region) |
Gateway endpoint (S3)
# Create S3 gateway endpoint
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-12345 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-private-12345
# S3 access from the private subnet now stays on the AWS network
# Route table automatically updated: pl-xxx (S3 prefix list) → vpce-xxx
Interface endpoint (SSM)
For SSM Session Manager in private subnets (no NAT Gateway), you need 3 interface endpoints:
# Create SSM interface endpoints
for service in ssm ssmmessages ec2messages; do
  aws ec2 create-vpc-endpoint \
    --vpc-id vpc-12345 \
    --vpc-endpoint-type Interface \
    --service-name "com.amazonaws.us-east-1.${service}" \
    --subnet-ids subnet-private-12345 \
    --security-group-ids sg-ssm-endpoints
done
VPC Flow Logs
VPC Flow Logs capture IP traffic metadata (not packet contents) for a VPC, a subnet, or an individual ENI.
# Flow log format
version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
# Example log entry:
2 123456789012 eni-abc123 10.0.0.5 10.0.1.10 54321 443 6 10 2000 1682000000 1682000060 ACCEPT OK
2 123456789012 eni-abc123 203.0.113.0 10.0.0.5 80 80 6 1 40 1682000010 1682000070 REJECT OK
Deliver flow logs to
- CloudWatch Logs — near real-time query with Logs Insights
- S3 — long-term retention, query with Athena
Common troubleshooting queries
-- Find rejected traffic to an instance (Athena)
SELECT srcaddr, dstport, COUNT(*) as count
FROM vpc_flow_logs
WHERE action = 'REJECT'
AND dstaddr = '10.0.1.10'
GROUP BY srcaddr, dstport
ORDER BY count DESC;
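For a quick look at a handful of records without setting up Athena, the same aggregation can be sketched locally in Python. This assumes the default space-separated format with the 14 fields listed earlier; the sample records, `parse_flow_log`, and `top_rejects` are illustrative.

```python
from collections import Counter

# Field names of the default flow log format, in order.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_flow_log(line: str) -> dict[str, str]:
    """Split one default-format record into a field-name to value mapping."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def top_rejects(lines: list[str], dstaddr: str) -> list[tuple[tuple[str, str], int]]:
    """Count REJECT records per (srcaddr, dstport) for one destination address."""
    counts = Counter(
        (record["srcaddr"], record["dstport"])
        for record in map(parse_flow_log, lines)
        if record["action"] == "REJECT" and record["dstaddr"] == dstaddr
    )
    return counts.most_common()


# Made-up sample records for illustration only.
SAMPLE = [
    "2 123456789012 eni-abc123 10.0.0.5 10.0.1.10 54321 443 6 10 2000 1682000000 1682000060 ACCEPT OK",
    "2 123456789012 eni-abc123 203.0.113.7 10.0.1.10 40000 22 6 1 40 1682000010 1682000070 REJECT OK",
    "2 123456789012 eni-abc123 203.0.113.7 10.0.1.10 40001 22 6 1 40 1682000020 1682000080 REJECT OK",
    "2 123456789012 eni-abc123 198.51.100.9 10.0.1.10 40002 3389 6 1 40 1682000030 1682000090 REJECT OK",
]

for (src, port), count in top_rejects(SAMPLE, "10.0.1.10"):
    print(src, port, count)
```

Custom flow log formats change the field order, so `FIELDS` would need to match whatever format the flow log was created with.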
VPC Reachability Analyzer
A network diagnostics tool that analyses reachability between two endpoints — without sending actual traffic:
aws ec2 create-network-insights-path \
  --source eni-abc123 \
  --destination eni-xyz789 \
  --protocol tcp \
  --destination-port 443
aws ec2 start-network-insights-analysis \
  --network-insights-path-id nip-12345
Returns: whether traffic can flow and, if not, which component is blocking it (security group, route table, NACL). Fetch the result with aws ec2 describe-network-insights-analyses.
Site-to-Site VPN
Connect your on-premises data centre to AWS over the internet (encrypted IPsec tunnel).
On-premises                          AWS
Customer Gateway (CGW) <----VPN----> Virtual Private Gateway (VGW)
(your VPN device)                    (AWS-managed)
# Create Virtual Private Gateway
aws ec2 create-vpn-gateway \
  --type ipsec.1 \
  --amazon-side-asn 64512
# Attach the Virtual Private Gateway to the VPC
aws ec2 attach-vpn-gateway \
  --vpn-gateway-id vgw-12345 \
  --vpc-id vpc-12345
# Create Customer Gateway (--public-ip is your on-prem router's public IP)
aws ec2 create-customer-gateway \
  --type ipsec.1 \
  --public-ip 203.0.113.0 \
  --bgp-asn 65000
# Create VPN Connection
aws ec2 create-vpn-connection \
  --type ipsec.1 \
  --customer-gateway-id cgw-12345 \
  --vpn-gateway-id vgw-12345
VPN as backup to Direct Connect
Normal: On-prem → Direct Connect → AWS
Failover: On-prem → Site-to-Site VPN → AWS (lower bandwidth, encrypted)
AWS Direct Connect
Direct Connect provides a dedicated, private network connection from your data centre to AWS — bypassing the internet entirely.
| | Site-to-Site VPN | Direct Connect |
|---|---|---|
| Connection | Over internet | Dedicated private line |
| Speed | Up to 1.25 Gbps per tunnel | 1 Gbps, 10 Gbps, 100 Gbps (dedicated connections) |
| Latency | Variable | Consistent, low |
| Encryption | Yes (IPsec) | No (add VPN on top for encryption) |
| Setup time | Minutes | Weeks to months |
| Cost | Low | Higher |
Direct Connect Gateway
Use a Direct Connect Gateway to connect one DX connection to multiple VPCs across regions.
AWS Transit Gateway
Transit Gateway is a hub-and-spoke network transit centre — connects VPCs, VPNs, and Direct Connect without a mesh of peering connections.
Without TGW (VPC peering mesh for 5 VPCs): 10 peering connections
With TGW: 5 attachments to one hub — all VPCs can communicate
VPC A ─┐
VPC B ─┤
VPC C ─┼─ Transit Gateway ─── VPN (on-premises)
VPC D ─┤         └── Direct Connect
VPC E ─┘
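The figure of 10 peering connections for 5 VPCs is the full-mesh formula n(n-1)/2, while a Transit Gateway needs one attachment per VPC. A short sketch shows how quickly the gap widens; the function names are illustrative.

```python
def peering_connections(vpcs: int) -> int:
    """Connections needed for a full mesh of VPC peerings: n(n-1)/2."""
    return vpcs * (vpcs - 1) // 2


def tgw_attachments(vpcs: int) -> int:
    """A Transit Gateway needs one attachment per VPC."""
    return vpcs


for n in (5, 10, 50):
    print(n, peering_connections(n), tgw_attachments(n))
# 5 10 5
# 10 45 10
# 50 1225 50
```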
Key TGW features
- Route tables — control which VPCs can communicate (segmentation)
- Multicast — supported natively
- Inter-region peering — connect TGWs across regions
- RAM sharing — share TGW across accounts in an Organisation
VPC Block Public Access (BPA)
A newer account-level guardrail that blocks all public internet traffic to or from a VPC — even if an IGW exists or a resource has a public IP:
aws ec2 modify-vpc-block-public-access-options \
  --internet-gateway-block-mode block-bidirectional \
  --region us-east-1
Useful for highly secure environments where no public internet access should be possible.
Route 53 for CloudOps
Routing Policies
| Policy | Use Case |
|---|---|
| Simple | Single resource, no health checks |
| Weighted | A/B testing, traffic splitting (70/30) |
| Latency | Route to lowest-latency region |
| Failover | Primary/secondary with health checks |
| Geolocation | Route based on user's country/continent |
| Geoproximity | Route based on distance with bias adjustment |
| Multi-value | Return up to 8 healthy records (not a load balancer) |
| IP-based | Route based on client IP CIDR |
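For the weighted policy, each record receives a share of responses equal to its weight divided by the sum of all weights in the group, and if every weight is 0 the records are served with equal probability. A minimal sketch of that arithmetic follows; `weighted_shares` and the record names are illustrative.

```python
def weighted_shares(weights: dict[str, int]) -> dict[str, float]:
    """Return the expected fraction of DNS responses per weighted record."""
    if not weights:
        raise ValueError("at least one record is required")
    if any(not 0 <= weight <= 255 for weight in weights.values()):
        raise ValueError("Route 53 weights must be between 0 and 255")
    total = sum(weights.values())
    if total == 0:
        # All weights zero: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(weighted_shares({"blue": 70, "green": 30}))
print(weighted_shares({"blue": 255, "canary": 5}))
```

These are expected long-run shares only: DNS caching by resolvers means the split observed over a short window can differ noticeably.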
Route 53 Health Checks
Health checks monitor endpoints and can trigger DNS failover:
aws route53 create-health-check \
  --caller-reference "hc-$(date +%s)" \
  --health-check-config '{
    "IPAddress": "203.0.113.0",
    "Port": 443,
    "Type": "HTTPS",
    "ResourcePath": "/health",
    "FullyQualifiedDomainName": "api.myapp.com",
    "RequestInterval": 30,
    "FailureThreshold": 3
  }'
Route 53 Resolver (Hybrid DNS)
Route 53 Resolver enables DNS resolution between AWS and on-premises:
On-premises DNS → Inbound Resolver Endpoint → resolve AWS private hosted zones
AWS instances → Outbound Resolver Endpoint → query on-premises DNS servers
Route 53 Application Recovery Controller (ARC)
Advanced multi-region failover with routing controls — allows you to shift traffic between regions with a single API call (or even manually during DR drills).
Hands-on: Private VPC with SSM Access (No Bastion)
Goal: Fully private EC2 instances accessible only via SSM Session Manager
1. Create VPC: 10.0.0.0/16 (enable DNS support and DNS hostnames so the endpoints' private DNS names resolve)
2. Create private subnet: 10.0.1.0/24 (no route to IGW)
3. Create 3 VPC Interface Endpoints:
   - com.amazonaws.<region>.ssm
   - com.amazonaws.<region>.ssmmessages
   - com.amazonaws.<region>.ec2messages
   (all in private subnet, sg-ssm-endpoints: allow HTTPS 443 from VPC CIDR)
4. Launch EC2 in private subnet:
   - No key pair (not needed)
   - IAM role: AmazonSSMManagedInstanceCore
   - Security group: allow outbound 443 to sg-ssm-endpoints
5. Verify instance appears in SSM Fleet Manager as "Online"
6. Connect: SSM → Session Manager → Start Session
   → Full bash/PowerShell access, no port 22, no public IP
Common SOA-C03 Exam Questions
Q: Why are there only 251 usable IPs in a /24 subnet?
AWS reserves 5 IPs per subnet: .0 (network address), .1 (VPC router), .2 (DNS), .3 (reserved for future use), .255 (broadcast). 256 - 5 = 251.
Q: Traffic from a private subnet isn't reaching the internet. What do you check?
- Does the private subnet's route table have a route 0.0.0.0/0 → NAT Gateway?
- Is the NAT Gateway in a public subnet?
- Does the public subnet have 0.0.0.0/0 → Internet Gateway?
- Does the NACL allow outbound traffic and the return ephemeral ports inbound?
Q: Two VPCs are peered but instances can't communicate. What's missing?
Route tables. VPC peering creates the connection but you must manually add routes in both VPCs: VPC A's route table → destination VPC B CIDR → pcx-xxx, and vice versa.
Q: How do you access S3 from a private subnet without internet traffic?
Create an S3 gateway endpoint in the VPC. The route table is automatically updated to route S3 traffic through the endpoint. No NAT Gateway required, and traffic never leaves the AWS network.
What to Learn Next
- AWS Security & Compliance — WAF, Network Firewall, Shield for protecting your VPC
- AWS CloudWatch Monitoring — VPC Flow Logs analysis with CloudWatch Logs Insights and Athena
- AWS Systems Manager — configure VPC endpoints to enable SSM in private subnets
