AWS Systems Manager (SSM) — Managing EC2 at Scale
Manage fleets of EC2 instances at scale. Covers Run Command, Session Manager, Patch Manager, Parameter Store, and automation — all key SOA-C03 topics.
What you'll learn
- Understand what SSM is and how the SSM Agent enables management without SSH
- Use Run Command to execute scripts across a fleet of instances
- Manage secrets and config using Parameter Store
- Patch instances at scale using Patch Manager and Maintenance Windows
- Establish secure shell-less access via Session Manager
- Automate operational tasks using SSM Automation
What is AWS Systems Manager?
AWS Systems Manager (SSM) is a unified operations platform for managing AWS resources at scale. It provides a secure, agent-based channel to your EC2 instances — without opening SSH ports or maintaining bastion hosts.
Why SSM matters for CloudOps
SSM is the backbone of AWS operational automation. The SOA-C03 exam tests it heavily. Know every sub-service.
The SSM Agent is pre-installed on modern Amazon Linux, Ubuntu, and Windows AMIs. For the agent to communicate with the SSM service, your instance needs:
- The SSM Agent installed and running
- An IAM instance profile with the AmazonSSMManagedInstanceCore policy
- Outbound internet access (or a VPC endpoint for SSM)
# Check SSM agent status on Amazon Linux / Ubuntu
sudo systemctl status amazon-ssm-agent
# Start if stopped
sudo systemctl start amazon-ssm-agent
SSM Resource Groups & Tags
SSM integrates with AWS Resource Groups to let you target operations by tag.
Instance tags:
Environment = Production
Team = Platform
→ Create resource group "prod-platform-servers"
→ Target all Run Command / Patch operations at this group
Best practice: tag everything consistently. SSM is the payoff.
SSM Run Command
Run Command executes scripts or commands on one or many instances — no SSH required.
How it works
1. You select target instances (by ID, tag, or resource group)
2. You choose an SSM Document (a script definition)
3. SSM pushes the command to the SSM Agent on each instance
4. Results are returned to the console / S3 / CloudWatch Logs
Key SSM Documents
| Document | Purpose |
|---|---|
| AWS-RunShellScript | Run bash commands on Linux |
| AWS-RunPowerShellScript | Run PowerShell on Windows |
| AWS-UpdateSSMAgent | Update the agent itself |
| AWS-RunPatchBaseline | Trigger patching on an instance |
# Example: Run a shell command via AWS CLI
aws ssm send-command \
--instance-ids i-0abc123def456 \
--document-name "AWS-RunShellScript" \
--parameters '{"commands":["df -h","free -m"]}' \
--output-s3-bucket-name my-ssm-output-bucket
Rate Control
Run Command supports concurrency and error threshold settings — e.g., run on 10% of instances at a time, stop if 5% fail.
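As a rough illustration of tag targeting combined with rate control, the sketch below builds the arguments for the SendCommand API in Python. It assumes boto3 is installed and credentials are configured if you actually call `send`; the tag values come from the earlier example, and the helper names are made up for this sketch.

```python
import json


def build_send_command_request(tag_key, tag_value, commands,
                               max_concurrency="10%", max_errors="5%"):
    """Build keyword arguments for the SSM SendCommand API."""
    return {
        # Target by tag instead of listing instance IDs
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": list(commands)},
        # Rate control: share of targets running at once, and failure threshold
        "MaxConcurrency": max_concurrency,
        "MaxErrors": max_errors,
    }


def send(request):
    """Send the request with boto3 (assumed installed and configured)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("ssm").send_command(**request)


request = build_send_command_request(
    "Environment", "Production", ["df -h", "free -m"]
)
print(json.dumps(request, indent=2))
```

Keeping the request builder separate from the API call makes it easy to review or log exactly what will be sent before touching a production fleet.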
SSM Automation
SSM Automation runs multi-step operational playbooks (runbooks). Unlike Run Command (which runs on the instance), Automation runs workflows that can call AWS APIs, Lambda functions, and nested automations.
Common use cases
- Patching AMIs: stop instance → create AMI → patch → bake new AMI → update ASG launch template
- Remediating findings: triggered by Security Hub / Config Rules
- Scheduled maintenance: triggered by EventBridge
Key automation documents
| Document | Purpose |
|---|---|
| AWS-StopEC2Instance | Stop an instance |
| AWS-CreateImage | Create an AMI snapshot |
| AWS-PatchInstanceWithRollback | Patch with automatic rollback |
| AWSSupport-RunEC2RescueForLinux | Fix common Linux boot issues |
# Simplified automation runbook structure
schemaVersion: "0.3"
parameters:
  InstanceId:
    type: String
    description: ID of the instance to stop and image
mainSteps:
  - name: StopInstance
    action: aws:changeInstanceState
    inputs:
      InstanceIds: ["{{ InstanceId }}"]
      DesiredState: stopped
  - name: CreateAMI
    action: aws:createImage
    inputs:
      InstanceId: "{{ InstanceId }}"
      ImageName: "Patched-{{ InstanceId }}-{{ global:DATE }}"
EventBridge → SSM Automation
A common CloudOps pattern: trigger automation from an event.
Config rule detects non-compliant resource
→ EventBridge rule fires
→ SSM Automation remediates it automatically
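The first hop of that flow is an EventBridge rule whose event pattern selects non-compliant Config evaluations. The sketch below shows one plausible pattern, together with a deliberately simplified matcher (exact values and nested objects only, not the full EventBridge matching language) to show how the pattern selects an event. The sample event and rule name are hypothetical.

```python
import json

# Event pattern for an EventBridge rule: match AWS Config compliance changes
# where the new evaluation result is NON_COMPLIANT.
EVENT_PATTERN = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {"newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]}},
}


def matches(pattern, event):
    """Simplified matcher: lists of allowed values and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching part of the event
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


sample_event = {
    "source": "aws.config",
    "detail-type": "Config Rules Compliance Change",
    "detail": {
        "configRuleName": "example-rule",  # hypothetical rule name
        "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
    },
}

print(json.dumps(EVENT_PATTERN))
print(matches(EVENT_PATTERN, sample_event))
```

The rule's target would then be the Automation runbook, invoked with an IAM role that EventBridge is allowed to assume.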
SSM Parameter Store
Parameter Store provides secure, hierarchical storage for configuration data and secrets.
Parameter types
| Type | Description | Example |
|---|---|---|
| String | Plain text | db_host = rds.amazonaws.com |
| StringList | CSV list | allowed_ips = 10.0.0.1,10.0.0.2 |
| SecureString | KMS-encrypted | db_password = (encrypted) |
Tiers
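| | Standard | Advanced |
|---|---|---|
| Max value size | 4 KB | 8 KB |
| Parameter policies | No | Yes (TTL / expiry) |
| Cost | Free | $0.05/parameter/month |
Throughput is an account-level setting rather than a tier feature: the default limit is 40 TPS, and enabling the higher-throughput setting (billed per API call) raises it to as much as 10,000 TPS.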
# Store a parameter
aws ssm put-parameter \
--name "/myapp/prod/db_password" \
--value "SuperSecret123" \
--type SecureString
# Read it
aws ssm get-parameter \
--name "/myapp/prod/db_password" \
--with-decryption
# Read all parameters under a path
aws ssm get-parameters-by-path \
--path "/myapp/prod/" \
--with-decryption
Naming convention
Use a path hierarchy like /app/env/key — this makes IAM policies and bulk-reads clean.
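To show why the hierarchy keeps IAM policies clean, here is a minimal sketch that builds a read-only policy for everything under one path. The Region, account ID, and function name are placeholders, and the policy lists both the path itself and its children as resources to be on the safe side; check the result against your own access requirements.

```python
import json


def parameter_path_policy(region, account_id, path):
    """Build an IAM policy allowing reads of every parameter under `path`."""
    prefix = path.strip("/")  # the ARN omits the parameter name's leading slash
    base = f"arn:aws:ssm:{region}:{account_id}:parameter/{prefix}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameter",
                "ssm:GetParameters",
                "ssm:GetParametersByPath",
            ],
            # The path itself plus everything beneath it
            "Resource": [base, f"{base}/*"],
        }],
    }


# Placeholder Region and account ID
policy = parameter_path_policy("us-east-1", "123456789012", "/myapp/prod/")
print(json.dumps(policy, indent=2))
```

One statement covers every current and future key under /myapp/prod, which is the payoff of a consistent naming scheme.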
Parameter Store vs Secrets Manager
| Feature | Parameter Store | Secrets Manager |
|---|---|---|
| Cost | Free (standard) | $0.40/secret/month |
| Auto rotation | No (use Lambda) | Built-in (RDS, Redshift) |
| Cross-account | Advanced tier only (shared via AWS RAM) | Yes (resource policies) |
| Multi-region replication | No | Yes |
| Best for | Config + cheap secrets | Database credentials, API keys |
SSM Session Manager
Session Manager provides secure, browser- or CLI-based shell access to instances — with no open ports, no SSH keys, no bastion host.
How it works
User → IAM authenticated → Session Manager → SSM Agent → instance shell
(no inbound port 22 needed)
Key features
- Session activity can be logged to CloudWatch Logs or S3 for a full audit trail (logging must be enabled in the Session Manager preferences)
- Works for instances in private subnets (no internet required with VPC endpoints)
- Supports both Linux (bash) and Windows (PowerShell)
# Start a session from AWS CLI
aws ssm start-session --target i-0abc123def456
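# Port forwarding (tunnel local port 5432 → RDS in private subnet, via the instance)
# <rds-endpoint> is a placeholder for your database hostname
aws ssm start-session \
--target i-0abc123def456 \
--document-name AWS-StartPortForwardingSessionToRemoteHost \
--parameters '{"host":["<rds-endpoint>"],"portNumber":["5432"],"localPortNumber":["5432"]}'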
Warning
Session Manager requires the ssm:StartSession IAM permission for the user AND an SSM-enabled IAM role on the instance. If either is missing, sessions won't start.
SSM Patch Manager
Patch Manager automates OS and application patching across your fleet.
Core concepts
| Concept | Description |
|---|---|
| Patch Baseline | Defines which patches are approved or rejected, including auto-approval rules |
| Patch Group | Tag (Patch Group = prod) that links instances to a baseline |
| Maintenance Window | Scheduled time window to run patching |
Patch baseline rules
Auto-approve after: 7 days
Severity: Critical, Important
Classification: SecurityUpdates, BugFix
→ Patches meeting criteria auto-approve 7 days after release
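The rule above can be read as a simple predicate. The sketch below models it in Python purely for illustration: it is not how Patch Manager is implemented, and the exact boundary handling (whether day 7 itself counts) is an assumption.

```python
from datetime import date, timedelta

# Values taken from the example rule above
APPROVED_SEVERITIES = {"Critical", "Important"}
APPROVED_CLASSIFICATIONS = {"SecurityUpdates", "BugFix"}
APPROVE_AFTER_DAYS = 7


def is_auto_approved(release_date, severity, classification, today):
    """Return True if the patch matches the rule and the delay has elapsed."""
    if severity not in APPROVED_SEVERITIES:
        return False
    if classification not in APPROVED_CLASSIFICATIONS:
        return False
    # Assumption: the patch becomes approved once 7 full days have passed
    return today >= release_date + timedelta(days=APPROVE_AFTER_DAYS)


released = date(2024, 1, 1)
print(is_auto_approved(released, "Critical", "SecurityUpdates", date(2024, 1, 5)))
print(is_auto_approved(released, "Critical", "SecurityUpdates", date(2024, 1, 8)))
```

The delay gives you a buffer to catch problematic patches in lower environments before they are approved for production.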
AWS-provided default baselines per OS
- AWS-AmazonLinux2DefaultPatchBaseline
- AWS-UbuntuDefaultPatchBaseline
- AWS-WindowsServerDefaultPatchBaseline
Hands-on: patch your fleet
1. Create a Patch Baseline (or use AWS defaults)
2. Tag instances: Patch Group = prod
3. Associate the baseline with the patch group
4. Create a Maintenance Window:
- Schedule: cron(0 2 ? * SUN *) → every Sunday at 2am (UTC unless a time zone is set on the window)
- Duration: 2 hours
- Stop initiating: 1 hour before end
5. Register a Run Command task:
- Document: AWS-RunPatchBaseline
- Operation: Install
- Target: tag Patch Group = prod
6. Monitor in Patch Manager → Patch compliance
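The maintenance window settings in step 4 map onto the CreateMaintenanceWindow API roughly as sketched below. The window name is hypothetical, and `create_window` assumes boto3 is installed and configured. The small helper works out when tasks stop being initiated for one occurrence of the window.

```python
from datetime import datetime, timedelta

# Settings taken from the steps above; the window name is hypothetical.
WINDOW_REQUEST = {
    "Name": "prod-weekly-patching",
    "Schedule": "cron(0 2 ? * SUN *)",
    "Duration": 2,   # hours the window stays open
    "Cutoff": 1,     # hours before the end when new tasks stop starting
    "AllowUnassociatedTargets": False,
}


def window_times(start, duration_hours, cutoff_hours):
    """Return (last_task_start, window_end) for one window occurrence."""
    end = start + timedelta(hours=duration_hours)
    return end - timedelta(hours=cutoff_hours), end


def create_window(request):
    """Create the window with boto3 (assumed installed and configured)."""
    import boto3  # imported lazily so the rest runs without boto3
    return boto3.client("ssm").create_maintenance_window(**request)


sunday_2am = datetime(2024, 1, 7, 2, 0)  # a Sunday
last_start, end = window_times(
    sunday_2am, WINDOW_REQUEST["Duration"], WINDOW_REQUEST["Cutoff"]
)
print(f"No new tasks after {last_start:%H:%M}; window closes at {end:%H:%M}")
```

With a 2-hour window and a 1-hour cutoff, patching that has not started by 03:00 waits for the next window, so size the window to your fleet and concurrency settings.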
Compliance reporting
After patching, check SSM Patch Manager for compliance status — which instances are compliant, non-compliant, or missing patches. This feeds into AWS Security Hub.
SSM Fleet Manager
Fleet Manager is a unified UI for managing your entire instance fleet — view file system contents, performance metrics, running services, and user sessions — all without connecting to the instance.
Key capabilities:
- View OS performance metrics (CPU, memory, disk)
- Browse the remote file system
- Manage Windows Registry
- View and manage running services
- Start/stop Session Manager sessions
SSM Default Host Management Configuration (DHMC)
DHMC automatically configures new EC2 instances to use SSM as managed nodes — without manually attaching an IAM instance profile.
Enable DHMC in an AWS Region
→ Any EC2 instance launched (even without an explicit IAM profile)
→ Systems Manager manages it through the DHMC service role (by default with the AmazonSSMManagedEC2InstanceDefaultPolicy policy)
→ Appears as managed node in SSM Fleet Manager
This is ideal for organisations that want all instances managed by default.
SSM Inventory & State Manager
Inventory
Inventory collects metadata from your instances on a schedule:
- Installed applications
- Network configuration
- Running services
- Windows roles
- Custom inventory data
With a resource data sync, inventory data is exported to an S3 bucket, where it can be queried with Amazon Athena. AWS Config can also record inventory changes.
-- Query inventory with Athena
SELECT instanceid, name, version
FROM "ssm_inventory"."aws_application"
WHERE name = 'nginx'
State Manager
State Manager ensures instances stay in a defined configuration (desired state). It runs SSM Documents on a schedule.
Document: AWS-ConfigureAWSPackage
Schedule: Every 30 minutes
Parameters: { "action": "Install", "name": "AmazonCloudWatchAgent" }
→ State Manager ensures CloudWatch Agent is always installed on all managed nodes
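Expressed as an API request, that association might look like the following sketch. The association name is hypothetical, the `InstanceIds` / `*` target is used here to mean all managed nodes, and `create` assumes boto3 is installed and configured.

```python
import json


def build_association_request():
    """Build keyword arguments for the SSM CreateAssociation API."""
    return {
        "Name": "AWS-ConfigureAWSPackage",              # document to run
        "AssociationName": "install-cloudwatch-agent",  # hypothetical name
        "Targets": [{"Key": "InstanceIds", "Values": ["*"]}],
        "ScheduleExpression": "rate(30 minutes)",
        # Document parameters are passed as lists of strings
        "Parameters": {
            "action": ["Install"],
            "name": ["AmazonCloudWatchAgent"],
        },
    }


def create(request):
    """Create the association with boto3 (assumed installed and configured)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("ssm").create_association(**request)


association = build_association_request()
print(json.dumps(association, indent=2))
```

Because the association reruns on its schedule, a node that loses the agent is brought back to the desired state on the next run.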
SSM Distributor
Distributor packages and deploys software to managed nodes. You can host your own packages or use AWS-provided ones (CloudWatch Agent, CodeDeploy Agent, Inspector Agent).
1. Create a package (ZIP with scripts + install manifest)
2. Upload to Distributor
3. Use "Install on a schedule" via State Manager
→ ensures package stays installed
SSM OpsCenter
OpsCenter aggregates operational issues (OpsItems) from CloudWatch Alarms, EventBridge rules, Security Hub findings, and Config non-compliance — providing a single place to investigate and remediate.
CloudWatch Alarm fires
→ Creates OpsItem in OpsCenter
→ Engineer investigates with related resources, runbooks, and history
→ Runs SSM Automation to remediate
→ OpsItem resolved
Hands-on: Session Manager for Private Instances
Goal: Connect to a private EC2 instance without a bastion host.
Prerequisites:
- Instance in a private subnet (no inbound security group rules needed)
- IAM Instance Profile with AmazonSSMManagedInstanceCore
- SSM VPC endpoints OR instance in subnet with NAT gateway (for SSM traffic)
VPC Endpoints needed (if no NAT):
- com.amazonaws.<region>.ssm
- com.amazonaws.<region>.ssmmessages
- com.amazonaws.<region>.ec2messages
Steps:
1. Create EC2 with the IAM role above (no key pair needed)
2. Confirm instance shows as "Online" in SSM Fleet Manager
3. In SSM → Session Manager → Start Session → select instance
4. Shell opens in browser — no SSH key, no open port 22!
Common SOA-C03 Exam Questions
Q: An instance shows as "offline" in SSM. What do you check first?
Check that the IAM instance profile has AmazonSSMManagedInstanceCore. Then verify the SSM Agent is running and the instance has outbound connectivity to SSM endpoints (or VPC endpoints are configured).
Q: How do you execute a command on 500 instances without SSH?
Use SSM Run Command with AWS-RunShellScript, targeting by tag or resource group. Set a concurrency rate (e.g., 10%) and an error threshold to control rollout.
Q: What's the difference between SSM Run Command and SSM Automation?
Run Command executes commands directly on instances via the SSM Agent. Automation runs multi-step workflows that can call AWS APIs (not just instance commands), which is useful for patching AMIs, orchestrating multi-resource operations, or triggering Lambda functions as part of a runbook.
Q: An app needs a database password without hardcoding it. What SSM feature do you use?
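SSM Parameter Store with a SecureString parameter (KMS-encrypted). Retrieve it at runtime with the GetParameter API and decryption enabled (aws ssm get-parameter --with-decryption in the CLI). Grant the instance IAM role ssm:GetParameter scoped to the specific path, plus kms:Decrypt if the parameter uses a customer managed KMS key.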
Common Mistakes
- No IAM role on instance: the #1 reason instances don't appear in SSM
- Relying on default Parameter Store throughput for high-volume reads: the default limit is 40 TPS regardless of tier; enable the higher-throughput setting or cache values
- Patch Group tag case sensitivity: the tag key is Patch Group (exact case); wrong casing breaks the baseline association
- Forgetting SSM VPC endpoints for private subnets: private instances need 3 endpoints (ssm, ssmmessages, ec2messages) if no NAT gateway
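For the throughput mistake, caching parameter values in the application is often enough to stay under the limit. The sketch below is one minimal way to do that with a time-to-live cache. The class name and TTL are illustrative choices, and `fetch_db_password` assumes boto3 is installed and the parameter from the earlier example exists.

```python
import time


class CachedParameter:
    """Cache a fetched value and refresh it only after the TTL expires."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch        # zero-argument callable returning the value
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: call Parameter Store once
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


def fetch_db_password():
    """Read the SecureString with boto3 (assumed installed and configured)."""
    import boto3  # imported lazily so the cache works without boto3
    response = boto3.client("ssm").get_parameter(
        Name="/myapp/prod/db_password", WithDecryption=True
    )
    return response["Parameter"]["Value"]


# One API call per five minutes, however often the app reads the value
db_password = CachedParameter(fetch_db_password, ttl_seconds=300)
```

Calling `db_password.get()` on every request then costs at most one GetParameter call per TTL period per process; pick a TTL short enough that rotated values are picked up in reasonable time.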
What to Learn Next
- AWS CloudWatch Monitoring — alert on SSM compliance and operational metrics
- AWS CloudFormation for CloudOps — automate infrastructure alongside SSM automation
- AWS Security & Compliance — integrate SSM findings with Security Hub
