AWS Systems Manager (SSM) — Managing EC2 at Scale

Intermediate | Topic | 50 min | 10 min read | 26 Apr 2026 | AWS

Manage fleets of EC2 instances at scale. Covers Run Command, Session Manager, Patch Manager, Parameter Store, and automation — all key SOA-C03 topics.

What you'll learn

  • Understand what SSM is and how the SSM Agent enables management without SSH
  • Use Run Command to execute scripts across a fleet of instances
  • Manage secrets and config using Parameter Store
  • Patch instances at scale using Patch Manager and Maintenance Windows
  • Establish secure shell-less access via Session Manager
  • Automate operational tasks using SSM Automation

Relevant for certifications

SOA-C03

What is AWS Systems Manager?

AWS Systems Manager (SSM) is a unified operations platform for managing AWS resources at scale. It provides a secure, agent-based channel to your EC2 instances — without opening SSH ports or maintaining bastion hosts.

Why SSM matters for CloudOps

SSM is the backbone of AWS operational automation. The SOA-C03 exam tests it heavily. Know every sub-service.

The SSM Agent is pre-installed on modern Amazon Linux, Ubuntu, and Windows AMIs. For the agent to communicate with the SSM service, your instance needs:

  1. The SSM Agent installed and running
  2. An IAM Instance Profile with AmazonSSMManagedInstanceCore policy
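  3. Outbound HTTPS (port 443) access to the SSM endpoints, either through the internet / a NAT gateway or through VPC endpoints for SSM

# Check SSM Agent status (Amazon Linux, or Ubuntu when installed from the deb package)
sudo systemctl status amazon-ssm-agent

# On Ubuntu AMIs where the agent is installed as a snap
sudo snap services amazon-ssm-agent

# Start if stopped
sudo systemctl start amazon-ssm-agent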

SSM Resource Groups & Tags

SSM integrates with AWS Resource Groups to let you target operations by tag.

Instance tags:
  Environment = Production
  Team        = Platform

→ Create resource group "prod-platform-servers"
→ Target all Run Command / Patch operations at this group

Best practice: tag everything consistently. SSM is the payoff.
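
To make tag-based targeting concrete, here is a minimal Python sketch that builds the `Targets` structure accepted by Run Command and other SSM APIs. The helper names are hypothetical. The `tag:<key>` and `resource-groups:Name` target keys follow the SSM API conventions, and the example values are the tags and group name shown above. Entries with different tag keys should narrow the selection to instances carrying all of them, but verify that against the current API reference before relying on it.

```python
def tag_targets(tags):
    """Build SSM Targets entries, one per tag key (hypothetical helper)."""
    # Each entry has the form {"Key": "tag:<TagKey>", "Values": [<TagValue>]}.
    return [{"Key": f"tag:{key}", "Values": [value]}
            for key, value in sorted(tags.items())]


def resource_group_target(group_name):
    """Build a Targets entry that selects the members of one resource group."""
    return [{"Key": "resource-groups:Name", "Values": [group_name]}]


if __name__ == "__main__":
    print(tag_targets({"Environment": "Production", "Team": "Platform"}))
    print(resource_group_target("prod-platform-servers"))
```

Either list can be passed as the `Targets` argument of a Run Command or State Manager request instead of a fixed list of instance IDs.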


SSM Run Command

Run Command executes scripts or commands on one or many instances — no SSH required.

How it works

1. You select target instances (by ID, tag, or resource group)
2. You choose an SSM Document (a script definition)
3. SSM pushes the command to the SSM Agent on each instance
4. Results are returned to the console / S3 / CloudWatch Logs

Key SSM Documents

Document                  Purpose
AWS-RunShellScript        Run bash commands on Linux
AWS-RunPowerShellScript   Run PowerShell on Windows
AWS-UpdateSSMAgent        Update the agent itself
AWS-RunPatchBaseline      Trigger patching on an instance

# Example: Run a shell command via AWS CLI
aws ssm send-command \
  --instance-ids i-0abc123def456 \
  --document-name "AWS-RunShellScript" \
  --parameters '{"commands":["df -h","free -m"]}' \
  --output-s3-bucket-name my-ssm-output-bucket

Rate Control

Run Command supports concurrency and error threshold settings — e.g., run on 10% of instances at a time, stop if 5% fail.
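
As a rough illustration of rate control, the Python sketch below assembles keyword arguments in the shape expected by boto3's `ssm.send_command()`, including `MaxConcurrency` and `MaxErrors`. The function names are hypothetical, and the rounding used to estimate the wave size is an assumption for illustration, not documented SSM behaviour.

```python
import math


def build_send_command_request(tag_key, tag_value, commands,
                               max_concurrency="10%", max_errors="5%"):
    """Return keyword arguments shaped for boto3's ssm.send_command()."""
    return {
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": list(commands)},
        "MaxConcurrency": max_concurrency,  # a count ("50") or a percentage ("10%")
        "MaxErrors": max_errors,            # failures tolerated before SSM stops sending
    }


def instances_per_wave(fleet_size, max_concurrency):
    """Estimate how many instances run at once (rounding up is an assumption)."""
    if max_concurrency.endswith("%"):
        percent = int(max_concurrency[:-1])
        return max(1, math.ceil(fleet_size * percent / 100))
    return min(fleet_size, int(max_concurrency))


if __name__ == "__main__":
    request = build_send_command_request(
        "Environment", "Production", ["df -h", "free -m"])
    print(request)
    print(instances_per_wave(500, request["MaxConcurrency"]))
    # With boto3 installed and credentials configured, this could be sent with:
    #   boto3.client("ssm").send_command(**request)
```

Keeping the request as plain data makes it easy to review the blast radius (targets, concurrency, error threshold) before anything is executed.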


SSM Automation

SSM Automation runs multi-step operational playbooks (runbooks). Unlike Run Command (which runs on the instance), Automation runs workflows that can call AWS APIs, Lambda functions, and nested automations.

Common use cases

  • Patching AMIs: stop instance → create AMI → patch → bake new AMI → update ASG launch template
  • Remediating findings: triggered by Security Hub / Config Rules
  • Scheduled maintenance: triggered by EventBridge

Key automation documents

Document                          Purpose
AWS-StopEC2Instance               Stop an instance
AWS-CreateImage                   Create an AMI snapshot
AWS-PatchInstanceWithRollback     Patch with automatic rollback
AWSSupport-RunEC2RescueForLinux   Fix common Linux boot issues

# Simplified automation runbook structure
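schemaVersion: "0.3"
description: Stop an instance and create an AMI from it
parameters:
  InstanceId:
    type: String
    description: ID of the instance to stop and image
mainSteps:
  # Stop the instance so the image is taken from a consistent disk state
  - name: StopInstance
    action: aws:changeInstanceState
    inputs:
      InstanceIds: ["{{ InstanceId }}"]
      DesiredState: stopped
  # Create the AMI; global:DATE expands to the date of the run
  - name: CreateAMI
    action: aws:createImage
    inputs:
      InstanceId: "{{ InstanceId }}"
      ImageName: "Patched-{{ InstanceId }}-{{ global:DATE }}"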

EventBridge → SSM Automation

A common CloudOps pattern: trigger automation from an event.

Config rule detects non-compliant resource
  → EventBridge rule fires
    → SSM Automation remediates it automatically
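
The first two arrows of this pattern can be sketched as an EventBridge event pattern. The Python snippet below defines a pattern for AWS Config compliance-change events and a deliberately simplified local matcher so the pattern can be sanity-checked without an AWS account. The matcher covers only exact-value lists and nested objects, not the full EventBridge pattern language, and the sample event (including the rule name) is illustrative.

```python
import json

# Event pattern for an EventBridge rule reacting to AWS Config compliance changes.
NON_COMPLIANT_PATTERN = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {"newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]}},
}


def matches(pattern, event):
    """Simplified matcher: dicts recurse, lists mean 'event value is one of'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


if __name__ == "__main__":
    sample_event = {  # illustrative event, trimmed to the fields the pattern uses
        "source": "aws.config",
        "detail-type": "Config Rules Compliance Change",
        "detail": {
            "configRuleName": "restricted-ssh",
            "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
        },
    }
    print(json.dumps(NON_COMPLIANT_PATTERN, indent=2))
    print(matches(NON_COMPLIANT_PATTERN, sample_event))
```

The rule's target would then be the Automation runbook that performs the remediation.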

SSM Parameter Store

Parameter Store provides secure, hierarchical storage for configuration data and secrets.

Parameter types

Type           Description     Example
String         Plain text      db_host = rds.amazonaws.com
StringList     CSV list        allowed_ips = 10.0.0.1,10.0.0.2
SecureString   KMS-encrypted   db_password = (encrypted)

Tiers

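                     Standard   Advanced
Max value size       4 KB       8 KB
Parameter policies   No         Yes (TTL / expiry)
Cost                 Free       $0.05/parameter/month

Throughput is an account-level setting rather than a tier feature: the default limit is 40 TPS, and enabling the higher-throughput setting (charged per request) raises GetParameter to as much as 10,000 TPS.

# Store a parameter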
aws ssm put-parameter \
  --name "/myapp/prod/db_password" \
  --value "SuperSecret123" \
  --type SecureString

# Read it
aws ssm get-parameter \
  --name "/myapp/prod/db_password" \
  --with-decryption

# Read all parameters under a path
aws ssm get-parameters-by-path \
  --path "/myapp/prod/" \
  --with-decryption

Naming convention

Use a path hierarchy like /app/env/key — this makes IAM policies and bulk-reads clean.
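
To show why the path hierarchy keeps IAM policies clean, here is a minimal Python sketch that generates a read-only policy for one path. The function name, region, and account ID are placeholders. The resource list includes both the path itself and everything beneath it, so single-parameter reads and path-based bulk reads are both covered.

```python
import json


def parameter_path_policy(region, account_id, path):
    """Build an IAM policy document granting read access to one parameter path."""
    prefix = path.strip("/")  # "/myapp/prod/" -> "myapp/prod"
    base_arn = f"arn:aws:ssm:{region}:{account_id}:parameter/{prefix}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ssm:GetParameter",
                    "ssm:GetParameters",
                    "ssm:GetParametersByPath",
                ],
                # The path itself plus every parameter stored under it.
                "Resource": [base_arn, f"{base_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID.
    policy = parameter_path_policy("us-east-1", "123456789012", "/myapp/prod/")
    print(json.dumps(policy, indent=2))
```

A SecureString encrypted with a customer managed KMS key would additionally need `kms:Decrypt` on that key.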

Parameter Store vs Secrets Manager

Feature                    Parameter Store                      Secrets Manager
Cost                       Free (standard)                      $0.40/secret/month
Auto rotation              No (use Lambda)                      Built-in (RDS, Redshift)
Cross-account              Advanced tier only (via AWS RAM)     Yes
Multi-region replication   No                                   Yes
Best for                   Config + cheap secrets               Database credentials, API keys

SSM Session Manager

Session Manager provides secure, browser- or CLI-based shell access to instances — with no open ports, no SSH keys, no bastion host.

How it works

User → IAM authenticated → Session Manager → SSM Agent → instance shell
                                          (no inbound port 22 needed)

Key features

  • All session activity logged to CloudWatch Logs or S3 (full audit trail)
  • Works for instances in private subnets (no internet required with VPC endpoints)
  • Supports both Linux (bash) and Windows (PowerShell)

# Start a session from AWS CLI (requires the Session Manager plugin)
aws ssm start-session --target i-0abc123def456

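# Port forwarding (tunnel local port 5432 → RDS in private subnet, via the instance)
# Forwarding to another host needs the ...ToRemoteHost document and a host parameter;
# replace <rds-endpoint> with the database endpoint.
aws ssm start-session \
  --target i-0abc123def456 \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{"host":["<rds-endpoint>"],"portNumber":["5432"],"localPortNumber":["5432"]}'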

Warning

Session Manager requires two things: the user needs the ssm:StartSession IAM permission, and the instance needs an SSM-enabled IAM role. If either is missing, sessions won't start.
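
On the user side, the permission can be scoped by instance tag. The Python sketch below builds such a policy. It is modelled on AWS's published sample policies for Session Manager, so treat the condition key and the session ARN pattern as assumptions to check against current documentation. The function name and tag values are hypothetical.

```python
import json


def start_session_policy(tag_key, tag_value):
    """Build an IAM policy allowing sessions only to instances with a given tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ssm:StartSession"],
                "Resource": ["arn:aws:ec2:*:*:instance/*"],
                "Condition": {
                    "StringLike": {f"ssm:resourceTag/{tag_key}": [tag_value]}
                },
            },
            {
                # Let users resume or end only the sessions they started themselves.
                "Effect": "Allow",
                "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
                "Resource": ["arn:aws:ssm:*:*:session/${aws:userid}-*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(start_session_policy("Environment", "Production"), indent=2))
```

The instance side is unchanged: it still needs an instance profile such as AmazonSSMManagedInstanceCore.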


SSM Patch Manager

Patch Manager automates OS and application patching across your fleet.

Core concepts

Concept              Description
Patch Baseline       Defines which patches are approved/rejected, auto-approved rules
Patch Group          Tag (Patch Group = prod) that links instances to a baseline
Maintenance Window   Scheduled time window to run patching

Patch baseline rules

Auto-approve after: 7 days
Severity: Critical, Important
Classification: SecurityUpdates, BugFix

→ Patches meeting criteria auto-approve 7 days after release
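
The rule above can be read as a filter plus a delay. The following Python sketch expresses that logic locally. It is an illustration of the idea, not Patch Manager's implementation, and the exact boundary (whether approval lands on day 7 itself) is an assumption.

```python
from datetime import date, timedelta

APPROVED_SEVERITIES = {"Critical", "Important"}
APPROVED_CLASSIFICATIONS = {"SecurityUpdates", "BugFix"}
AUTO_APPROVE_AFTER_DAYS = 7


def is_auto_approved(severity, classification, released, today):
    """Return True once a patch matches the filters and the delay has elapsed."""
    matches_filters = (severity in APPROVED_SEVERITIES
                       and classification in APPROVED_CLASSIFICATIONS)
    delay_elapsed = today >= released + timedelta(days=AUTO_APPROVE_AFTER_DAYS)
    return matches_filters and delay_elapsed


if __name__ == "__main__":
    released = date(2026, 4, 1)
    for today in (date(2026, 4, 7), date(2026, 4, 8)):
        print(today, is_auto_approved("Critical", "SecurityUpdates", released, today))
```

The delay gives a buffer to catch patches that are withdrawn or found to be faulty shortly after release.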

AWS-provided default baselines per OS

  • AWS-AmazonLinux2DefaultPatchBaseline
  • AWS-UbuntuDefaultPatchBaseline
  • AWS-DefaultPatchBaseline (Windows Server)

Hands-on: patch your fleet

1. Create a Patch Baseline (or use AWS defaults)
2. Tag instances: Patch Group = prod
3. Associate the baseline with the patch group
4. Create a Maintenance Window:
   - Schedule: cron(0 2 ? * SUN *)  → every Sunday at 2am
   - Duration: 2 hours
   - Stop initiating: 1 hour before end
5. Register a Run Command task:
   - Document: AWS-RunPatchBaseline
   - Operation: Install
   - Target: tag Patch Group = prod
6. Monitor in Patch Manager → Patch compliance
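
The duration and cutoff settings in step 4 interact: tasks can only start between the window opening and the cutoff. A small Python sketch of that arithmetic, using the values from the steps above (the function name and the specific Sunday are illustrative):

```python
from datetime import datetime, timedelta


def window_times(start, duration_hours, cutoff_hours):
    """Return (last moment new tasks may start, window end) for one window run."""
    end = start + timedelta(hours=duration_hours)
    stop_initiating = end - timedelta(hours=cutoff_hours)
    return stop_initiating, end


if __name__ == "__main__":
    start = datetime(2026, 4, 26, 2, 0)  # a Sunday at 02:00
    stop_initiating, end = window_times(start, duration_hours=2, cutoff_hours=1)
    print("opens:           ", start)
    print("stops initiating:", stop_initiating)
    print("closes:          ", end)
```

With a 2-hour window and a 1-hour cutoff, only the first hour is available for starting patch tasks, so the concurrency setting has to be high enough to reach the whole patch group in that time.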

Compliance reporting

After patching, check SSM Patch Manager for compliance status — which instances are compliant, non-compliant, or missing patches. This feeds into AWS Security Hub.


SSM Fleet Manager

Fleet Manager is a unified UI for managing your entire instance fleet — view file system contents, performance metrics, running services, and user sessions — all without connecting to the instance.

Key capabilities:

  • View OS performance metrics (CPU, memory, disk)
  • Browse the remote file system
  • Manage Windows Registry
  • View and manage running services
  • Start/stop Session Manager sessions

SSM Default Host Management Configuration (DHMC)

DHMC automatically configures new EC2 instances to use SSM as managed nodes — without manually attaching an IAM instance profile.

Enable DHMC in an AWS Region
  → Any EC2 instance using IMDSv2 with a recent SSM Agent (even without an explicit IAM profile)
    → Agent gets its permissions from the DHMC service role (AmazonSSMManagedEC2InstanceDefaultPolicy)
      → Appears as managed node in SSM Fleet Manager

This is ideal for organisations that want all instances managed by default.


SSM Inventory & State Manager

Inventory

Inventory collects metadata from your instances on a schedule:

  • Installed applications
  • Network configuration
  • Running services
  • Windows roles
  • Custom inventory data

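Inventory data is held by Systems Manager. To query it with Amazon Athena, configure a resource data sync that exports it to an S3 bucket; AWS Config can also record inventory changes.

-- Query synced inventory with Athena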
SELECT instanceid, name, version
FROM "ssm_inventory"."aws_application"
WHERE name = 'nginx'

State Manager

State Manager ensures instances stay in a defined configuration (desired state). It runs SSM Documents on a schedule.

Document: AWS-ConfigureAWSPackage
Schedule: Every 30 minutes
Parameters: { "action": "Install", "name": "AmazonCloudWatchAgent" }

→ State Manager ensures CloudWatch Agent is always installed on all managed nodes
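
The association above could be expressed as a request in the shape expected by boto3's `ssm.create_association()`. This Python sketch only builds the arguments. The function name and the association name are hypothetical, and targeting `InstanceIds` with `*` is used here to mean all managed nodes in the account and Region.

```python
def build_association_request(package_name, every_minutes=30):
    """Return keyword arguments shaped for boto3's ssm.create_association()."""
    return {
        "Name": "AWS-ConfigureAWSPackage",              # the SSM Document to apply
        "AssociationName": f"install-{package_name}",   # hypothetical naming scheme
        "ScheduleExpression": f"rate({every_minutes} minutes)",
        "Targets": [{"Key": "InstanceIds", "Values": ["*"]}],
        # Document parameters are passed as lists of strings.
        "Parameters": {"action": ["Install"], "name": [package_name]},
    }


if __name__ == "__main__":
    request = build_association_request("AmazonCloudWatchAgent")
    print(request)
    # With boto3 installed and credentials configured, this could be sent with:
    #   boto3.client("ssm").create_association(**request)
```

Because the association reruns on a schedule, an instance that loses the package is brought back to the desired state on the next run.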

SSM Distributor

Distributor packages and deploys software to managed nodes. You can host your own packages or use AWS-provided ones (CloudWatch Agent, CodeDeploy Agent, Inspector Agent).

1. Create a package (ZIP with scripts + install manifest)
2. Upload to Distributor
3. Use "Install on a schedule" via State Manager
   → ensures package stays installed

SSM OpsCenter

OpsCenter aggregates operational issues (OpsItems) from CloudWatch Alarms, EventBridge rules, Security Hub findings, and Config non-compliance — providing a single place to investigate and remediate.

CloudWatch Alarm fires
  → Creates OpsItem in OpsCenter
    → Engineer investigates with related resources, runbooks, and history
      → Runs SSM Automation to remediate
        → OpsItem resolved

Hands-on: Session Manager for Private Instances

Goal: Connect to a private EC2 instance without a bastion host.

Prerequisites:
- Instance in a private subnet (no inbound security group rules needed)
- IAM Instance Profile with AmazonSSMManagedInstanceCore
- SSM VPC endpoints OR instance in subnet with NAT gateway (for SSM traffic)

VPC Endpoints needed (if no NAT):
- com.amazonaws.<region>.ssm
- com.amazonaws.<region>.ssmmessages
- com.amazonaws.<region>.ec2messages

Steps:
1. Create EC2 with the IAM role above (no key pair needed)
2. Confirm instance shows as "Online" in SSM Fleet Manager
3. In SSM → Session Manager → Start Session → select instance
4. Shell opens in browser — no SSH key, no open port 22!
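
The endpoint list in the prerequisites follows a fixed naming pattern, so it is easy to generate per Region when scripting the setup. A minimal Python sketch (the function name is hypothetical):

```python
# Services listed in the prerequisites above for private-subnet access.
SSM_ENDPOINT_SERVICES = ("ssm", "ssmmessages", "ec2messages")


def ssm_endpoint_service_names(region):
    """Return the VPC endpoint service names SSM needs in the given Region."""
    return [f"com.amazonaws.{region}.{service}" for service in SSM_ENDPOINT_SERVICES]


if __name__ == "__main__":
    for name in ssm_endpoint_service_names("eu-west-1"):
        print(name)
```

Each name would be used as the service name when creating an interface VPC endpoint in the instance's VPC.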

Common SOA-C03 Exam Questions

Q: An instance shows as "offline" in SSM. What do you check first?
A: Check that the IAM instance profile has AmazonSSMManagedInstanceCore. Then verify the SSM Agent is running and the instance has outbound connectivity to SSM endpoints (or VPC endpoints are configured).

Q: How do you execute a command on 500 instances without SSH?
A: Use SSM Run Command with AWS-RunShellScript, targeting by tag or resource group. Set a concurrency rate (e.g., 10%) and an error threshold to control rollout.

Q: What's the difference between SSM Run Command and SSM Automation?
A: Run Command executes commands directly on instances via the SSM Agent. Automation runs multi-step workflows that can call AWS APIs (not just instance commands), which is useful for patching AMIs, orchestrating multi-resource operations, or triggering Lambda functions as part of a runbook.

Q: An app needs a database password without hardcoding it. What SSM feature do you use?
A: SSM Parameter Store with a SecureString parameter (KMS-encrypted). Retrieve it at runtime with GetParameter and decryption enabled (--with-decryption in the CLI). Grant the instance IAM role ssm:GetParameter scoped to the specific path, plus kms:Decrypt if the parameter uses a customer managed key.


Common Mistakes

  • No IAM role on instance: the #1 reason instances don't appear in SSM
  • Relying on default Parameter Store throughput for high request rates: requests are throttled at 40 TPS by default; enable the higher-throughput setting or cache values
  • Patch Group tag case sensitivity: the tag key must be exactly Patch Group (or PatchGroup when tags are exposed through instance metadata); wrong casing breaks the baseline association
  • Forgetting SSM VPC endpoints for private subnets: private instances need 3 endpoints (ssm, ssmmessages, ec2messages) if no NAT gateway
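
For the throughput point, caching is often the simplest fix. Below is a minimal time-based cache sketched in Python. The class name is hypothetical, and `fetch` stands in for whatever function actually calls GetParameter, so the example runs without AWS access.

```python
import time


class ParameterCache:
    """Cache parameter values for a fixed time to reduce Parameter Store calls."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # callable taking a parameter name, returning its value
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # name -> (time fetched, value)

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no API call
        value = self._fetch(name)
        self._entries[name] = (now, value)
        return value


if __name__ == "__main__":
    calls = []

    def fake_fetch(name):
        # Stand-in for a real GetParameter call.
        calls.append(name)
        return "example-value"

    cache = ParameterCache(fake_fetch, ttl_seconds=60)
    cache.get("/myapp/prod/db_host")
    cache.get("/myapp/prod/db_host")
    print(len(calls))
```

The TTL is a trade-off: a longer value cuts API calls further but delays how quickly a rotated secret or changed setting is picked up.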

What to Learn Next

  1. AWS CloudWatch Monitoring — alert on SSM compliance and operational metrics
  2. AWS CloudFormation for CloudOps — automate infrastructure alongside SSM automation
  3. AWS Security & Compliance — integrate SSM findings with Security Hub
