Amazon S3 for CloudOps — Storage, Security & Data Management

Intermediate · 50 min · 12 min read · 26 Apr 2026 · AWS

S3 operations and governance for CloudOps. Versioning, replication, lifecycle policies, security, event notifications, Athena queries, and storage classes — all tested on SOA-C03.

What you'll learn

  • Manage S3 versioning, MFA delete, and replication
  • Design lifecycle policies to automate storage class transitions
  • Configure bucket policies, ACLs, and access controls
  • Use S3 Event Notifications to trigger downstream processing
  • Query S3 data with Athena
  • Perform bulk operations with S3 Batch Operations

Prerequisites

Relevant for certifications

SOA-C03, SAA-C03

S3 Storage Classes

Choose the right storage class based on access frequency and retrieval requirements:

Class | Use case | Retrieval | Min duration | Cost
S3 Standard | Frequently accessed | Milliseconds | None | Highest
S3 Standard-IA | Infrequent access, rapid retrieval | Milliseconds | 30 days | Lower storage, retrieval fee
S3 One Zone-IA | Infrequent, re-creatable data | Milliseconds | 30 days | 20% cheaper than Standard-IA
S3 Intelligent-Tiering | Unknown or changing access patterns | Milliseconds | None | Monitoring fee per object
S3 Glacier Instant Retrieval | Archive, occasional access | Milliseconds | 90 days | Low
S3 Glacier Flexible Retrieval | Archive, minutes-to-hours retrieval | Minutes to 12 hours | 90 days | Very low
S3 Glacier Deep Archive | Long-term archive, annual access | 12–48 hours | 180 days | Lowest

Intelligent-Tiering

Intelligent-Tiering automatically moves objects between access tiers based on observed usage, with no retrieval fees. Optionally enable the Deep Archive Access tier to move objects not accessed for 180 days into archive storage at near-Deep-Archive cost. Ideal for unpredictable workloads.
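The table's decision logic can be sketched as a small lookup helper. The storage class names are the real values accepted by, for example, `aws s3 cp --storage-class`; the access-pattern labels and the mapping itself are a simplification for illustration:

```python
# Values are real S3 StorageClass names; the access-pattern keys are
# illustrative labels summarising the table above.
STORAGE_CLASS_FOR = {
    "frequent": "STANDARD",
    "infrequent": "STANDARD_IA",
    "infrequent-recreatable": "ONEZONE_IA",
    "unknown": "INTELLIGENT_TIERING",
    "archive-instant": "GLACIER_IR",
    "archive-flexible": "GLACIER",
    "archive-deep": "DEEP_ARCHIVE",
}

def pick_storage_class(access_pattern: str) -> str:
    """Map a coarse access pattern to an S3 storage class name."""
    try:
        return STORAGE_CLASS_FOR[access_pattern]
    except KeyError:
        raise ValueError(f"unknown access pattern: {access_pattern!r}") from None

print(pick_storage_class("unknown"))        # INTELLIGENT_TIERING
print(pick_storage_class("archive-deep"))   # DEEP_ARCHIVE
```

In practice the choice also depends on object size, retrieval fees, and the minimum storage durations in the table, so treat this as a starting point rather than a rule.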


S3 Versioning

Versioning keeps all versions of an object in a bucket — protecting against accidental deletion and overwrites.

Enable versioning on a bucket → all uploads create a new version
Delete an object → creates a delete marker (object not actually removed)
Restore deleted object → delete the delete marker
Permanently delete → specify version ID in delete request

Key facts

  • Once versioning is enabled, it cannot be disabled — only suspended
  • Suspending versioning keeps existing versions; new objects get the version ID null
  • Cost: you are billed for every version stored
# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled

# List all versions
aws s3api list-object-versions --bucket my-bucket

# Restore a deleted object (remove the delete marker)
aws s3api delete-object \
  --bucket my-bucket \
  --key myfile.txt \
  --version-id <delete-marker-version-id>
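The delete-marker behaviour shown in those commands can be modelled in a few lines. This is a toy in-memory model for intuition, not the S3 API:

```python
import itertools

class VersionedBucket:
    """Toy model of S3 versioning: simple deletes add markers,
    versioned deletes permanently remove one version."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._versions = {}   # key -> list of (version_id, body or None)

    def put(self, key, body):
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, body))
        return vid

    def delete(self, key, version_id=None):
        stack = self._versions.setdefault(key, [])
        if version_id is None:
            # Simple delete: append a delete marker, keep every version
            vid = str(next(self._ids))
            stack.append((vid, None))
            return vid
        # Versioned delete: permanently remove that one version (or marker)
        self._versions[key] = [v for v in stack if v[0] != version_id]

    def get(self, key):
        stack = self._versions.get(key, [])
        if not stack or stack[-1][1] is None:   # no versions, or marker on top
            raise KeyError(key)                  # behaves like a 404
        return stack[-1][1]

b = VersionedBucket()
b.put("myfile.txt", "v1")
marker = b.delete("myfile.txt")            # simple delete -> delete marker
# b.get("myfile.txt") would now raise KeyError
b.delete("myfile.txt", version_id=marker)  # remove the marker
print(b.get("myfile.txt"))                 # v1 — object restored
```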

MFA Delete

MFA Delete adds an extra protection layer — it requires MFA authentication to:

  • Permanently delete a versioned object
  • Suspend versioning
# Enable MFA Delete (requires root account + MFA device)
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled,MFADelete=Enabled \
  --mfa "arn:aws:iam::123456789:mfa/root-account-mfa-device 123456"

Warning

Only the root account can enable/disable MFA Delete. It cannot be set via IAM users — even admins.


S3 Replication

Cross-Region Replication (CRR) and Same-Region Replication (SRR) automatically copy objects between buckets.

Requirements

  • Versioning must be enabled on both source and destination buckets
  • IAM role with permissions to read source and write destination
  • Replication only applies to new objects after replication is enabled (not existing objects)

Use cases

Type | Use case
CRR | Disaster recovery, latency reduction for global users, compliance (data residency)
SRR | Log aggregation, test/prod sync in the same region, audit copies
# Enable replication
aws s3api put-bucket-replication \
  --bucket source-bucket \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789:role/s3-replication-role",
    "Rules": [{
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Destination": {
        "Bucket": "arn:aws:s3:::destination-bucket",
        "StorageClass": "STANDARD_IA"
      }
    }]
  }'

Replication advanced features

Feature | Description
Replication Time Control (RTC) | 99.99% of objects replicated within 15 minutes, backed by an SLA
Cross-account replication | Requires a bucket policy on the destination that allows the source account
Bidirectional replication | Configure replication in both directions; watch for loops
Replica modification sync | Syncs metadata changes made to replicas back to the source
Delete marker replication | Optionally replicate delete markers (disabled by default)

S3 Lifecycle Rules

Lifecycle rules automate transitioning objects to cheaper storage classes and deleting old objects.

Transition actions

S3 Standard
  → after 30 days → S3 Standard-IA
    → after 90 days → S3 Glacier Instant
      → after 180 days → S3 Glacier Deep Archive
        → after 365 days → DELETE

Expiration actions

# Example lifecycle configuration
Rules:
  - ID: "move-logs-to-archive"
    Status: Enabled
    Filter:
      Prefix: "logs/"
    Transitions:
      - Days: 30
        StorageClass: STANDARD_IA
      - Days: 90
        StorageClass: GLACIER
    Expiration:
      Days: 365

  - ID: "clean-incomplete-multipart"
    Status: Enabled
    AbortIncompleteMultipartUpload:
      DaysAfterInitiation: 7

  - ID: "delete-old-versions"
    Status: Enabled
    NoncurrentVersionExpiration:
      NoncurrentDays: 30
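The `move-logs-to-archive` rule above reduces to a few age thresholds. A sketch of the evaluation (a hypothetical helper that mirrors that one example rule, not the lifecycle engine):

```python
def lifecycle_state(key: str, age_days: int) -> str:
    """Return where the example 'move-logs-to-archive' rule puts an object.

    Mirrors the rule above: it applies only to the logs/ prefix,
    transitions at 30 and 90 days, and expires objects at 365 days.
    The latest threshold the object's age has passed wins.
    """
    if not key.startswith("logs/"):
        return "STANDARD"          # rule does not apply outside the prefix
    if age_days >= 365:
        return "EXPIRED"
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"

print(lifecycle_state("logs/app.log", 45))    # STANDARD_IA
print(lifecycle_state("data/app.log", 400))   # STANDARD (prefix not matched)
```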

S3 Analytics

Use S3 Analytics to analyse access patterns before creating lifecycle rules — it suggests optimal transition points based on actual usage data. Takes 24–48 hours to populate.


S3 Event Notifications

Trigger downstream processing when objects are created, deleted, or restored.

Event types

Event | Trigger
s3:ObjectCreated:* | Any upload (Put, Post, Copy, CompleteMultipartUpload)
s3:ObjectRemoved:* | Delete or delete marker creation
s3:ObjectRestore:* | Glacier restore initiated/completed
s3:Replication:* | Replication failure events

Targets

  • SNS — fan-out to multiple subscribers
  • SQS — queue for async processing
  • Lambda — serverless processing triggered by uploads
{
  "LambdaFunctionConfigurations": [{
    "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789:function:process-upload",
    "Events": ["s3:ObjectCreated:*"],
    "Filter": {
      "Key": {
        "FilterRules": [{"Name": "suffix", "Value": ".jpg"}]
      }
    }
  }]
}
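S3 applies the filter rules before invoking the target: a key must match every rule (at most one prefix and one suffix) for the notification to fire. The matching itself is plain string prefix/suffix testing, as this sketch shows:

```python
def matches_filter(key: str, rules: list) -> bool:
    """Mimic S3 notification filter matching: every rule must match."""
    for rule in rules:
        name = rule["Name"].lower()
        if name == "prefix" and not key.startswith(rule["Value"]):
            return False
        if name == "suffix" and not key.endswith(rule["Value"]):
            return False
    return True

rules = [{"Name": "suffix", "Value": ".jpg"}]
print(matches_filter("uploads/cat.jpg", rules))   # True  -> Lambda invoked
print(matches_filter("uploads/cat.png", rules))   # False -> no notification
```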

EventBridge integration

Alternatively, route S3 events through Amazon EventBridge for more flexible routing, filtering, and targeting. EventBridge supports 20+ target types vs the 3 native S3 notification targets.


S3 Security

Bucket Policies

Bucket policies are JSON-based resource policies attached to a bucket — they control access for any AWS principal:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    },
    {
      "Sid": "DenyNonSSL",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
      "Condition": {
        "Bool": {"aws:SecureTransport": "false"}
      }
    }
  ]
}
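The key property of this policy is that the explicit Deny in `DenyNonSSL` overrides the Allow in `AllowPublicRead` whenever the request arrives over plain HTTP. A heavily simplified model of just that deny-overrides-allow behaviour (the real IAM evaluation also matches Principal, Action, and Resource):

```python
def evaluate(statements, context):
    """Minimal model of policy evaluation for the example above:
    an explicit Deny always overrides an Allow."""
    decision = "ImplicitDeny"
    for stmt in statements:
        cond = stmt.get("Condition", {}).get("Bool", {})
        if all(context.get(k) == v for k, v in cond.items()):
            if stmt["Effect"] == "Deny":
                return "Deny"          # explicit deny wins immediately
            decision = "Allow"
    return decision

# The two statements from the policy, with principal/action/resource elided
policy = [
    {"Effect": "Allow"},   # AllowPublicRead
    {"Effect": "Deny", "Condition": {"Bool": {"aws:SecureTransport": "false"}}},
]
print(evaluate(policy, {"aws:SecureTransport": "false"}))  # Deny  (plain HTTP)
print(evaluate(policy, {"aws:SecureTransport": "true"}))   # Allow (HTTPS)
```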

Block Public Access

The Block Public Access setting (at account or bucket level) is the guardrail that prevents any public access regardless of bucket policies:

# Block all public access at account level
aws s3control put-public-access-block \
  --account-id 123456789 \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,\
    BlockPublicPolicy=true,RestrictPublicBuckets=true

S3 Access Logs

Enable server access logging to capture all requests to a bucket:

aws s3api put-bucket-logging \
  --bucket my-bucket \
  --bucket-logging-status '{
    "LoggingEnabled": {
      "TargetBucket": "my-access-logs-bucket",
      "TargetPrefix": "my-bucket-logs/"
    }
  }'

Warning

Don't enable access logs on the same bucket you're logging to — it will create an infinite logging loop.

S3 Object Lock & Glacier Vault Lock

Feature | Description | Use case
Object Lock — Governance mode | Prevents deletion unless the user has special IAM permission | Controlled protection
Object Lock — Compliance mode | No one (including root) can delete before the retention period ends | Regulatory compliance (SEC, FINRA)
S3 Glacier Vault Lock | Locks a Glacier vault policy permanently | WORM (Write Once Read Many) archiving

IAM Access Analyzer for S3

Automatically reviews bucket policies and ACLs to identify buckets shared publicly or cross-account — surfaced in the S3 console as findings.


Amazon Athena

Athena is a serverless interactive query service that lets you run SQL directly on data stored in S3.

Data in S3 (CSV, JSON, Parquet, ORC, Avro)
  → Define schema in Glue Data Catalog
    → Query with standard SQL in Athena
      → Pay per TB scanned

Common CloudOps use cases

  • Query CloudTrail logs stored in S3
  • Query VPC Flow Logs for network analysis
  • Query CloudWatch Logs exports
  • Query S3 Inventory reports

Example: Query CloudTrail logs

-- Find who deleted an S3 bucket
SELECT eventtime, useridentity.username, sourceipaddress
FROM cloudtrail_logs
WHERE eventsource = 's3.amazonaws.com'
  AND eventname = 'DeleteBucket'
  AND eventtime > '2026-04-01'
ORDER BY eventtime DESC;

Performance optimisation

  • Partitioning: partition by year/month/day to reduce data scanned
  • Columnar formats: use Parquet or ORC for 10x less data scanned vs CSV
  • Compression: GZIP, Snappy for reducing storage and scan costs
  • Workgroup limits: cap maximum bytes scanned to prevent expensive queries
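Because Athena bills per data scanned, the optimisations above translate directly into cost. A rough estimator — the $5/TB figure is Athena's long-standing us-east-1 list price, so treat it as an assumption and check current pricing:

```python
def athena_query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimate Athena query cost in USD from bytes scanned.

    price_per_tb is an assumed list price; Athena also rounds each
    query up to a 10 MB minimum scan.
    """
    tb = 1024 ** 4
    min_bytes = 10 * 1024 ** 2          # 10 MB per-query minimum
    return max(bytes_scanned, min_bytes) / tb * price_per_tb

# Scanning 1 TiB of raw CSV vs ~100 GiB after Parquet conversion
print(round(athena_query_cost(1 * 1024 ** 4), 4))    # 5.0
print(round(athena_query_cost(100 * 1024 ** 3), 4))  # 0.4883
```

The second figure is why columnar formats matter: the same logical query over one tenth of the bytes costs one tenth as much.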

S3 Batch Operations

Run bulk operations on millions of S3 objects at once:

Operation | Description
Copy | Copy objects to another bucket or storage class
Replace ACL | Apply a new ACL to many objects
Restore from Glacier | Bulk restore
Invoke Lambda | Call a Lambda function for each object
Apply Object Lock | Bulk-apply retention settings
Replicate | Replicate existing objects (replication only covers new objects by default)
# Create a batch operations job (the copy operation is S3PutObjectCopy,
# and the manifest location must include the manifest object's ETag)
aws s3control create-job \
  --account-id 123456789 \
  --operation '{"S3PutObjectCopy": {"TargetResource": "arn:aws:s3:::dest-bucket"}}' \
  --manifest '{"Spec": {"Format": "S3InventoryReport_CSV_20161130"},
    "Location": {"ObjectArn": "arn:aws:s3:::my-bucket/inventory/manifest.json",
                 "ETag": "<manifest-etag>"}}' \
  --report '{"Bucket": "arn:aws:s3:::my-reports-bucket", "Enabled": true,
    "Format": "Report_CSV_20180820", "ReportScope": "AllTasks"}' \
  --priority 10 \
  --role-arn arn:aws:iam::123456789:role/BatchOpsRole
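Besides S3 Inventory reports, a job manifest can be a plain CSV with one `bucket,key` line per object. Generating one is trivial; this sketch uses placeholder bucket and key names:

```python
import csv
import io

def build_manifest(bucket: str, keys: list) -> str:
    """Build a CSV manifest (bucket,key per line) for S3 Batch Operations."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for key in keys:
        writer.writerow([bucket, key])
    return buf.getvalue()

manifest = build_manifest("my-bucket", ["logs/a.log", "logs/b.log"])
print(manifest)
# my-bucket,logs/a.log
# my-bucket,logs/b.log
```

Upload the resulting file to S3 and point the job's manifest `ObjectArn` at it, using format `S3BatchOperations_CSV_20180820` instead of the inventory format.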

S3 Inventory

S3 Inventory generates CSV/ORC reports of all objects in a bucket — useful for auditing, lifecycle management planning, and Batch Operations manifests.

Schedule: daily or weekly
Destination: another S3 bucket
Format: CSV, ORC, or Parquet
Optional fields: size, last modified, storage class, replication status, encryption status

Multi-part Upload

For objects larger than 100 MB, multi-part upload is recommended:

  • Upload parts in parallel → faster
  • Resume failed uploads (retry individual parts)
  • Required for objects > 5 GB
# Single AWS CLI command handles multi-part automatically
aws s3 cp large-file.zip s3://my-bucket/ \
  --storage-class INTELLIGENT_TIERING

# Check for incomplete multi-part uploads (these cost money!)
aws s3api list-multipart-uploads --bucket my-bucket

# Lifecycle rule to auto-abort incomplete uploads after 7 days
# (see lifecycle rules section above)
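The CLI picks part sizes automatically, but the limits behind that choice are worth knowing: at most 10,000 parts per upload, each part between 5 MiB and 5 GiB (the last part may be smaller), and a 5 TiB maximum object size. A sketch of the arithmetic for choosing a valid part size:

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
MAX_PARTS = 10_000        # S3 limit per multipart upload
MIN_PART = 5 * MIB        # minimum part size (except the last part)
MAX_PART = 5 * GIB        # maximum part size

def choose_part_size(object_size: int) -> tuple:
    """Return (part_size, part_count) satisfying S3 multipart limits."""
    # Smallest part size that keeps the upload under 10,000 parts
    part_size = max(MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object exceeds the 5 TiB S3 object size limit")
    return part_size, math.ceil(object_size / part_size)

size, count = choose_part_size(500 * GIB)   # a 500 GiB upload
print(size // MIB, count)                    # 51 10000
```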

Hands-on: S3 Cross-Region Replication for DR

Goal: Replicate prod bucket in us-east-1 to eu-west-1 for disaster recovery

1. Create source bucket: prod-data-us-east-1 (versioning: enabled)
2. Create destination bucket: prod-data-eu-west-1 (versioning: enabled)

3. Create IAM role: s3-replication-role
   Policy:
   - Allow: s3:GetObject, s3:GetObjectVersion, s3:GetObjectVersionAcl on source
   - Allow: s3:ReplicateObject, s3:ReplicateDelete on destination

4. Configure replication rule on source bucket:
   - Destination: prod-data-eu-west-1
   - IAM role: s3-replication-role
   - Enable Replication Time Control (RTC) for 15-min SLA

5. Enable Delete marker replication: Yes (for full sync)

6. For existing objects: use S3 Batch Replication (a Batch Operations job) to replicate them retroactively

7. Verify: upload a test file to source → confirm it appears in destination within 15 min

Hands-on: Create a Secure S3 Bucket

Goal: Create a private bucket with encryption, versioning, and public-access guardrails.

  1. Open S3 > Create bucket.
  2. Enter a globally unique bucket name such as cloudops-lab-<account-id>-<region>.
  3. Choose the Region closest to your lab resources.
  4. Keep Block all public access enabled.
  5. Enable Bucket Versioning.
  6. Set default encryption to SSE-S3 for a simple lab, or SSE-KMS if you need key-level audit and control.
  7. Add tags such as Environment = lab and Owner = cloudops.
  8. Create the bucket.
  9. Upload a test file.
  10. Open the object and confirm it is not public.
  11. Delete the object, then use Show versions to see the delete marker.
  12. Remove the delete marker to restore the object.

Hands-on: Add a Bucket Policy That Requires HTTPS

Goal: Deny any request that does not use TLS.

  1. Open the bucket.
  2. Go to Permissions > Bucket policy.
  3. Add a deny statement like this, replacing the bucket name:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::cloudops-lab-bucket",
        "arn:aws:s3:::cloudops-lab-bucket/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}
  4. Save the policy.
  5. Test normal HTTPS access with the AWS CLI.
  6. Keep Block Public Access enabled unless you are intentionally building a public static website.

Hands-on: S3 Lifecycle Rule for Logs

Goal: Move old logs to cheaper storage and delete incomplete multipart uploads.

  1. Create a prefix named logs/ by uploading a small file such as logs/test.log.
  2. Open Management > Create lifecycle rule.
  3. Name the rule archive-logs.
  4. Scope it to prefix logs/.
  5. Add transitions: after 30 days to Standard-IA, after 90 days to Glacier Flexible Retrieval.
  6. Add expiration after 365 days if this is acceptable for the lab.
  7. Delete incomplete multipart uploads after 7 days.
  8. Save the rule.
  9. Review the rule summary and confirm it applies only to logs/.

Hands-on: Query S3 Logs with Athena

Goal: Query CSV or log data in S3 without loading it into a database.

  1. Create or choose an S3 bucket for Athena query results.
  2. Open Athena and set the query result location.
  3. Create a database:
CREATE DATABASE cloudops_logs;
  4. Create an external table for a simple CSV log prefix:
CREATE EXTERNAL TABLE cloudops_logs.web_logs (
  request_time string,
  client_ip string,
  method string,
  path string,
  status int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3://your-log-bucket/logs/';
  5. Run a query:
SELECT status, count(*) AS requests
FROM cloudops_logs.web_logs
GROUP BY status
ORDER BY requests DESC;
  6. Add partitions or convert to Parquet for real workloads to reduce scan cost.

Common SOA-C03 Exam Questions

Q: An S3 bucket must not be publicly accessible even if a developer accidentally sets a public ACL. How?
A: Enable Block Public Access at the account level — it overrides any bucket policy or ACL that grants public access.

Q: Objects deleted from a versioned bucket are not gone — how do you permanently delete them?
A: Deleting a versioned object creates a delete marker. To permanently delete, specify the version ID in the delete request, removing that specific version. To fully remove all versions, list all versions and delete each by version ID.

Q: How do you query VPC Flow Logs stored in S3 without loading them into a database?
A: Use Amazon Athena — create an external table pointing to the S3 prefix where the flow logs are stored, then run SQL queries. Partition the table by date to reduce scanned data and cost.

Q: A lifecycle rule isn't transitioning objects on time. What do you check?
A: Check the minimum storage durations — Standard-IA has a 30-day minimum and Glacier 90 days. Also note that objects smaller than 128 KB are not transitioned to IA classes (not cost-effective).


What to Learn Next

  1. AWS Advanced Storage (FSx & Storage Gateway) — hybrid and high-performance storage
  2. AWS Security & Compliance — S3 encryption with KMS, Macie for data discovery
  3. AWS CloudWatch Monitoring — monitor S3 with CloudWatch metrics and Athena log analysis
