S3 stores your data, but who can see or modify it? Controlling access to your buckets is one of the most critical aspects of cloud security. Data breaches due to misconfigured buckets have made headlines many times. In this subchapter, you’ll learn how to control who accesses what, and the mistakes you should never make.

The starting point: everything is locked

Good news to start: by default, a new bucket is completely private. Only the account that created it can access it, and within that account only identities that have been given S3 permissions. For anyone else (another account, a service, or the public) to access it, you have to explicitly grant it. AWS applies the principle of “deny by default” here.

Problems almost always arise when someone accidentally opens access too widely. Let’s look at the tools for opening it (carefully).

Ways to control access

There are several ways to manage permissions in S3. The main ones:

  1. Bucket policies (the main tool)

A bucket policy is a document (in JSON format) attached to a bucket that defines detailed rules about who can do what with the objects.

Analogy: It’s like the access rules of a building, posted at the entrance: “delivery people can enter the warehouse; the public only to the reception; no one enters the offices without a card.”

A bucket policy specifies, in each rule:

  • Who (which account, user, or service, or “everyone”).
  • What action (read, write, delete…).
  • On which objects (the whole bucket or a part).
  • Allow or deny.

Conceptual example of a policy:

"Allow ANYONE (public) to READ the objects
 that are inside the /public/ folder of this bucket."

This would be useful, for example, for a static website (subchapter 5.5) where you want everyone to see certain files but not others.

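Here’s a real example of a bucket policy that allows public reads of every object under the public/ prefix (S3 has no true folders; a “folder” is just a shared prefix in the object key):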

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadWeb",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket-web/public/*"
    }
  ]
}

Don’t worry about understanding every line now; we’ll see it in detail with IAM in Chapter 7. The important parts: Effect is allow/deny, Principal is who, Action is what they can do, and Resource is what it applies to.
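
To make those four fields concrete, here is a minimal Python sketch of how a request could be checked against the policy above. It is a toy model under stated assumptions, not how AWS actually evaluates policies: the function names statement_matches and is_allowed and the "anonymous" caller label are hypothetical, and the matcher only handles plain string values with wildcards.

```python
from fnmatch import fnmatchcase

# The same policy as above, expressed as a Python dict.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadWeb",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket-web/public/*",
        }
    ],
}


def statement_matches(statement, principal, action, resource):
    """Toy matcher: does this statement apply to the request?

    Assumption: Principal, Action and Resource are single strings.
    Real policies also allow lists, conditions, NotAction and more.
    """
    who = statement["Principal"]
    principal_ok = who == "*" or who == principal
    action_ok = fnmatchcase(action, statement["Action"])
    resource_ok = fnmatchcase(resource, statement["Resource"])
    return principal_ok and action_ok and resource_ok


def is_allowed(policy, principal, action, resource):
    """True if at least one Allow statement matches and no Deny matches."""
    effects = [
        s["Effect"]
        for s in policy["Statement"]
        if statement_matches(s, principal, action, resource)
    ]
    return "Allow" in effects and "Deny" not in effects


public_file = "arn:aws:s3:::my-bucket-web/public/index.html"
private_file = "arn:aws:s3:::my-bucket-web/private/report.csv"
print(is_allowed(policy, "anonymous", "s3:GetObject", public_file))
print(is_allowed(policy, "anonymous", "s3:GetObject", private_file))
```

The first call matches the statement (anyone, s3:GetObject, a key under public/), while the second does not, because the resource falls outside the public/ prefix and nothing else grants access.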

  2. ACLs (the old method)

ACLs (Access Control Lists) are an older and simpler mechanism for giving basic permissions to objects or buckets.

AWS Recommendation: Avoid ACLs. Nowadays, AWS recommends using bucket policies and IAM instead, because they are clearer and more powerful. In fact, since April 2023 AWS creates new buckets with ACLs disabled by default. You should know they exist (you’ll see them in old documentation and on older buckets), but don’t use them if you can avoid it.
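
For an older bucket that still has ACLs enabled, the sketch below shows one way they could be switched off from Python. It assumes the boto3 library: put_bucket_ownership_controls is a boto3 S3 client method, while the helper names ownership_controls_request and disable_acls are hypothetical. The client is passed in as a parameter, so the block itself runs without boto3 or AWS credentials.

```python
def ownership_controls_request(bucket_name):
    """Build the request arguments that disable ACLs on a bucket.

    The 'BucketOwnerEnforced' Object Ownership setting turns ACLs off,
    leaving policies as the only way to grant access.
    """
    return {
        "Bucket": bucket_name,
        "OwnershipControls": {
            "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
        },
    }


def disable_acls(s3_client, bucket_name):
    """Send the request using a boto3 S3 client supplied by the caller."""
    s3_client.put_bucket_ownership_controls(
        **ownership_controls_request(bucket_name)
    )


print(ownership_controls_request("my-bucket-web"))
```

With boto3 installed and credentials configured, calling disable_acls(boto3.client("s3"), "my-bucket-web") would send the request to AWS.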

  3. IAM Policies

You can also control access from the user side with IAM policies (Chapter 7): “this user can read from these buckets.” This complements bucket policies. The difference: bucket policies are attached to the resource (the bucket), while IAM policies are attached to the identity (the user, group, or role). Within a single account, a request succeeds if either kind of policy allows it and neither explicitly denies it.
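
How the two kinds of policy combine can be sketched in a few lines of Python. This is a simplified model for a request made within a single account; it assumes the matching statements have already been found, and it ignores other layers such as organization-level controls and permission boundaries. The function name evaluate is hypothetical.

```python
def evaluate(identity_effects, bucket_effects):
    """Combine the effects of matching statements for one request.

    Each argument is a list of "Allow"/"Deny" strings: the effects of the
    statements that matched the request in the IAM (identity) policy and
    in the bucket (resource) policy respectively.
    """
    all_effects = identity_effects + bucket_effects
    if "Deny" in all_effects:
        return "Deny"   # an explicit deny always wins
    if "Allow" in all_effects:
        return "Allow"  # an allow from either side is enough
    return "Deny"       # nothing matched: denied by default


print(evaluate(["Allow"], []))        # IAM policy allows, bucket policy silent
print(evaluate(["Allow"], ["Deny"]))  # bucket policy explicitly denies
print(evaluate([], []))               # no policy mentions the request
```

Across accounts the rule is stricter: both the identity policy in the caller’s account and the bucket policy must allow the request.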

The safety net: Block Public Access

AWS learned from the many incidents of exposed buckets and created an extra protection called Block Public Access.

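It’s a group of four security settings, applicable to a single bucket or to the whole account, that, when enabled (the default on new buckets), prevents the bucket from being made public: S3 rejects new public policies and ACLs and blocks public access granted by existing ones. So even if someone mistakenly writes a policy that allows public access, it does not take effect. It’s like a double lock.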

⚠️ Golden rule: Leave Block Public Access enabled unless you have a very specific and conscious reason to disable it (such as hosting a public website). And even then, consider putting CloudFront in front instead of exposing the bucket directly. Disabling it “just to test” and forgetting about it is exactly how breaches happen.
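
As a sketch of what “leave it enabled” means in practice, the Python below turns on all four Block Public Access settings and checks whether a given configuration is fully locked down. It assumes the boto3 library: put_public_access_block and the four setting names belong to the boto3 S3 API, while BLOCK_ALL_PUBLIC_ACCESS, enable_block_public_access and is_fully_blocked are hypothetical names. The client is passed in, so the block runs without boto3 or credentials.

```python
# The four Block Public Access settings, all switched on.
BLOCK_ALL_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore public ACLs that already exist
    "BlockPublicPolicy": True,      # reject new public bucket policies
    "RestrictPublicBuckets": True,  # restrict buckets that have a public policy
}


def enable_block_public_access(s3_client, bucket_name):
    """Enable all four settings using a boto3 S3 client from the caller."""
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration=BLOCK_ALL_PUBLIC_ACCESS,
    )


def is_fully_blocked(config):
    """True only if all four settings are present and enabled."""
    return all(config.get(key) is True for key in BLOCK_ALL_PUBLIC_ACCESS)


# A configuration where someone disabled one setting "just to test".
weakened = {**BLOCK_ALL_PUBLIC_ACCESS, "BlockPublicPolicy": False}
print(is_fully_blocked(BLOCK_ALL_PUBLIC_ACCESS))
print(is_fully_blocked(weakened))
```

With a real client, the current settings could be read with get_public_access_block(Bucket=...) and its "PublicAccessBlockConfiguration" entry passed to is_fully_blocked.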

Mistakes you should NEVER make

These are the errors that have caused famous data breaches:

  1. Accidentally making a bucket public. If you put sensitive data (customer data, backups) in a public bucket, anyone on the internet can download it. This has exposed millions of records at real companies.
  2. Disabling Block Public Access unnecessarily.
  3. Giving overly broad permissions (“anyone can write/delete”). An attacker could delete or hijack your data.
  4. Using legacy ACLs instead of clear policies.

Real case (repeated pattern): Numerous companies have suffered breaches because a developer left a bucket with customer data configured as public “temporarily” and forgot about it. Security researchers (and attackers) scan the internet looking for these open buckets. The lesson: treat every bucket as private by default and open only what is strictly necessary.
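
The mistakes above can be partly caught by reviewing policies automatically. The following Python sketch flags statements that grant access to everyone. It is an illustration, not a complete audit tool: the function names are hypothetical, it ignores Condition blocks (which can narrow an “everyone” grant), and it only recognizes the write actions listed in WRITE_ACTIONS.

```python
# Assumption: a short, non-exhaustive list of actions treated as "write".
WRITE_ACTIONS = ("s3:PutObject", "s3:DeleteObject", "s3:*", "*")


def _as_list(value):
    """Policies allow a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_public_principal(principal):
    """Both "*" and {"AWS": "*"} mean 'anyone'."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False


def find_risky_statements(policy):
    """Return (Sid, label) pairs for Allow statements open to everyone."""
    findings = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if not is_public_principal(statement.get("Principal")):
            continue
        actions = _as_list(statement.get("Action", []))
        writable = any(action in WRITE_ACTIONS for action in actions)
        label = "PUBLIC WRITE/DELETE" if writable else "public access"
        findings.append((statement.get("Sid", "<no Sid>"), label))
    return findings


too_open = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Oops",
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": ["s3:GetObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-bucket-web/*",
        }
    ],
}
print(find_risky_statements(too_open))
```

A finding is not automatically a bug (a static website needs public reads), but every one of them should correspond to a deliberate decision.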

Best practices summarized

  • Private by default: open access only when strictly necessary.
  • Least privilege: give only the permissions that are needed and nothing more (a key concept we’ll see in Chapter 7).
  • Use bucket policies and IAM, not ACLs.
  • Keep Block Public Access enabled unless there’s a justified exception.
  • Encrypt your data (S3 already encrypts new objects at rest by default; we’ll see this with KMS in Chapter 23).
  • For public websites, serve through CloudFront instead of exposing the bucket directly.

What you should remember

  • Buckets are private by default; you grant access explicitly.
  • Bucket policies (JSON) are the main and recommended way to control access: they define who, what action, on what, and whether it’s allowed or denied.
  • ACLs are the old method: avoid them; AWS disables them by default on new buckets.
  • Block Public Access is a safety net that prevents exposing the bucket by mistake: leave it enabled.
  • The most serious and frequent mistake is making a bucket with sensitive data public. Private by default, least privilege.

In the last S3 subchapter, we’ll see a very practical and popular use: hosting a static website directly on S3.

Cloud, AWS & Terraform — From Zero to Expert

© Copyright 2024. All rights reserved