AWS S3 Lifecycle Rules & Bucket Policy Guide
Overview
Amazon S3 provides:
- Lifecycle Rules → Automate object management (transition, expiration)
- Bucket Policies → Control access permissions to your S3 resources
S3 Lifecycle Rules
Lifecycle rules help you manage storage costs and data retention automatically.
Common Use Cases
- Move objects to cheaper storage (e.g., Glacier)
- Delete old files automatically
- Clean up incomplete uploads
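The last use case is easy to overlook, so here is a minimal Python sketch of a rule that aborts stale multipart uploads. `AbortIncompleteMultipartUpload` and `DaysAfterInitiation` are the field names used by the S3 lifecycle configuration. The rule ID, the 7-day default, and the helper name `abort_incomplete_uploads_rule` are assumptions made for illustration.

```python
import json


def abort_incomplete_uploads_rule(days=7):
    """Build a lifecycle rule that aborts multipart uploads left unfinished.

    The rule ID and the default of 7 days are illustrative choices.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "ID": "AbortStaleMultipartUploads",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: applies to the whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }


config = {"Rules": [abort_incomplete_uploads_rule()]}
print(json.dumps(config, indent=2))
```

The printed document has the same shape as any other lifecycle configuration, so a rule like this can sit alongside transition and expiration rules in the same `Rules` list.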
Example Lifecycle Rule (JSON)
{
  "Rules": [
    {
      "ID": "MoveToGlacierAndExpire",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 60,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
Explanation
| Field | Meaning |
|---|---|
| ID | Rule name |
| Status | Enabled / Disabled |
| Filter/Prefix | Applies to objects with this prefix |
| Transitions | Move objects to cheaper storage |
| Expiration | Delete objects after X days |
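To make the timeline of the example rule concrete, the following Python sketch works out which storage class an object under `logs/` should be in at a given age. It is a simplified model, not an AWS API: the function name `storage_state` is hypothetical, objects are assumed to start in STANDARD, and actions are assumed to take effect exactly on the configured day, whereas S3 applies lifecycle actions asynchronously and there can be a delay.

```python
def storage_state(rule, age_days):
    """Return the storage class an object should be in at a given age,
    or "EXPIRED" once the expiration threshold is reached.

    Simplified model: the object starts in STANDARD and every action
    takes effect on exactly the configured day.
    """
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return "EXPIRED"
    state = "STANDARD"
    # Walk the transitions from earliest to latest and keep the last one reached.
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            state = transition["StorageClass"]
    return state


# Same rule as the JSON example above.
rule = {
    "ID": "MoveToGlacierAndExpire",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 60, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},
}

for age in (10, 45, 90, 400):
    print(age, storage_state(rule, age))
```

Under these assumptions a 10-day-old log is still in STANDARD, a 45-day-old log is in STANDARD_IA, a 90-day-old log is in GLACIER, and a 400-day-old log has been deleted.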
Storage Classes
- STANDARD
- STANDARD_IA (Infrequent Access)
- GLACIER_IR (Glacier Instant Retrieval)
- GLACIER (Glacier Flexible Retrieval)
- DEEP_ARCHIVE (Glacier Deep Archive)
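Lifecycle transitions only move objects toward colder storage classes, never back up. The Python sketch below checks that property for a list of transitions. It covers only the classes listed above (`GLACIER_IR` is the API value for Glacier Instant Retrieval, `GLACIER` for Glacier Flexible Retrieval); the names `WATERFALL` and `transitions_are_ordered` are hypothetical, and S3 itself performs more validation than this.

```python
# Classes from the list above, ordered from "hottest" to "coldest".
# Other classes (for example INTELLIGENT_TIERING and ONEZONE_IA) are
# left out of this sketch.
WATERFALL = ["STANDARD", "STANDARD_IA", "GLACIER_IR", "GLACIER", "DEEP_ARCHIVE"]


def transitions_are_ordered(transitions):
    """Check that each later transition targets a colder storage class."""
    ordered = sorted(transitions, key=lambda t: t["Days"])
    ranks = [WATERFALL.index(t["StorageClass"]) for t in ordered]
    return all(earlier < later for earlier, later in zip(ranks, ranks[1:]))


good = [
    {"Days": 30, "StorageClass": "STANDARD_IA"},
    {"Days": 60, "StorageClass": "GLACIER"},
]
bad = [
    {"Days": 30, "StorageClass": "GLACIER"},
    {"Days": 60, "StorageClass": "STANDARD_IA"},
]

print(transitions_are_ordered(good))
print(transitions_are_ordered(bad))
```

A check like this can run before a configuration is submitted, so an out-of-order rule is caught locally rather than rejected by the service.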
Example Bucket Policy (Public Read Access)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadAccess",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket-name/*"
    }
  ]
}
Explanation
| Field | Meaning |
|---|---|
| Effect | Allow / Deny |
| Principal | Who the statement applies to |
| Action | Operation(s) being allowed or denied |
| Resource | Bucket or object ARN |
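Writing the bucket name into the policy by hand is error-prone, so a small builder can help. The Python sketch below generates the public-read policy for any bucket name and, optionally, attaches it with boto3's `put_bucket_policy`. The helper names `public_read_policy` and `apply_policy` are assumptions for illustration, and `apply_policy` needs boto3 installed plus valid AWS credentials. A public policy only takes effect if the bucket's Block Public Access settings permit it.

```python
import json


def public_read_policy(bucket_name):
    """Return a bucket policy (as a JSON string) that lets anyone read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadAccess",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


def apply_policy(bucket_name, policy_json):
    """Attach the policy with boto3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("s3").put_bucket_policy(Bucket=bucket_name, Policy=policy_json)


print(public_read_policy("my-bucket-name"))
```

Keeping the builder separate from the call that touches AWS makes it possible to review or test the generated JSON before anything is applied.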
Example: Restrict Access by IP
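{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "IPRestriction",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-bucket-name",
        "arn:aws:s3:::my-bucket-name/*"
      ],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}
Note: aws:SourceIp is matched against the caller's public IP address, so the CIDR above is a placeholder public range. A private range such as 192.168.1.0/24 would never match and the statement would deny every request. Replace it with your organization's public egress range.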
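The double negative in this policy (a Deny that applies when the address is not in the range) is a common source of confusion. The Python sketch below mimics that logic locally with the standard `ipaddress` module. It is not how AWS evaluates policies internally; the function name `is_denied`, the CIDR, and the sample addresses are placeholders chosen for illustration.

```python
import ipaddress


def is_denied(source_ip, allowed_cidr):
    """Mimic a Deny statement with a NotIpAddress condition.

    The Deny applies when the caller's IP is NOT inside allowed_cidr.
    """
    network = ipaddress.ip_network(allowed_cidr)
    return ipaddress.ip_address(source_ip) not in network


ALLOWED = "203.0.113.0/24"  # placeholder documentation range
for ip in ("203.0.113.25", "198.51.100.7"):
    verdict = "denied" if is_denied(ip, ALLOWED) else "not denied by this statement"
    print(ip, verdict)
```

"Not denied" is not the same as "allowed": a caller inside the range still needs an Allow from an IAM policy or another bucket policy statement.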
Best Practices
- Enable lifecycle rules for cost optimization
- Avoid public access unless necessary
- Use IAM roles instead of Principal: "*"
- Enable versioning with lifecycle rules
- Test policies using the IAM Policy Simulator
Typical Setup
- Lifecycle → Auto-delete logs after 90 days
- Policy → Restrict access to internal users only
AWS S3 Advanced Bucket Types
Directory Buckets, Table Buckets, and Vector Buckets
Amazon S3 now includes specialized bucket types for different workloads:
- Directory Buckets → High-performance file-like storage
- Table Buckets → Structured data for analytics
- Vector Buckets → AI/ML embedding storage
Directory Buckets
Directory buckets are designed for low-latency, high-performance workloads and provide a hierarchical structure similar to a file system.
- Built for S3 Express One Zone
- Faster than standard S3
- Supports directory-style access
Features
- Consistent single-digit millisecond latency
- Hierarchical namespace (real folder-like behavior)
- High throughput
- Optimized for compute-heavy tasks
Example Structure
project-data/
│
├── user1/
│   ├── file1.txt
│   └── file2.txt
│
└── user2/
    └── file3.txt
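To show what directory-style access over this structure looks like, the Python sketch below lists the immediate children of a prefix using `/` as the delimiter. It is a pure in-memory model with no AWS calls, similar in spirit to listing objects with a prefix and delimiter. The function name `list_directory` is hypothetical, and `project-data/` is assumed to be a top-level directory rather than the bucket name.

```python
def list_directory(keys, prefix=""):
    """Return (subdirectories, files) directly under prefix, using "/" as delimiter."""
    subdirs, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself
        head, sep, _ = rest.partition("/")
        if sep:
            # Something deeper exists, so "head" is a subdirectory.
            subdirs.add(prefix + head + "/")
        else:
            files.append(key)
    return sorted(subdirs), sorted(files)


# Object keys matching the example structure above.
KEYS = [
    "project-data/user1/file1.txt",
    "project-data/user1/file2.txt",
    "project-data/user2/file3.txt",
]

print(list_directory(KEYS, "project-data/"))
print(list_directory(KEYS, "project-data/user1/"))
```

The first call returns the two user directories and no files; the second returns the two files under `user1/` and no subdirectories.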
Use Cases
- Machine Learning pipelines
- Real-time analytics
- High-performance computing (HPC)
- AI training workloads
Table Buckets
Table buckets are used for storing structured data in tabular format, mainly for analytics.
- Supports Apache Iceberg tables
- Works with query engines
Key Features
- Schema-based storage
- Optimized for queries
- Columnar data formats
- Integration with analytics tools
Example Table
| user_id | name | age |
|---|---|---|
| 101 | Ram | 25 |
| 102 | Sita | 30 |
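Stored as (illustrative path; in practice a table is addressed by its namespace and table name through an Iceberg-compatible catalog rather than by object path):
s3://table-bucket/users/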
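The "columnar data formats" feature is easier to see with the example table. The Python sketch below pivots the two rows into one list per column and then answers a query that touches only the `age` column. It is a conceptual illustration only: the `User` class and `to_columns` helper are hypothetical, and real table buckets hold Apache Iceberg tables in their own file formats rather than Python lists.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: int
    name: str
    age: int


def to_columns(rows):
    """Pivot row records into one list per column (a columnar layout)."""
    columns = {"user_id": [], "name": [], "age": []}
    for row in rows:
        columns["user_id"].append(row.user_id)
        columns["name"].append(row.name)
        columns["age"].append(row.age)
    return columns


# The rows from the example table above.
rows = [User(101, "Ram", 25), User(102, "Sita", 30)]
columns = to_columns(rows)

# A query that needs only one column reads just that list.
average_age = sum(columns["age"]) / len(columns["age"])
print(columns)
print(average_age)
```

Because each column is stored together, an aggregate such as the average age never has to read the `name` values, which is the main reason columnar layouts suit analytics queries.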
Use Cases
- Data lakes
- Business intelligence (BI)
- Reporting systems
- ETL pipelines
Vector Buckets
Vector buckets are designed to store and query vector embeddings used in AI/ML applications.
Key Features
- Stores high-dimensional vectors
- Supports similarity search (k-NN)
- Optimized for semantic queries
- Works with AI/ML systems
Example Vector
[0.12, 0.98, 0.45, 0.33]
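Similarity search over vectors like this one can be sketched in a few lines of Python. The example below ranks stored vectors by cosine similarity to a query and returns the top k, which is a brute-force form of k-NN. It is not the API of any vector bucket service: the helper names, the keys `doc-a` to `doc-c`, and every vector other than the example above are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, index, k=2):
    """Return the k keys whose vectors are most similar to query."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [key for key, _ in scored[:k]]


INDEX = {
    "doc-a": [0.12, 0.98, 0.45, 0.33],  # the example vector above
    "doc-b": [0.10, 0.95, 0.50, 0.30],  # hypothetical, close to doc-a
    "doc-c": [0.90, 0.05, 0.10, 0.80],  # hypothetical, points elsewhere
}

print(nearest([0.11, 0.97, 0.46, 0.32], INDEX, k=2))
```

Brute force compares the query against every stored vector, which is fine for a handful of items; purpose-built vector stores exist because that approach stops scaling once the index grows large.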
Use Cases
- Semantic search
- Chatbots and AI assistants
- Recommendation systems
- Image and video similarity
Comparison Table
| Feature | Directory Bucket | Table Bucket | Vector Bucket |
|---|---|---|---|
| Data Type | Files | Structured tables | Vector embeddings |
| Structure | Hierarchical | Tabular | Numeric vectors |
| Performance | Ultra-low latency | Query optimized | Similarity optimized |
| Use Case | HPC, ML workloads | Analytics, BI | AI/ML search |
| Example | File system | Data lake table | Embedding store |
Summary
- Directory Buckets → Speed + folder structure
- Table Buckets → Structured analytics data
- Vector Buckets → AI similarity search
