Storage Services
Amazon S3
What it is:
Amazon S3 (Simple Storage Service) is AWS’s object storage service designed to store and retrieve any amount of data from anywhere on the web.
Why it matters:
- It's scalable, cost-effective, and designed for 99.999999999% (11 nines) durability
- Frequently used to store training datasets, model outputs, logs, and documents
- Integrates seamlessly with services like SageMaker, Bedrock, and Lambda
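To make the durability figure concrete, here is a rough back-of-the-envelope sketch. It assumes the 99.999999999% figure can be read as an average annual per-object survival probability; the function names are illustrative and not part of any AWS API.

```python
from fractions import Fraction

# Assumption: 99.999999999% durability is treated as an average annual
# per-object survival probability, so the loss rate is 1 in 10^11 per year.
ANNUAL_LOSS_RATE = Fraction(1, 10**11)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year for a given object count."""
    return object_count * ANNUAL_LOSS_RATE


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    stored = 10_000_000  # hypothetical number of stored objects
    print(float(expected_annual_losses(stored)))
    print(years_per_expected_loss(stored))
```

Under these assumptions, storing ten million objects corresponds to an expected loss of about one object every 10,000 years.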
Typical Use Cases:
- Storing datasets for AI/ML model training
- Hosting website files or media assets
- Saving logs and predictions from AI pipelines
- Backup and recovery of application data
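The pipeline use cases above could be sketched roughly as follows. This is a minimal example, not a prescribed layout: the date-partitioned key scheme, the helper names, and the bucket name are all hypothetical. The client is passed in so the helpers stay testable; with boto3 installed, a real client's `put_object` call takes the same `Bucket`, `Key`, and `Body` arguments.

```python
from datetime import date


def build_key(category: str, name: str, day: date) -> str:
    """Build a date-partitioned object key such as 'logs/2024/05/17/run.json'.

    The category/date/name layout is an illustrative convention only.
    """
    return f"{category}/{day:%Y/%m/%d}/{name}"


def save_object(s3_client, bucket: str, key: str, body: bytes) -> str:
    """Store bytes under bucket/key and return the resulting s3:// URI."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return f"s3://{bucket}/{key}"


# With boto3 installed and AWS credentials configured, usage might look like:
#   import boto3
#   s3_client = boto3.client("s3")
#   key = build_key("predictions", "batch-001.json", date.today())
#   save_object(s3_client, "example-ml-bucket", key, b'{"score": 0.9}')
```

Grouping keys by category and date keeps datasets, logs, and predictions easy to list by prefix, which also makes prefix-based rules simpler to write later.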
Amazon S3 Glacier
What it is:
Amazon S3 Glacier is a family of low-cost S3 storage classes for data archiving and long-term backup. It is designed for data that is rarely accessed but must be retained securely for years.
Why it matters:
- Ideal for archiving training datasets or compliance logs
- Offers several retrieval options, with access times ranging from minutes to hours depending on the option chosen
- Cost-effective for storing AI/ML data not actively used
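As an illustration of choosing a retrieval speed, the sketch below builds the request body for restoring an archived object. The `Expedited`, `Standard`, and `Bulk` tier names come from the S3 RestoreObject API; the helper name, bucket, and key are hypothetical, and which tiers are actually available depends on the storage class the object is in.

```python
# Retrieval tiers accepted by the S3 RestoreObject API for archived objects.
VALID_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(days: int, tier: str = "Standard") -> dict:
    """Build a RestoreRequest body for a temporary restore of an archived object.

    days: how long the restored copy should remain available.
    tier: retrieval speed; faster tiers generally cost more.
    """
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier!r}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


# With boto3 installed and credentials configured, usage might look like:
#   import boto3
#   boto3.client("s3").restore_object(
#       Bucket="example-archive-bucket",       # hypothetical bucket
#       Key="datasets/2022/train.tar",         # hypothetical key
#       RestoreRequest=build_restore_request(days=7, tier="Bulk"),
#   )
```

A slower tier such as `Bulk` tends to suit large, non-urgent restores, for example pulling back an old training dataset ahead of a planned retraining run.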
Typical Use Cases:
- Archiving large ML datasets not currently in use
- Storing compliance and audit data for AI projects
- Backing up AI-generated reports, logs, and checkpoints
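Archiving of this kind is often automated with an S3 lifecycle rule rather than done by hand. The following is a minimal sketch of such a rule, assuming objects under a given prefix should move to the `GLACIER` storage class after a fixed number of days; the helper name, rule ID, prefix, and 90-day threshold are illustrative choices, not recommendations.

```python
def build_archive_lifecycle(prefix: str, days: int,
                            rule_id: str = "archive-to-glacier") -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to the GLACIER storage class `days` days after creation."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# With boto3 installed and credentials configured, usage might look like:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-ml-bucket",            # hypothetical bucket
#       LifecycleConfiguration=build_archive_lifecycle("checkpoints/", 90),
#   )
```

Scoping the rule to a prefix keeps actively used data in standard storage while older checkpoints, logs, or datasets are archived automatically.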