In this quick read, let's look at AWS S3 bucket lifecycle (retention) policies and their benefits.
Lifecycle policy
A lifecycle policy automatically moves objects in your bucket from one storage class to another, and can also delete them after a specified period.
S3 storage classes
- S3 Standard: S3 Standard offers high durability, availability, and performance object storage for frequently accessed data.
- S3 Intelligent-Tiering: The S3 Intelligent-Tiering storage class is designed to optimise costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.
- S3 Standard-IA: S3 Standard-IA is for data that is accessed less frequently, but requires rapid access when needed.
- S3 One Zone-IA: S3 One Zone-IA is also for data that is accessed less frequently, but requires rapid access when needed. Unlike S3 Standard-IA, it stores data in a single Availability Zone and costs 20% less than S3 Standard-IA.
- S3 Glacier: S3 Glacier is a secure, durable, and low-cost storage class for data archiving.
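When you configure lifecycle rules through the API rather than the console, each storage class is referred to by a short identifier. The sketch below maps the display names used above to the identifiers that, to the best of my knowledge, the S3 API expects in a `StorageClass` field. The dictionary name and the helper function are illustrative, not part of any AWS library.

```python
# Display name (as used in this article) -> S3 API storage class identifier.
# The dictionary and helper names are hypothetical, chosen for this example.
STORAGE_CLASS_API_NAMES = {
    "S3 Standard": "STANDARD",
    "S3 Intelligent-Tiering": "INTELLIGENT_TIERING",
    "S3 Standard-IA": "STANDARD_IA",
    "S3 One Zone-IA": "ONEZONE_IA",
    "S3 Glacier": "GLACIER",
}


def api_name(display_name: str) -> str:
    """Return the API identifier for a storage class display name."""
    try:
        return STORAGE_CLASS_API_NAMES[display_name]
    except KeyError:
        raise ValueError(f"Unknown storage class: {display_name!r}") from None
```

For example, `api_name("S3 One Zone-IA")` gives the identifier you would use as the target of a transition to One Zone-IA.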
Benefits of retention policies
Cost Optimisation: Lifecycle rules help you manage your storage costs by controlling the lifecycle of your objects. You can create a lifecycle rule to automatically transition your objects to the Standard-IA storage class, archive them to the Glacier storage class, and remove them after a specified time period.
Logs Lifecycle Automation: You upload logs to an S3 bucket, and you need those logs for a specific period of time, e.g., one month or three months. After that, you may want to archive or delete them.
Quick Access: To begin with, you place your files in a frequently accessed storage class. After some time, you realise that the files will not be accessed frequently, and you want to archive them for a specific period of time. You might also decide to delete them later.
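To make the "hot, then archived, then deleted" pattern concrete, here is a minimal sketch that reports which storage class an object would roughly be in at a given age. The thresholds (30, 90, and 365 days) and all names are hypothetical values picked for illustration, and S3 applies lifecycle actions asynchronously, so real timing is approximate.

```python
from typing import Optional

# Hypothetical schedule: (days since creation, storage class from that day on).
SCHEDULE = [
    (0, "STANDARD"),
    (30, "STANDARD_IA"),
    (90, "GLACIER"),
]
EXPIRE_AFTER_DAYS = 365  # hypothetical: objects are deleted at this age


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class at the given age, or None once expired."""
    if age_days >= EXPIRE_AFTER_DAYS:
        return None
    current = None
    for start_day, storage_class in SCHEDULE:
        if age_days >= start_day:
            current = storage_class  # the latest threshold reached wins
    return current
```

Calling `storage_class_at` with a few ages (say 10, 45, and 200 days) shows the object stepping through progressively cheaper classes until it expires.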
Steps to set up the lifecycle / retention of objects
Let's proceed with the assumption that you are already sending your application logs to S3. We will focus on configuring the lifecycle of the logs. In this case, let's move the logs from the Standard to the One Zone-IA storage class.
- Log on to the AWS console -> click on Services -> select S3
- Click on the bucket you have created, then click on the Management section, select the Lifecycle option, and click on Add lifecycle rule.
- After clicking ‘Add lifecycle rule’, a window will appear. Add a rule name as per your requirement and choose the rule scope.
- Prefixes and Tags: If your bucket has folders (prefixes) or tagged objects, you can enter their names in the prefix and tag fields to limit the rule to those objects. This lets you give different folders different lifecycle rules. If you don’t have any subfolders, apply the rule to the whole bucket (in my case, I have selected the whole bucket).
- Now click on Next
- In the transitioning step, we will add our lifecycle rules.
- Select which object versions the rule applies to. If you enabled versioning when creating the bucket and you want to transition older versions of your logs, select previous versions; if not, select current version.
- Now add the transitions and enter your rules, e.g., after how many days you want to move your objects to a different storage class.
- Here, I am moving the files from Standard to One Zone-IA storage 180 days after creation, and to Glacier 365 days after creation.
- In the next step, set the expiration of the objects, that is, how many days after an object’s creation date it should be deleted automatically. Here, I am deleting all the files 366 days after creation.
- When we upload a large number of files to S3, some uploads fail partway through and leave incomplete multipart uploads behind. We can have these incomplete uploads cleaned up automatically after 10 days.
- Click on Next, and review the options you have entered/selected. Once reviewed, click on Create. Your rule will be created and attached to the bucket. You can also see the created lifecycle policy under the Management section of your bucket.
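The console steps above can also be expressed as a lifecycle configuration document. The following is a sketch, not the exact configuration the console generates: it builds the rule as a plain Python dictionary using the values from this walkthrough (180 days to One Zone-IA, 365 days to Glacier, expiration at 366 days, incomplete uploads cleaned up after 10 days). The function names, the rule ID `log-retention`, and the bucket name in the usage note are assumptions made for illustration. Applying it relies on the third-party `boto3` package and valid AWS credentials.

```python
def build_log_lifecycle_configuration(rule_id: str = "log-retention") -> dict:
    """Build a lifecycle configuration mirroring the console walkthrough."""
    return {
        "Rules": [
            {
                "ID": rule_id,  # hypothetical rule name
                "Status": "Enabled",
                # An empty prefix applies the rule to the whole bucket.
                "Filter": {"Prefix": ""},
                "Transitions": [
                    {"Days": 180, "StorageClass": "ONEZONE_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                # Delete current object versions 366 days after creation.
                "Expiration": {"Days": 366},
                # Clean up multipart uploads that never completed.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 10},
            }
        ]
    }


def apply_lifecycle(bucket_name: str) -> None:
    """Attach the configuration to a bucket (needs boto3 and credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    s3 = boto3.client("s3")
    # Note: this call replaces any lifecycle configuration already on the bucket.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=build_log_lifecycle_configuration(),
    )
```

You would call something like `apply_lifecycle("my-app-logs")`, where `my-app-logs` is a placeholder bucket name. Before copying these numbers, check the current pricing rules: the Glacier storage classes have a minimum storage duration, so expiring objects one day after archiving them may incur early-deletion charges.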
With just a few steps, you have learnt to configure your AWS S3 storage in an optimal and cost-effective manner.
Also check out this blog post about using spot to achieve cost savings on Kubernetes.