AWS Classroom Series – 01/Dec/2021

S3 Contd

  • S3 is accessed over HTTP(S). Whether we access S3 from the browser, the CLI or programmatically, internally every request goes through the REST API
  • Operations on S3
HTTP Verb   CRUD operation in AWS S3
GET         Read
PUT         Create/Update
DELETE      Delete
POST        Create
  • The S3 URI for the CLI is written as s3://<bucket-name>
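
A minimal sketch of how an s3:// URI relates to the REST call made underneath. The helper names (parse_s3_uri, rest_url) and the us-east-1 region are assumptions for illustration, and the bucket and key are the example used later in these notes; this is not how the AWS CLI itself is implemented.

```python
from urllib.parse import quote

# HTTP verb -> CRUD operation, as listed in the table above.
VERB_TO_CRUD = {
    "GET": "Read",
    "PUT": "Create/Update",
    "DELETE": "Delete",
    "POST": "Create",
}


def parse_s3_uri(uri):
    """Split an s3://<bucket-name>/<key> URI into (bucket, key)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, key


def rest_url(bucket, key, region="us-east-1"):
    """Build a virtual-hosted-style HTTPS URL (the region is an assumption)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


bucket, key = parse_s3_uri("s3://qtaws/images/image2.1.png")
print(bucket, key)
print("GET", rest_url(bucket, key), "->", VERB_TO_CRUD["GET"])
```

A real request would also need to be signed with AWS credentials, which the CLI and SDKs handle for us.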

AWS S3 Data Consistency Model

  • S3 is an object store which can be accessed over the web (web-store).
  • The S3 service is intended for the "write once, read many" use case.
  • That's the reason why the S3 architecture is different from a traditional file system or network SAN architecture.
  • [Image: S3 Infrastructure Preview]
  • The above image shows only the AZs, whereas in real life S3 Standard uses a minimum of three AZs to store the data.
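  • When we create a new object, the data is stored synchronously across multiple facilities before S3 returns success; this provides read-after-write consistency for new objects
  • For all other operations (overwrites and deletes of existing objects), S3 was historically eventually consistent. Since December 2020, S3 provides strong read-after-write consistency for these operations as well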

AWS S3 Performance Considerations

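  • It's important to understand the best practices for partitioning if the workload you are planning to run on an S3 bucket is going to exceed 100 PUT/LIST/DELETE requests per second or 300 GET requests per second. (These thresholds come from older AWS guidance; current AWS documentation quotes at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.)
  • In this case we need to make sure we follow the partitioning guidelines so that we don't end up with a performance bottleneck.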
  • Bucket names are globally unique, so every object can be identified uniquely across the globe by its bucket name plus its object name (key)
  • AWS S3 scales to support very high request rates. To do so, S3 automatically partitions all your buckets
  • Object keys are UTF-8 encoded, with a maximum size of 1024 bytes
  • For example, the bucket name is qtaws and it holds an image, image2.1.png
Bucket      Object Key
qtaws       images/image2.1.png
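
The 1024-byte limit on object keys counts UTF-8 bytes, not characters. The sketch below illustrates the difference; the function names are hypothetical helpers for these notes, not part of any AWS SDK.

```python
MAX_KEY_BYTES = 1024  # limit stated in the notes above


def key_size_in_bytes(key):
    """Return the size of an object key once encoded as UTF-8."""
    return len(key.encode("utf-8"))


def is_valid_key_length(key):
    """True if the key is non-empty and fits within the byte limit."""
    return 0 < key_size_in_bytes(key) <= MAX_KEY_BYTES


ascii_key = "images/image2.1.png"
accented_key = "\u00e9" * 600  # 600 characters, but 2 bytes each in UTF-8

print(len(ascii_key), key_size_in_bytes(ascii_key), is_valid_key_length(ascii_key))
print(len(accented_key), key_size_in_bytes(accented_key), is_valid_key_length(accented_key))
```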
  • Let's assume you have 20 objects
images/image2.1.png
images/image2.2.png
images/image2.3.png
images/image3.1.png
images/image3.2.png
images/image3.3.png
images/image3.4.png
images/image4.1.png
...
  • Here everything falls under the same partition, qtaws/i, since the partition key is i
  • To solve this problem, add a varying prefix to each key
2images/image2.1.png
2images/image2.2.png
2images/image2.3.png
3images/image3.1.png
3images/image3.2.png
3images/image3.3.png
3images/image3.4.png
4images/image4.1.png
  • We have now distributed our objects across the following partitions instead of one
qtaws/1
qtaws/2
qtaws/3
qtaws/4
qtaws/5
qtaws/6
...
qtaws/9
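
The effect of the prefix can be sketched in a few lines. This is a simplified model that assumes, as in the example above, that the partition is the bucket name plus the first character of the key; the functions partition_of and add_leading_digit are hypothetical helpers, and S3's real partitioning logic is internal to the service.

```python
from collections import Counter

BUCKET = "qtaws"

# The eight keys listed in the notes above.
ORIGINAL_KEYS = [
    "images/image2.1.png", "images/image2.2.png", "images/image2.3.png",
    "images/image3.1.png", "images/image3.2.png", "images/image3.3.png",
    "images/image3.4.png", "images/image4.1.png",
]


def partition_of(bucket, key, length=1):
    """Simplified model: bucket name plus the first `length` characters of the key."""
    return f"{bucket}/{key[:length]}"


def add_leading_digit(key):
    """Prefix the key with the first digit of its file name, e.g. 2images/image2.1.png."""
    file_name = key.rsplit("/", 1)[-1]
    digit = next(ch for ch in file_name if ch.isdigit())
    return digit + key


before = Counter(partition_of(BUCKET, k) for k in ORIGINAL_KEYS)
after = Counter(partition_of(BUCKET, add_leading_digit(k)) for k in ORIGINAL_KEYS)

print(dict(before))  # every key lands in qtaws/i
print(dict(after))   # keys spread over qtaws/2, qtaws/3, qtaws/4
```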
  • Reverse the Key Name String
    • When your application uploads data, the application ID in the key increases by 1 with every set of uploads.
    applicationid/421212342/log.txt
    applicationid/421212342/error.txt
    applicationid/421212343/log.txt
    applicationid/421212343/error.txt
    applicationid/421212344/log.txt
    applicationid/421212344/error.txt
    ....
    
    • In this case everything falls under the same partition, applicationid/4. Simply reverse the application ID in the key to solve the partition issue
    applicationid/243212124/log.txt
    applicationid/243212124/error.txt
    applicationid/343212124/log.txt
    applicationid/343212124/error.txt
    applicationid/443212124/log.txt
    applicationid/443212124/error.txt
    ...
    
    • Now we have solved the partitioning issue
    • The other way to resolve this issue is to add a hash prefix to the application ID
    application/178D421212342/log.txt
    application/178D421212342/error.txt
    application/CEAB421212343/log.txt
    application/CEAB421212343/error.txt
    ...
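
Both key schemes can be sketched as small helpers. The function names, the choice of SHA-256 and the four-character prefix length are assumptions for illustration; the notes do not say which hash produced values such as 178D, so the prefixes generated here will differ from those shown above.

```python
import hashlib


def reversed_id_key(app_id, filename, root="applicationid"):
    """Reverse the application ID so the fast-changing digit comes first."""
    return f"{root}/{app_id[::-1]}/{filename}"


def hashed_id_key(app_id, filename, root="application", hash_len=4):
    """Prepend a short hash of the application ID (SHA-256 is an assumption)."""
    digest = hashlib.sha256(app_id.encode("utf-8")).hexdigest()
    return f"{root}/{digest[:hash_len].upper()}{app_id}/{filename}"


for app_id in ("421212342", "421212343", "421212344"):
    print(reversed_id_key(app_id, "log.txt"))
    print(hashed_id_key(app_id, "log.txt"))
```

Reversing keeps the original ID recoverable from the key by reversing it again, while the hash prefix keeps the ID readable in place at the cost of recomputing the hash whenever a key has to be built.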
    
