Google Cloud Storage (GCS)
GCS is used to store unstructured data such as images, videos, and other static content, as well as backups and disaster-recovery copies.
GCS stores data in the form of objects on an underlying proprietary distributed file system called Colossus
We can transfer data into or out of GCS using gsutil
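As a quick sketch of such transfers with gsutil (the bucket name `my-example-bucket` and file paths are placeholders, not from the original notes):

```shell
# Upload a local file into a bucket (bucket name is a placeholder)
gsutil cp ./photo.jpg gs://my-example-bucket/images/photo.jpg

# Download an object back out of GCS
gsutil cp gs://my-example-bucket/images/photo.jpg ./photo-copy.jpg

# Copy a whole directory tree, with -m enabling parallel transfers
gsutil -m cp -r ./backups gs://my-example-bucket/backups
```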
Google offers different storage services depending on the use case
Cloud Storage is simple and user-friendly (think of Google Drive)
We don't need to pre-allocate capacity
We can store an unlimited number of files; GCS supports datasets of any size
Each individual object cannot be larger than 5 TB
This supports 5000 writes and 1000 reads per second
To store data in GCS, we first need to create storage buckets
- Buckets are the basic containers that hold your data; each bucket holds a collection of objects
- Each bucket has a name, which must be globally unique
- Objects are the individual pieces of data that you store in Cloud Storage
- Object names are treated as object metadata. Object names should be less than 1024 bytes in length.
- An object name must be unique within its bucket
- Example object names (illustrative): `images/cat.png`, `backups/2021-01-01/db.dump`
Data in the buckets can be accessed through the Console, gsutil, the REST API, or client libraries
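For example, basic object access with gsutil might look like this (bucket and object names are assumptions for illustration):

```shell
# List the objects in a bucket
gsutil ls gs://my-example-bucket/

# Print an object's contents to stdout
gsutil cat gs://my-example-bucket/notes.txt

# Grant public read access on a single object
gsutil acl ch -u AllUsers:R gs://my-example-bucket/notes.txt
```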
The objects in a GCS bucket are categorized into two sections
- Cold data: Accessed infrequently
- Hot Data: Accessed frequently
- Buckets are mainly divided into four classes based on their availability, geographic redundancy, and frequency of access
- Hot and available – Multi Regional:
- This is geo-redundant: when we upload data to this storage class, two separate copies are stored in two GCP regions that are hundreds of miles apart
- This is the most expensive class of buckets and should be used only when we are certain of global traffic.
- Hot and local – Regional:
- This is used for web applications with traffic patterns highly concentrated within one region
- Cool – Nearline: We use this storage class when we expect to access data infrequently (about once a month). Here we pay less for storage and more for access
- Cold – Coldline: This is a cold-storage facility for data that we expect to access less than once a year. Suitable for disaster recovery and long-term archival storage.
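Objects can also be moved down a class as they cool; a hedged sketch with gsutil's `rewrite` command (the bucket and prefixes are placeholders):

```shell
# Demote aging objects to cheaper classes as access frequency drops
# (bucket name and log prefixes are placeholders, not real resources)
gsutil rewrite -s nearline gs://my-example-bucket/logs/2020/**
gsutil rewrite -s coldline gs://my-example-bucket/logs/2015/**
```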
- Let's estimate some costs using the pricing calculator Refer Here
- We need to store 10 TB of data
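As a back-of-the-envelope check, the monthly storage cost is just size in GB times the per-GB price. The prices below are illustrative assumptions only; use the pricing calculator for current numbers:

```shell
# Rough monthly storage cost for 10 TB at assumed per-GB-month prices
# (illustrative, NOT official pricing: multi-regional $0.026,
#  nearline $0.010, coldline $0.007)
SIZE_GB=$((10 * 1024))  # 10 TB expressed in GB
for price in 0.026 0.010 0.007; do
  awk -v gb="$SIZE_GB" -v p="$price" 'BEGIN { printf "$%.2f/month\n", gb * p }'
done
```

At these assumed rates, 10 TB runs to a few hundred dollars a month in Multi-Regional and well under a hundred in Coldline, which is why class choice matters for archival data.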
Working with GCS buckets
- Creating Buckets:
- There are multiple ways to create buckets in GCP: the Console, gsutil, and the REST API.
- To create a bucket we need three fields
- Storage Class
- Universally unique name
- Location
- Creating a bucket using the Web Console
- Creating buckets using gsutil:
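A minimal sketch with `gsutil mb`, assuming the bucket name `my-unique-bucket-name` and location `us-east1` are placeholders:

```shell
# Create a Regional bucket; -c sets the storage class, -l the location
gsutil mb -c regional -l us-east1 gs://my-unique-bucket-name/

# Confirm the bucket exists by listing it
gsutil ls -b gs://my-unique-bucket-name/
```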