Before using Cloud Storage, it's essential to understand the differences between "Object Storage" and "Conventional Storage." Please refer to the diagram below:
Source: https://arthurcheng.gitbooks.io/ceph/content/chapter1/what-is-object.html
Block Storage
Block storage, as shown in the diagram above, consists of multiple data units (blocks). Typically, a single disk or disk array can be treated as block storage, with its raw disk space allocated directly to the host system. The blocks exist independently of one another and are not necessarily stored contiguously. In CAP-theorem terms, block storage is a strongly consistent form of storage. It also serves as the underlying foundation for File Storage and Object Storage. The most representative solutions are DAS (Direct-Attached Storage) and SAN (Storage Area Network), which excel in efficiency and speed but are poorly suited to file sharing. They also carry higher maintenance costs and a higher technical barrier to entry.
Example: Suppose there are five physical hard drives labeled 1 through 5, each with a capacity of 2GB, for a total of 10GB. This pool can be partitioned into three logical blocks, A, B, and C, of 5GB, 2GB, and 3GB. Logical block A, for instance, might take 2GB from physical hard drive 1, 2GB from physical hard drive 2, and 1GB from physical hard drive 3.
In other words, when storing data using Block Storage, the data is distributed across different blocks. When needed, it can be read from different blocks and combined to form complete data.
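The five-drive example can be sketched in a few lines of Python. This is only an illustration: the `allocate` function is a hypothetical helper, and the assumption that drives are filled strictly in order (which fixes the layout of B and C) is not stated in the example above, which only describes block A.

```python
def allocate(drives, volumes):
    """Map each logical block onto physical drives, filling drives in order.

    drives:  {drive_id: capacity_gb}
    volumes: {logical_name: size_gb}
    Returns  {logical_name: [(drive_id, gb_taken), ...]}
    """
    free = dict(drives)  # remaining GB on each physical drive
    layout = {}
    for name, size in volumes.items():
        extents = []
        remaining = size
        for drive_id in free:
            if remaining == 0:
                break
            take = min(free[drive_id], remaining)
            if take > 0:
                extents.append((drive_id, take))
                free[drive_id] -= take
                remaining -= take
        if remaining > 0:
            raise ValueError(f"not enough space for logical block {name}")
        layout[name] = extents
    return layout


# Five 2GB drives, partitioned into logical blocks of 5GB, 2GB, and 3GB.
drives = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2}
volumes = {"A": 5, "B": 2, "C": 3}
layout = allocate(drives, volumes)

for name, extents in layout.items():
    print(name, extents)
```

Under the in-order assumption, block A ends up spanning drives 1, 2, and 3 exactly as described, and reading A back means reading from all three drives and concatenating the pieces.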
File Storage
As seen in the diagram, File Storage organizes data in a hierarchical file structure, as commonly found in NAS devices or the file systems of personal computers. It uses file-oriented protocols (such as NFS or SMB) over TCP/IP to provide network-based file storage. Its advantage is that every file has a complete path, which makes the layout clear and easy to manage. However, the information that can be recorded in metadata is limited.
Object Storage
As indicated in the diagram, the storage unit of Object Storage is an "object." Unlike block storage or file storage, it exposes no "blocks" or "file structures." Object Storage keeps all objects in a single flat namespace and reads and writes them in a key-value fashion. Capacity can be expanded continuously, which gives it excellent scalability.
Object Storage stores data and metadata separately. When a read or write request reaches the server, the metadata is used to determine where the data actually resides, so a client can access multiple disks in parallel to improve read and write efficiency. By contrast, some file systems store everything, metadata included, together, split into many small data blocks spread across different disks. A client must locate the first data block before proceeding with sequential reads and writes, which is less efficient.
Object Storage effectively combines the advantages of block storage and file systems, achieving fast read and write speeds and enabling file sharing among users.
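To make the key-value idea concrete, here is a minimal in-memory sketch. `ToyObjectStore` and its method names are hypothetical and do not correspond to any real service's API; the point is only that objects live in one flat namespace, metadata is kept apart from the data, and "folders" are nothing more than key prefixes.

```python
class ToyObjectStore:
    """A toy object store: flat keys, data and metadata kept separately."""

    def __init__(self):
        self._data = {}      # key -> raw bytes
        self._metadata = {}  # key -> metadata dict

    def put(self, key, data, **metadata):
        # Write the object body and record its metadata in a separate map.
        self._data[key] = data
        self._metadata[key] = {"size": len(data), **metadata}

    def get(self, key):
        # Read the whole object by key; there is no directory traversal.
        return self._data[key]

    def head(self, key):
        # Look up metadata without touching the object body.
        return self._metadata[key]

    def list(self, prefix=""):
        # "Folders" are simulated by filtering keys on a shared prefix.
        return sorted(k for k in self._data if k.startswith(prefix))


store = ToyObjectStore()
store.put("images/2024/cat.png", b"\x89PNG...", content_type="image/png")
store.put("images/2024/dog.png", b"\x89PNG....", content_type="image/png")
store.put("backups/db.sql", b"-- dump", content_type="text/plain")

print(store.list("images/"))
print(store.head("backups/db.sql"))
```

Note that `images/2024/` is not a directory here; it is simply part of the key, which is why such a store can keep growing without restructuring anything.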
Google Cloud Storage (GCS)
GCP offers various storage services, and Google Cloud Storage is one of the most widely used and recognized. It is a typical object storage service that is scalable and accessible through an API. It is typically used for data backups, image file storage, and integration with GCP's data services to build ETL processes. Objects are stored in containers called "buckets," and each bucket is configured with one of three zone (location) types and one of four storage classes, as listed in the tables below.
Three Zone Types
| Zone type | Characteristics |
| --- | --- |
| Multi-Region | Highest availability |
| Dual-Region | Provides high availability and low latency |
| Region | Lowest latency within a single region |
By: Ted
Four Storage Classes
| Storage class |  | Minimum storage duration |
| --- | --- | --- |
| Standard |  | None (unlimited access) |
| Nearline | Medium | 30 days |
| Coldline |  | 90 days |
| Archive |  | 365 days |
It's important to choose the storage class that matches the characteristics of your business data. For instance, if you choose the Nearline storage class but delete or overwrite an object before 30 days have passed, GCP still charges for the full 30-day minimum storage duration, and every read of Nearline data incurs a retrieval fee. The Standard class has no minimum duration or retrieval fee, but its storage price is the highest, so the accumulated cost can still be significant. The other classes charge more per access, but because their data is accessed less often, their overall cost may be lower than Standard.
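The trade-off can be worked through with a small calculation. The sketch below is illustrative only: the `StorageClass` dataclass and both helper functions are hypothetical, and every price is a made-up placeholder rather than a published GCP rate. Only the 30-day minimum for Nearline comes from the table above.

```python
from dataclasses import dataclass


@dataclass
class StorageClass:
    name: str
    storage_price: float    # placeholder: $ per GB per month (NOT a real GCP price)
    retrieval_price: float  # placeholder: $ per GB read (NOT a real GCP price)
    min_days: int           # minimum storage duration in days


def billed_days(cls, days_stored):
    """Days of storage billed for an object deleted after `days_stored` days."""
    return max(days_stored, cls.min_days)


def monthly_cost(cls, stored_gb, read_gb):
    """Storage cost plus retrieval cost for one month."""
    return stored_gb * cls.storage_price + read_gb * cls.retrieval_price


# Placeholder prices chosen only to show the shape of the trade-off.
STANDARD = StorageClass("Standard", storage_price=0.020, retrieval_price=0.0, min_days=0)
NEARLINE = StorageClass("Nearline", storage_price=0.010, retrieval_price=0.01, min_days=30)

# An object deleted after 10 days is still billed for 30 days in Nearline.
print(billed_days(NEARLINE, 10), billed_days(STANDARD, 10))

# 1000 GB stored: compare a rarely read month with a heavily read month.
for read_gb in (100, 2000):
    print(read_gb, monthly_cost(STANDARD, 1000, read_gb), monthly_cost(NEARLINE, 1000, read_gb))
```

With these placeholder numbers, Nearline is cheaper when little data is read and more expensive once reads dominate, which is the pattern to check against real pricing before picking a class.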
After selecting the storage class, there are two access control mechanisms:
Access Control Mechanisms
| Mechanism | Description |
| --- | --- |
| Uniform | Bucket-level permissions only, ensuring consistent access control for the entire bucket. |
| Fine-grained | Bucket-level permissions plus individual object-level permissions through access control lists (ACLs). |
For encryption, users can choose between:
- Google-managed encryption keys.
- Customer-managed encryption keys (CMEK).
Furthermore, users can configure data retention policies based on their usage scenarios. Google Cloud Storage also offers a convenient feature called Lifecycle Management, which lets users set conditions under which data is deleted, moved to another storage class (e.g., Standard to Coldline), and more. Google Cloud Storage is highly practical: it serves not only as a data storage solution but can also host static websites.
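As a rough sketch of what a lifecycle configuration can look like, the snippet below builds one as a Python dictionary and prints it as JSON. It follows the rule / action / condition layout of the GCS lifecycle configuration format as I understand it. The 90-day and 365-day thresholds are arbitrary values chosen for illustration, and field names should be checked against the official documentation before use.

```python
import json

# Two illustrative rules (thresholds are assumptions, not recommendations):
#   1. Move Standard objects to Coldline once they are 90 days old.
#   2. Delete any object once it is 365 days old.
lifecycle_config = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
            "condition": {"age": 90, "matchesStorageClass": ["STANDARD"]},
        },
        {
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
    ]
}

# Serialize to JSON so it could be saved to a file and applied to a bucket.
lifecycle_json = json.dumps(lifecycle_config, indent=2)
print(lifecycle_json)
```

Each rule pairs one action with a set of conditions, so a transition between storage classes and an eventual deletion are expressed as two separate rules.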
Solution Architecture
吳祐德 Ted Wu