Keep that data growth in check with object storage

Data Growth and Object Storage

There is no shortage of incredible statistics about the tremendous growth of data in recent years. Here are just a few to add to the collection:

  • 90 percent of all data was created in the past two years
  • IDC projects a data growth CAGR of 42 percent through 2020
  • A widely quoted statement from Eric Schmidt, then CEO of Google, really hits home: “Between the dawn of civilization and 2003, we only created five exabytes; now we’re creating that amount every two days.” By 2020, that figure is predicted to sit at 53 zettabytes (53 trillion gigabytes), an increase of 50 times.

We probably don’t even have to tell you about this explosive growth; most companies are already grappling with what to do with all this data in one way or another. Here’s the kicker:

80 percent of the data being created is unstructured.

What exactly is unstructured data? Well, as the statistic suggests, it’s basically everything: emails and texts; documents, PDFs, presentations; images; audio files; video files; and so on.

It’s no wonder companies are looking for better, more efficient ways to store it all. Enter object storage. For companies that want a scalable, reliable, easy-to-manage, and cost-effective solution, object storage has become a great option. Until recently, though, one interesting issue kept it from solving data problems efficiently.

Reliability through replication of data

Historically, object storage has provided reliability by making copies of your data. To remain truly scalable, object storage did away with hierarchical file-system metadata. Instead, each piece of data was assigned a unique identifier, and using that identifier the system made replicas of each object to meet your reliability needs.

The first issue with this system is pretty obvious: if you lose that identifier, you effectively lose your data. Solving that problem is a topic for another blog. Here we want to focus on the storage issue: to store 1 terabyte of data, you actually needed 3 terabytes of raw storage to hold the three copies.

Solving the data usage problem with STaaS

This 3:1 ratio of raw storage to data is still better than traditional block storage, which uses approximately 3.6 terabytes for every 1 terabyte of data. Much of that difference comes down to metadata overhead, which is one of the reasons object storage is more efficient to begin with.

Our object storage as a service (STaaS), however, brings that number down to about 1.7 terabytes of raw storage capacity per terabyte of usable storage. One of the ways we achieve this is with our partner IBM’s erasure coding. With erasure coding, you choose a coding scheme based on how much redundancy your data needs: each object is split into fragments and encoded with additional parity fragments, so it can be rebuilt from a subset of them without keeping full extra copies. Think about your data load; even a small change here, 10 to 20 percent, can eliminate the need for hundreds of terabytes of storage capacity.
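A rough sketch of the arithmetic helps here. The functions below compare raw-to-usable overhead for full replication versus a generic (k data + m parity) erasure code; the 12+8 fragment split is an illustrative choice on our part, not IBM’s actual configuration:

```python
def raw_per_usable_replication(copies):
    """Raw storage needed per unit of usable data with full replication."""
    return copies

def raw_per_usable_erasure(data_fragments, parity_fragments):
    """Raw storage per unit of usable data with an erasure code that splits
    each object into k data fragments plus m parity fragments; any k of the
    k + m fragments suffice to rebuild the object."""
    return (data_fragments + parity_fragments) / data_fragments

# Three full copies: 3 TB of raw storage per usable TB.
print(raw_per_usable_replication(3))            # 3

# An illustrative 12 + 8 code tolerates the loss of any 8 fragments
# while using only about 1.67x raw storage per usable TB.
print(round(raw_per_usable_erasure(12, 8), 2))  # 1.67
```

The takeaway: redundancy becomes a tunable parameter (m) rather than a fixed multiple of whole copies, which is where the savings come from.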

Our partnership with Switch adds to this efficiency. Whereas traditional object storage would replicate each full terabyte (plus metadata) in three different locations, we can store a fraction of that in each location, about 50 percent, based on usage and other factors. That’s how 3 TB becomes 1.7.
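The numbers above can be sanity-checked with a quick calculation, assuming fragments are spread roughly evenly across the three sites:

```python
# With erasure-coded fragments dispersed across three sites,
# each site holds only part of the data instead of a full copy.
sites = 3
raw_per_usable_tb = 1.7  # total raw TB per usable TB, from the text

# Fraction of a usable terabyte stored at each site.
per_site_share = raw_per_usable_tb / sites
print(round(per_site_share * 100))  # 57 (percent per site, vs. 100 with full replication)
```

Roughly half a copy per site, across three sites, still lets any surviving majority of fragments reconstruct the data.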

Reducing the storage capacity you need is just one of the benefits of a STaaS solution. In this era of continually growing unstructured data, though, it presents a real opportunity to save your company money while providing unlimited scalability.

To learn more about how KeyInfo’s STaaS can help your organization, click here.

Christopher Ticknor
Director of Marketing
Key Information Systems