Archive means different things to different users. What's clear is that archival is not a specific workload but rather a category of workloads that share one trait: preserving data reliably for extended periods of time. Requirements such as sensitivity to cost and data integrity can change from application to application, but the requirements for scalability, low TCO, and longevity are universal.
Over the last decade, architectures such as object storage, HSM, and scale-out NAS systems have been deployed to serve primarily two archival use cases: compliance and preservation/reuse. In the compliance use case, archival ROI was associated with risk mitigation; that is to say, saving the digital communications and records of an enterprise reduced litigation costs or the risks associated with regulatory non-compliance. Archives deployed for preservation and reuse are typically associated with digital assets that have license or cultural value. The most notable examples are the multi-petabyte archives found at major broadcasters, movie studios, and digital libraries, where challenges of scale and longevity can approach extremes.
Active archiving is a key use case for tape. But how does active archive differ from backup? And how can tape-based active archive solutions help you reduce costs, save time and reduce risk?
Knowing Your Backups from Your Active Archives
It’s important to understand the distinction between backup and active archive strategies. Active archive and backup applications are distinct processes with different objectives and therefore impose different requirements on the storage systems that they utilize.
Let’s face it: primary storage vendors love anyone who will keep infrequently accessed data on primary storage. These customers are like money in the bank, and the last thing a primary storage vendor wants is for those customers to wise up and break the chains that bind enterprises to their current storage investment model.
New or Modified Data Needs to be Protected Quickly
When data is created or modified by some business process or production workflow, it is an IT manager’s imperative to ensure that work is not lost. Today’s common practice is to design data flows to include processes that back up this new data as quickly as possible. It’s typical to back up the data locally to enable a quick restore and create an off-site copy to protect the data in the event of a site disaster such as a flood, tornado, earthquake or fire. This practice requires that there be enough storage to back up the original work.
The need for storage continues to grow at an exponential rate. New ways to mine data for valuable gold nuggets of customer information, buying habits, etc., are creating incentives for organizations to save data forever, whereas in days past it would have been discarded. The combination of these effects is creating a difficult situation for IT managers who are trying to balance the explosive growth of data with flat to shrinking budgets. A key question for managers is: How do I simply manage all the data, from very active files that are used daily to files that have not been touched in months, yet have critical data for future mining expeditions?
2013 has been a breakthrough year for active archive deployments, and with shrinking IT budgets, exponential data growth, and longer retention periods, there are no signs of slowing in 2014. In fact, the current climate is driving the need for more cost-effective long-term storage solutions more than ever. The market has matured to a point where accessibility and performance are essential to long-term data storage, making 2014 the year active archives will become a more mainstream best practice.
Active archiving is getting a lot of attention these days as end users seek to relieve the tremendous pressure on primary storage from the relentless growth of unstructured data. Like the pain from a toothache, they just want it to go away. The solution lies in extending the existing file system with active archive software across all storage pools and moving seldom-accessed or totally dormant files from primary storage to a more cost-effective economy tier such as tape.
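As a rough illustration of the first step in that process, the sketch below walks a directory tree and lists files that have not been read or modified for a given number of days, which makes them candidates for a cheaper tier. It is not taken from any active archive product: the function name `find_archive_candidates` and the 180-day default are assumptions made for this example, and access times may be unreliable on file systems mounted with options such as `noatime`.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root, min_idle_days=180, now=None):
    """Return (path, idle_days) pairs for files idle at least min_idle_days.

    Hypothetical helper for illustration only. A file counts as idle when
    both its last access time and last modification time are older than
    the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            last_used = max(st.st_atime, st.st_mtime)
            if last_used < cutoff:
                idle_days = int((now - last_used) // 86400)
                candidates.append((path, idle_days))
    # Longest-idle files first.
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```

Calling `find_archive_candidates("/data/projects", min_idle_days=365)` (the path is a placeholder) would return the dormant files under that tree, longest-idle first. Actually moving the files and leaving them accessible through the original file system is the job of the active archive software and is not shown here.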
I promised the Active Archive Alliance team a blog on my experiences of selling Linear Tape File System (LTFS) solutions a year after the process began. I am a little late – as QStar started selling LTFS in August 2012. For those people who have not heard of LTFS, it is an industry standard file system for tape that provides media portability between operating systems and software vendors. It stores the file metadata in a separate partition on the tape media, allowing it to be “self-describing,” so when you import the media into an LTFS system, the data can be read just like a USB flash drive.
Last month, you probably saw the deluge of articles on “the cloud’s worst nightmare” as news broke that Nirvanix, a cloud provider, was closing its doors and pulling the plug from its data centers. What complicated the situation for service providers and customers who stored data in the Nirvanix public cloud was that they only had a few weeks to move hundreds of terabytes elsewhere.
In 2010, the Active Archive Alliance was founded with the goal of aligning the education and technologies needed to meet the rapidly evolving requirements for data management in a digital age. The archive as a static repository disconnected from transactional storage has been shown to be no longer the only way to deal with burgeoning data growth. Instead, the goal was to build a consortium of expert storage vendors to help introduce and guide customers through this expanding market called active archiving, as well as to provide solutions and services to assist with its implementation.
Being a member of the Active Archive Alliance, and being at a company that offers a full range of data protection storage offerings, I get involved in many discussions surrounding the myriad strategies for protecting and retaining data. Historically these discussions have focused on which storage and data mover technologies offer the best fit for the tiers of storage a user has already identified they need. But more often than not, today’s conversations appear to be taking on a new color.