For many years during what I’ve come to call the “tape wars” era, when most major non-tape vendors were attacking the technology, companies like Spectra Logic often found themselves on the defensive. If you follow the discussions on LinkedIn or other forums, a common theme is that the only value proposition tape offers is cost. Others dispute that view, creating what Chris Mellor at The Register describes as a “religious war” between technologies. Instead of firing another arrow into that war, I’d like to step back and look at why active archives are resonating so well with resellers and customers alike: flexibility, performance, affordability and ease of use.
Active archives combine the advantages of many technologies, which is why software, tape and disk vendors alike are joining the growing movement. Given the data volumes and retention requirements of most archives, tape provides some key benefits and is one of the big reasons active archives are so appealing; however, tape is only one piece of a well-architected active archive. Historically, archives, and especially archives to tape, have had a reputation for being hard to use, cumbersome and unreliable: in other words, a headache. Terms such as data migration, media format changes, full restores, lost data and backup failures all carry negative connotations. These issues are not inherent to tape; rather, they are symptoms of a deeper problem that customers need to address. The true culprit is the data management process, or lack thereof, which active archives address with both short- and long-term solutions. Today, archives to tape are the healer of headaches, not the creator.
When it comes to infrastructure architecture, flexibility and performance are king, with cost as the regulator. An active archive delivers these benefits by offering a new approach to data management rather than simply an updated product with new features. Active archives offer storage and archival features that can be tailored to an individual organization, ensuring that the short-term storage and long-term retention needs specific to that organization’s data are met. This is possible because an active archive is not a single product promoted by a single vendor, or even confined to a single market.
Active Archives as NAS
It’s understandable that people initially mistake active archive for storage tiering or HSM. However, it’s much more than that. Active archives allow any storage medium to be presented as NAS storage, in the form of a CIFS or NFS share. Combined with open formats, this lets a company architect its systems in a vendor-agnostic way and use the most appropriate product for each specific need. Migration is no longer a major undertaking, but simply a hardware upgrade and an adjustment of policies. If a technology becomes obsolete, the data can be easily migrated to a new system. If a vendor fails, the data is likewise not compromised. And much to everyone’s relief, as technology evolves, data can be moved onto new platforms automatically. Because active archives proactively migrate data onto newer equipment as the system changes over time, you avoid the nightmare of discovering large amounts of data stranded on obsolete equipment or formats.
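To make the "hardware upgrade and adjustment of policies" idea concrete, here is a minimal sketch of how a migration plan might be derived from a catalog. Everything in it is an assumption made for illustration: the `Entry` record, the `plan_migration` function and the device labels are hypothetical and do not describe any particular vendor's software. The point it tries to show is that the path users see on the share stays the same while only the backing device changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical catalog record for one archived file."""
    share_path: str  # path users see on the NAS share; it never changes
    device: str      # label of the device or media generation holding the data


def plan_migration(catalog, retired, target):
    """Move every entry that sits on a retired device to the target device.

    Returns the updated catalog plus a list of (share_path, old, new) moves.
    """
    updated, moves = [], []
    for entry in catalog:
        if entry.device in retired:
            moves.append((entry.share_path, entry.device, target))
            updated.append(Entry(entry.share_path, target))
        else:
            updated.append(entry)
    return updated, moves


# Illustrative catalog; the device labels are placeholders.
catalog = [
    Entry("/archive/2009/scan.tif", "old-tape"),
    Entry("/archive/2013/report.pdf", "disk"),
]
updated, moves = plan_migration(catalog, retired={"old-tape"}, target="new-tape")
for share_path, old, new in moves:
    print(f"{share_path}: {old} -> {new}")
```

In a real system the move itself would also copy and verify the data before the catalog is updated; this sketch covers only the planning step.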
Performance seems to be where we all get snagged. Is tape slow, or is it fast? Between random access and linear access, there are many ways to represent any technology as fast or slow. An active archive sidesteps the I/O battle altogether. It simply takes advantage of the equipment in place and the policies governing where and how data is retained. Regardless of where data resides, it is accessible. This doesn’t mean that you should move transactional, high-performance data immediately to tape. It means you can set realistic expectations for data retrieval times, and at no point does an IT administrator have to manually restore data to get it back, provided it is at least in a library or reachable over a WAN. The performance of the system is determined by the storage devices implemented and the policies configured in the data management application. SSD and high-performance disk should be part of a well-designed active archive to meet performance needs.
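A placement policy of the kind described above can be sketched in a few lines. This is only an illustration under stated assumptions: the tier names, the age thresholds and the rough retrieval expectations are hypothetical values chosen for the example, not figures from any product.

```python
from dataclasses import dataclass

# Hypothetical tiers with rough retrieval expectations (illustrative only).
EXPECTED_RETRIEVAL = {
    "ssd": "milliseconds",
    "disk": "seconds",
    "tape": "minutes",
}


@dataclass
class FileRecord:
    path: str
    days_since_access: int
    transactional: bool = False  # high-performance data that must stay on fast storage


def choose_tier(record, disk_after_days=30, tape_after_days=180):
    """Pick a tier from file age; the thresholds are assumed policy settings."""
    if record.transactional or record.days_since_access < disk_after_days:
        return "ssd"
    if record.days_since_access < tape_after_days:
        return "disk"
    return "tape"


def describe(record):
    """State where a file lives and what retrieval time a user should expect."""
    tier = choose_tier(record)
    return f"{record.path}: {tier} (expect retrieval in {EXPECTED_RETRIEVAL[tier]})"


for rec in [
    FileRecord("/share/orders.db", 400, transactional=True),
    FileRecord("/share/q1-report.pdf", 90),
    FileRecord("/share/2008-footage.mov", 900),
]:
    print(describe(rec))
```

The design choice worth noting is that the policy never makes a file unavailable. It only changes which device answers the request and how long the user should expect to wait.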
With performance and flexibility addressed, we move on to cost and ease of use. The active archive advocates, of whom I am one, have hit hard on the cost advantages of an active archive. For today, I’ll briefly note that an active archive is simply less expensive than other strategies, in both capital and operational expense, because it keeps data accessible on lower-cost storage platforms when that data would traditionally have had to reside on higher-cost systems.
All that remains is ease of use. To the user, an active archive is an automated system: all files are accessible, and older or archived data simply takes a little longer to retrieve. From the administrator’s perspective, an active archive needs to be properly set up and tuned to optimize performance for the environment. Once it is configured, however, the administrator is not needed for file retrieval, and can set up a migration without taking the system down or spending weeks on planning. In a disaster, data is accessible over a WAN if it’s stored at a live DR site, or it can be rebuilt from offsite tapes. If a single system goes down, retrieval time simply depends on the performance of the device that holds the second copy.
IT administrators can manage their data more easily and in less time, instead of allowing their data problems to manage them. Components can be upgraded or replaced without overhauling the entire active archive. Thus, active archives are flexible, perform well, are cost-effective, and are much easier to administer than other storage strategies, with no outsourcing required.