Dave Thomson's blog

The Never-ending Question

It never fails to shock me when I speak to a customer who says they do not understand the difference between backup and archiving.

Backups are periodic “snapshots” of active data, that is, data that needs to be on fast primary storage because of its usage patterns. They provide a recovery system for deleted, lost or corrupt data. Most backups are retained for short periods of time. I have heard backup described as a short-term insurance policy for a company’s most often used information.

Archived data, by definition, does not require backup. If you purchase an archive storage technology that requires backup, it is NOT an archive. An archive is a separate store of data outside the “primary plus backup” world. It consists of those files that are accessed less frequently and are not changing, but have value to an organization and must be preserved. Sometimes this can be a considerable percentage of the organization’s total data.

Preserve Data for Decades with an Active Archive

QStar Technologies, along with many of the Active Archive Alliance members, was at the NAB (National Association of Broadcasters) Show in Las Vegas a few weeks ago.

I had the pleasure of speaking with a number of users in the Media and Entertainment world, some small and some very large. Almost universally they expressed the need to preserve their digital content for years and decades, yet all are looking for a “migration free” archive, which does not exist. All technology has a useful life, after which it becomes prohibitively expensive to continue operating. The objective is to migrate to a new technology before this occurs, so designing in migration is imperative when creating an archive.

Blurring the Lines

Recently there have been a number of product announcements from backup vendors stating the virtues of using backup applications for archiving. One of the main reasons the Active Archive Alliance was formed was to better educate organizations about archiving and to explain why an Active Archive is superior to backup for archiving.

Let’s look at the ways an Active Archive secures data for medium- to long-term retention periods, compared to using a backup product for archiving. 

File Systems for Tape Libraries

 

The tape industry has finally caught up with the optical storage industry with the addition of LTFS (Linear Tape File System). The optical storage market has had its own industry-standard file system, UDF, since 1995. Both standards were originally designed for stand-alone drives rather than libraries.

Our goals within the Active Archive Alliance are to ensure inter-compatibility, define best practices and educate end users. This is why we felt it was important to discuss the advantages and disadvantages that LTFS presents in a library environment:

Advantages:

  • Data interchange is simplified and vendor lock-in is eliminated, an important step toward inter-compatibility
  • Index information is stored on each cartridge and can optionally be stored on hard disk for faster searches
  • Data can be ‘actively’ shared with users on the network via CIFS or NFS
  • In a drive environment, data can be accessed and managed directly

Disadvantages:

  • Each cartridge is a single entity with an uncompressed capacity of 1.5TB with LTO-5 drives, so finding data in larger archive environments, which span many cartridges, becomes a challenge
  • Space on the media is not optimized and unused capacity is likely to be left at the end of each cartridge; with large file sizes this could pose a significant challenge
  • Only available for LTO-5 drives and later generations
  • Limited library support at present
  • Limited operating systems supported
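
To put a rough number on the unused-capacity point above, here is a minimal sketch. It assumes that a file is never split across cartridges and that cartridges are filled in arrival order; actual archive software may behave differently. The file sizes and the function names pack_in_order and stranded_gb are made up for illustration.

```python
# Illustrative only: file sizes are hypothetical, and real LTFS-based software
# may place files differently. This sketch never splits a file across cartridges.
CARTRIDGE_GB = 1500  # LTO-5 uncompressed capacity (1.5TB), as quoted above


def pack_in_order(file_sizes_gb, capacity_gb=CARTRIDGE_GB):
    """Fill cartridges in arrival order; start a new one when a file won't fit.

    Returns a list of used-capacity totals in GB, one entry per cartridge.
    """
    cartridges = []
    used = 0
    for size in file_sizes_gb:
        if size > capacity_gb:
            raise ValueError(f"{size} GB file cannot fit on one cartridge")
        if used + size > capacity_gb:
            cartridges.append(used)  # close the current cartridge
            used = 0
        used += size
    if used:
        cartridges.append(used)  # the cartridge still being filled
    return cartridges


def stranded_gb(cartridges, capacity_gb=CARTRIDGE_GB):
    """Capacity left unused on every closed cartridge (all but the last one)."""
    return sum(capacity_gb - used for used in cartridges[:-1])


files = [400] * 10            # ten hypothetical 400 GB video masters
usage = pack_in_order(files)
print(usage)                  # per-cartridge usage in GB
print(stranded_gb(usage))     # GB left unused on closed cartridges
```

With these made-up numbers, each closed cartridge holds three files and leaves 300 GB unused, which is the effect the second disadvantage describes. Smaller files would strand far less.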

So will LTFS for tape libraries be successful? Absolutely.

Many archiving software manufacturers are announcing solutions that include LTFS. QStar, for example, will be supporting LTFS in a phased rollout starting next month (August). All LTO-5 tape libraries in a Windows, Linux, UNIX or Mac environment will be supported by the end of 2011. This will include the aggregation of media, making multiple cartridges appear to be one large subdirectory, which improves workflow in large archive installations. Indices and metadata will be retained on hard disk for easier search and retrieval of archived content.
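
The idea of aggregating media behind a disk-resident index can be sketched in a few lines. This is only a toy model of the concept, not QStar's implementation or the LTFS on-tape format. The class names Cartridge and ArchiveCatalog, the barcodes and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cartridge:
    """Hypothetical stand-in for one tape and the index read from it."""
    barcode: str
    files: dict  # path stored on this cartridge -> size in bytes


@dataclass
class ArchiveCatalog:
    """Disk-resident catalog that presents many cartridges as one directory tree."""
    entries: dict = field(default_factory=dict)  # path -> (barcode, size)

    def add_cartridge(self, cartridge):
        """Merge one cartridge's index into the shared namespace."""
        for path, size in cartridge.files.items():
            if path in self.entries:
                raise ValueError(f"duplicate path in archive: {path}")
            self.entries[path] = (cartridge.barcode, size)

    def locate(self, path):
        """Return the barcode of the cartridge that holds the given path."""
        return self.entries[path][0]

    def listdir(self, prefix):
        """List archived paths under a directory prefix, across all cartridges."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self.entries if p.startswith(prefix))


catalog = ArchiveCatalog()
catalog.add_cartridge(Cartridge("TAPE01", {"/film/reel1.mxf": 100, "/film/reel2.mxf": 120}))
catalog.add_cartridge(Cartridge("TAPE02", {"/film/reel3.mxf": 90, "/promo/trailer.mov": 15}))

print(catalog.listdir("/film"))             # one directory view spanning both tapes
print(catalog.locate("/film/reel3.mxf"))    # which cartridge to mount
```

Because the catalog lives on hard disk, searching and browsing need no tape mounts; only retrieving the file itself requires loading the cartridge that locate returns.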

LTFS is essential to creating an active archive environment, supporting medium to long-term archiving. It will enable organizations to switch to new archive hardware and software manufacturers without costly migrations. Also, where media is physically removed, the Disaster Recovery (DR) copy can be accessed at the remote site using a stand-alone drive, eliminating the need to transport the media back to the primary location.

Providing an industry standard file system for tape, which can be successfully used in a library environment, has been long overdue.

Feedback from Tape Summit

We had a long but very rewarding few days last week. Many archiving companies were at NAB (National Association of Broadcasters) in Las Vegas, where nearly 93,000 visitors explored the latest and greatest in the Media and Entertainment world, a figure that was 5,000 higher than in 2010. The most influential trend in this area for an archiving company like QStar was the transition to 3D TV, which will double the capacity used in creating standard HD movies.

Immediately following NAB, many tape-related archiving companies moved to the other side of town for a new event, The Tape Summit, hosted by ExecEvent. The Tape Summit provided an opportunity for hardware and software companies to discuss recent trends in tape archiving with leading storage analysts.

The Active Archive Alliance was well represented. Members Atempo, FileTek, QStar and Spectra Logic were there to champion the message of adding tape as an active archive layer. Other tape-centric organizations were also present: BDT, Gresham, Hewlett Packard, IBM and Quantum.

From the analyst community were (in no particular order): Curtis Preston, Fred Moore, Sheila Childs, Bill Mottram, David Hill, George Crump, Deni Connor, Chris Mellor, Stephen Foskett, Jon Hickman, Pete Conley, Jerome Wendt, Patrick Corrigan, Eric Slack, Rich Castagna, Kim Borg, Mark Brownstein and Yuri Spiro.

Greg Duplessie and his team created a fun and informative two days, with speaker slots, one-on-ones and group discussion. I found the user's perspective provided by Jonathan Marianu of AT&T particularly enlightening. Jonathan outlined a number of scenarios for data protection that he had reviewed and discounted for various reasons, finishing with their adopted solution. Due to the large capacities that they are protecting, they use a combination of deduplicated disk and tape libraries, with offsite tape storage provided by an offsite media storage company. On top of the technical issues, he also gave us a view of how difficult it can be to use removable media at the levels that AT&T does. When using thousands of pieces of media weekly, even removing the shrink wrap from media becomes a significant overhead.

The largest topic of conversation was LTFS (Linear Tape File System), the new industry-standard file system for LTO-5. This is a self-describing file system that uses one partition to store the index of the cartridge's contents and a second partition for the data itself. It has been immediately accepted by the archiving software industry as a significant step forward in the continuing dominance of tape as a long-term archiving format.

However, so far, there has been little discussion from the backup software industry, even from those vendors backing LTFS. Perhaps they are all happy with their own proprietary formats (and vendor lock-in).

One significant no-show was Oracle. It would have been interesting to hear from them about their new T10000 C tape product, as well as from IBM on their 3592. As the two vendors that offer both proprietary and industry-standard (LTO) formats, they could have given a useful contrast in how they see the tape market developing from both a backup and an archive perspective.

I look forward to next year's event and hope to see a slight increase in the numbers present – although the smaller group size did lead to a more frank exchange of ideas.