Blogs

Start the Clock: How to Get an Active Archive in Less than an Hour

By David Cerf

More than 50 years ago, IBM and General Electric were grappling with computing labs’ need to process more than one computer function at a time. The pioneers of virtual machines needed a way to share mainframe resources among disparate users. Today, you can hardly glance down a list of IT news without seeing “virtual machine” or “virtualization.” Businesses figured out that it was much easier to share resources across functions than to maintain a separate physical system for every process.

Why a VM-enabled Active Archive?

Scaling storage typically means adding another server to your rack, filling it up and repeating the cycle. For businesses without the luxury of real estate, another rack might not be possible – you can only stack so high.

With a VM archive, you can use a few GBs of space on an existing server plus a tape library or even the cloud to build a scalable active archive. This can be especially vital for businesses with multiple sites or multi-tenant architectures. The VM archive should be able to manage both file and object-based storage to support massively scalable workloads.

Plus, we like that instant gratification. Download, install and start writing data in minutes, not days. You aren’t stuck waiting for a box to ship to you, and there’s no additional hardware maintenance needed.  

Upgrade, but don’t Overhaul

If you’ve ever done any remodeling, you’ll know that it’s a lot less expensive to repaint the rooms and replace appliances rather than tear down the whole house and start from scratch. The same is true for deploying a VM active archive. Adding a virtual machine means you don’t have to mount new hardware or send existing servers to their deathbeds. Instead, re-use and repurpose to save money and simplify getting started with your archive project.

A VM-enabled active archive allows you to start small and expand the archive as you need. This way, an archiving project is easier to get started with and tackle as data continues to grow.

Let’s look at a few ways companies are leveraging VM-enabled archiving today:

  • Got tape libraries? Add a VM archive to an existing library and instantly create shared storage for nearline and archive.
  • No local tape? No problem. A VM archive can keep active data locally and automatically create copies for off-site storage or the cloud for disaster recovery and data preservation.
  • Looking to simplify multi-site storage management? A VM archive can provide on-site storage that automatically replicates to a second data center, the cloud, and multi-tenant architectures.

Our friend, Jon Toigo, has been exploring this concept in his latest video installment with Barry M. Ferrite, AI. Check out some of his videos here for more information on VM archiving.

Top 7 Data Storage and Active Archive Predictions for 2016

Active archives have become a best practice for organizations that need to store and access large volumes of data. Stemming from recent technology advancements in the storage industry that have led to a variety of approaches to creating and implementing an active archive, more organizations are now benefitting from reliable access to all of their data all of the time.

Members of the Active Archive Alliance recently shared their predictions for data storage as it relates to active archives in 2016.  Here is a list of the Top 7 active archive trends to watch:

The Move to Hybrid Cloud

Organizations are increasingly integrating the private cloud within their computing architectures to form stable and efficient hybrid cloud systems, which helps mitigate the limited capabilities and risks of public clouds. Cloud data centers are targeted and intermittently attacked by malware and hackers, making them among the most threatened of data centers. With the hybrid cloud, organizations can effortlessly adjust their public cloud resources to accommodate changes while maintaining sensitive information within their private infrastructures.

Increased Prevalence of Active Archives

The declining prices of hard disk drives and tape systems are making storage options more affordable and will allow more organizations to implement active archives. This coincides with a desire for better business insight through analytics that use larger data sets and modeling to accelerate time to results. Data is becoming more valuable as the use of analytics increases and businesses want to use historical data to make better decisions. Active archiving with unified data across multiple tiers of storage gives businesses faster access to data, better insights and the ability to make more informed decisions.

Data Centers Become More Empowered

Data centers will become competitive forces used to drive business advantage because information is a major asset in today’s global economy. Organizations will see the huge benefit that comes from monetizing archived data by keeping most stored data active and retaining it for longer cycles, which is the basic premise and benefit of active archiving and scale-out storage. The Media & Entertainment industry and Life Sciences & Genomics are among the use cases where historical information can suddenly turn into a hot business asset.

Seamless Storage Management Regardless of Technology

New software abstracts flash, disk, tape and cloud into simple, easy-to-use storage that works within current user behavior (no special or custom integration required). This intelligence will blur the line between performance storage and cost-effective capacity storage, and it will make tape and cloud as easy to use as the current C: drive. It will also reduce or even eliminate backup for fixed content. Archive copies will become the new standard, and intelligent storage management will automatically provide data protection (number of copies, self-healing functions) for both local and off-site copies. It will also have the intelligence to know when to store data in flash for performance and when to store it on tape or in the cloud for resiliency and cost-effective capacity.

An End to Vendor Lock-in

There will be a move away from proprietary solutions that create "vendor lock-in" by tying user data to a single vendor, i.e., siloed solutions or proprietary software and hardware. As we keep data longer or even forever, we will need solutions that are flexible and vendor neutral, that support completely open formats and that ensure data remains accessible now and in the future.

Expanded Role for Advanced Data Tape in Active Archive Environments

With organizations seeking to keep access to all of their data and content indefinitely, new innovations in tape technology will make this possible and affordable in an active archive environment. Increased capacity from LTO-7, now at 15.0 TB compressed, will help reduce TCO and boost performance with a transfer speed of up to 750 MB per second (compressed). The newly extended LTO roadmap to generation 10 will allow organizations to leverage investments already made in LTO systems and continue to migrate archived content well into the future. Finally, tape’s role will continue to grow as a seamless part of the storage infrastructure and the cloud as LTFS makes it easier to use as a file and object storage solution.
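To put those two LTO-7 figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 750 MB per second rate applies to compressed data in the same way the 15.0 TB capacity does, that the drive sustains that rate for the whole cartridge, and that the units are decimal (1 TB = 1,000,000 MB). The function name is ours, purely for illustration.

```python
def fill_time_hours(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to stream one full cartridge at a constant rate.

    Assumes decimal units (1 TB = 1,000,000 MB) and a sustained transfer
    rate, which real workloads will not always reach.
    """
    capacity_mb = capacity_tb * 1_000_000
    seconds = capacity_mb / rate_mb_per_s
    return seconds / 3600


# Figures quoted in the post: 15.0 TB compressed, 750 MB per second.
hours = fill_time_hours(15.0, 750)
print(f"About {hours:.1f} hours to fill one cartridge")
```

Under those assumptions, a full cartridge takes 20,000 seconds, or a little over five and a half hours, to write end to end.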

Increased Use of Object Storage

Companies will increasingly expand their active archives using object-based storage over file or block storage. Object storage systems allow relatively inexpensive, scalable and durable retention of massive amounts of unstructured data. With object storage, there is no file system hierarchy. The architecture of the platform allows the data pool to scale virtually to an unlimited size, while keeping the system simple to manage. The efficiency of object storage makes massively scalable active data archives affordable.

In addition, more organizations will adopt S3 as the de facto standard cloud storage interface for managing large amounts of data in active archives. As organizations increasingly need to simplify and accelerate storage and retrieval for large amounts of data, extracting knowledge and information from historical deep archives as quickly and painlessly as possible, they will need a cost-effective and widely compatible object storage interface.

Active archive technologies are continuing to evolve and provide significant advantages to organizations desiring online data archives and the ability to quickly capitalize on the value inherent in their stored data – and with the trends we are seeing today, 2016 should be another stellar year for active archiving.

The following Active Archive Alliance members contributed to this list: Crossroads Systems, Inc., DataDirect Networks Inc., HGST, Fujifilm Recording Media, Inc. and Spectra Logic Corp.

Should You Be Archiving? Here’s How to Know for Sure

By David Cerf

Active archiving is centered on the convergence of various storage technologies to create a balanced, accessible and affordable method for storing data long term. But what if you don’t know how much of your data belongs in this “archive” category? The truth is that the seemingly easiest method (use what you have, fill it up and buy more) gets unnecessarily expensive. This is especially true if you’re using a high-performance storage array for data that doesn’t need those performance capabilities.

While each company may define archive differently, here are a few common criteria for “archival” data:

  • Data is not accessed regularly
  • Data was created more than 1 year ago and has not been accessed in more than 90 days
  • Data has not been modified or accessed in more than 90 days
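As a minimal sketch, the two dated criteria above could be expressed as a simple check. The `FileInfo` record and `is_archival` function are hypothetical names for illustration, the 365-day reading of "1 year" is our assumption, and a real assessment tool would gather the timestamps from the file system itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    """Hypothetical record holding the timestamps the criteria refer to."""
    path: str
    created: datetime
    modified: datetime
    accessed: datetime


def is_archival(f: FileInfo, now: datetime) -> bool:
    """Return True if the file meets either of the dated criteria."""
    idle_cutoff = now - timedelta(days=90)
    age_cutoff = now - timedelta(days=365)  # "1 year" taken as 365 days
    # Created more than 1 year ago and not accessed in more than 90 days.
    old_and_idle = f.created < age_cutoff and f.accessed < idle_cutoff
    # Neither modified nor accessed in more than 90 days.
    untouched = f.modified < idle_cutoff and f.accessed < idle_cutoff
    return old_and_idle or untouched


now = datetime(2016, 1, 1)
report = FileInfo("q1_report.pdf", datetime(2014, 1, 1),
                  datetime(2014, 2, 1), datetime(2015, 6, 1))
draft = FileInfo("draft.docx", datetime(2015, 12, 1),
                 datetime(2015, 12, 20), datetime(2015, 12, 30))
print(is_archival(report, now), is_archival(draft, now))
```

The thresholds are parameters of your own policy; each company may define archive differently, so treat 90 and 365 as starting points.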

Sometimes companies simply don’t know how much data they have that should be archived. Fortunately, there are free tools like the storage assessment at www.freemystorage.com that can help you answer this question by showing dynamic reports and transparent views into the state of your storage environments.

With detailed results on the active state of storage and the file make-up of active/inactive data, assessment tools can help organizations understand how to free up their storage and avoid unnecessary upgrades and over-provisioning. They can help organizations save thousands of dollars each year and reduce operating and capital expenses.

Why should end users think differently about their storage?

The Active Archive Alliance thinks there is a better option to meet the growing demand for storage. Data is growing faster than budgets, driving a need for more cost-effective storage and protection. While high-performance storage keeps line-of-business data and applications quickly available, up to 80% of data on primary storage is often inactive and doesn’t need to claim expensive storage capacity. Storage needs to be intelligently balanced between active and inactive data, on-site and off, and must meet both performance requirements and budgets. Active archiving can deliver intelligent storage management that combines tuneable, user-defined policies for capacity optimization (no more over-provisioning) and will:

  • Simplify nearline and archival storage management
  • Free up storage capacity 
  • Reduce backup
  • Simplify data protection
  • Protect long-term content
  • Improve storage costs by over 50%

Discover the benefits of reduced primary storage cost with automated data movement into an active archive solution that best fits your needs. Be sure to choose one that can transparently move files to your archive from primary storage like NetApp, Windows, Isilon and any CIFS/NFS or object storage.

Get started at FreeMyStorage.com for a free assessment to help you take control of your storage. 

The Impact of Object Storage on Active Archiving and on Tape Usage

By Rich Gadomski

Object storage has made great strides in the decade since EMC released its Centera system. Since then, numerous other vendors have brought their own object storage systems to the market. Object storage is great for storing unstructured data, since it separates the metadata from the data so the storage system isn’t dependent upon the particular file system or block storage structure. Additionally, administrators do not have to worry about matters such as setting RAID levels or building and managing logical volumes.  Lastly, from an integration perspective, object storage is a good platform for archiving because it is massively scalable, cost effective, and is able to act as a cloud infrastructure for collaboration.

However, there was one major problem in using object storage for archiving – at least until recently.

“You can't do it -- get object storage taped, I mean,” wrote Chris Mellor, storage editor at The Register in March 2012. “There is no way to get the contents of an object storage system onto tape. Instead, it has to stay on spinning disk forever.”

And since it had to stay on spinning disk, this meant continually buying more storage arrays, as well as laying out all the support, networking, licensing, power and cooling needed to keep those disks spinning.

“As the amount of data to be stored grows and grows, tape will become the lowest-cost option,” wrote Mellor. “For high-volume data archive capacities, disk economics suck, and it’s no use pretending data deduplication and thin provisioning can change that. … What is needed is a way to drain off cold, inactive objects from disk and stuff them into a tape archive. Isn't it obvious?”

Well, three years is a long time in IT, and apparently, tape storage vendors did think it was obvious that they should support object storage. A year after Mellor wrote his plea, various storage vendors, including many members of the Active Archive Alliance, began releasing tape systems and tools that make object storage feasible in a tape environment.

A good example of this is Fujifilm’s Dternity NAS, which allows for both file and object storage on LTFS tape media in its active archive solution. By utilizing a standard S3 interface with an underlying RESTful API, cloud storage users can connect to Dternity directly without needing to program special calls or APIs. Active archives managed by Dternity NAS are easily accessible by CIFS/NFS or S3.

The bottom line is that tape is a viable place for object storage. This opens the door to massively scalable object stores comprising billions of graphical images, for example. Not only is it possible to achieve this, but doing it on tape, which recently demonstrated 220 TB on Barium Ferrite media, means that it can now happen in a cost-effective manner.

Activating an Active Archive with MAM

By David Miller

The flood of information into modern organizations will not ebb any time soon. Connected workers and connected devices link staff and customers 24/7. Smart phones and tablets make producing, sharing, and consuming rich content easier than ever. Professional camera formats have increased in resolution from SD to HD to 4K and beyond. 

In the absence of a better method, networks, storage, and processors will be tied up transferring and transforming this content from office to office, country to country, between organizations and their audiences. The obvious challenge for any organization trying to stay afloat is to optimize storage and infrastructure to control costs.

The end game for cost control is to have content with the fewest number of copies stored in the repository, along with the greatest long term security and lowest ongoing operating costs:  an LTO active archive system. Depending on the economies of scale for a centralized archive, it may be cheaper to have localized archive pools near the locations where the content is generated and commonly used.  

The challenge? Linking the archives together and providing universal access.

The obvious challenges to maximizing the use of archived storage are the ease of use of the localized desktop system, the demands that non-linear video editing systems and their users place on rapid, low-latency access to large video files, and the difficulty, or plain inconvenience, of moving current content to the archive. Localized primary storage is the default for the infrastructure and mindset of most users, so any active archive must offer intuitive access to the content in order to be successful. Proprietary storage systems for collaborative video editing are very expensive, but editors and IT staff are loath to archive content for fear of taking media for other current productions offline. Finally, time-challenged staff are unlikely to move content from primary storage to archive without easy tools and systems to do so.

Enter the management system.  

Storing proxy copies with rapid and universal access from a simple interface (à la YouTube) with simple archive and restore functionalities allows organizations to maximize the benefit of their active archive.

  1. A desktop browser interface and/or tablet interface allows users immediate access with minimal IT support. They can maximize productivity if they can search and re-task all of the content in the centralized archive.
  2. If the system can track all of the media in editing projects, then all of the elements in finished products can be archived while those elements still in use in other productions can be maintained in the editing storage.
  3. Simple archive and restore tools for individual digital assets or entire groups will allow users to quickly and efficiently archive content. Automated processes can move long unused content into the archive while maintaining access for preview, collaboration and restore functionalities if needed.
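The second point, tracking which media elements belong to which editing projects, comes down to a set difference. The sketch below is only an illustration under assumed names (`archivable_media`, the project and file names); a real management system would pull this mapping from its own project database.

```python
def archivable_media(projects: dict[str, set[str]], finished: set[str]) -> set[str]:
    """Media used only by finished projects.

    Anything still referenced by a project that is not finished stays in
    the editing storage.
    """
    in_finished: set[str] = set()
    in_active: set[str] = set()
    for name, media in projects.items():
        if name in finished:
            in_finished |= media
        else:
            in_active |= media
    return in_finished - in_active


# Hypothetical projects: "promo" is finished, "documentary" is still in edit.
projects = {
    "promo": {"intro.mov", "interview.mov"},
    "documentary": {"interview.mov", "b_roll.mov"},
}
print(archivable_media(projects, finished={"promo"}))
```

Here only intro.mov is eligible for the archive, because interview.mov is still in use by the documentary.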

 

Further advantages accrue if federated search and browse allow access to and control over archived content in multiple locations. Content can be archived near where it is produced or where it is most likely to be needed in the future. This saves the time and money otherwise spent moving content to the centralized archive and then moving it back to a different location when needed.

Nimble access to content is the key to enjoying the benefits of an active archive system.

How to Avoid 6 Costly Cloud Storage Snafus with a High-Powered Active Archive

By David Cerf

Storage requirements are growing at 40% year over year, and that’s a top concern for 1 in 4 CTOs. It's nearly impossible to talk about modernizing IT without mentioning the cloud. Among users, cloud storage solutions are as ubiquitous as iPhones, but for enterprise IT, the cloud presents unique challenges. Relying on pay-for-play cloud services can send costs skyrocketing. Sure, the cost for cloud storage is cheap, but the service fees pile on as the need to access data arises. Shouldn’t your data remain just that – your data?

Instead of letting cloud services evaporate your IT budget, here are six ways to avoid the risk, high costs and complexity of typical cloud storage by deploying an active archive instead.

  1. Know the true costs. Not all storage is alike. Cloud storage includes the cost per GB as well as transfer fees, access fees and other fees to get your data back. Hybrid active archive solutions can slash costs by 70%. With no hidden charges, it's easy to calculate costs and meet budgets today and in the future.
  2. Choose non-proprietary solutions. Make sure that you can get your data securely whenever you need it. Technology continues to evolve as do your business needs. Pick a vendor that does not lock up your data with proprietary systems. Active archive solutions that use LTFS tape can give you peace of mind in knowing that your data is always readable and recoverable.
  3. Ensure your data is safe and secure. Don't be the next company making headlines for the wrong reasons. By maintaining a clear chain-of-custody, you'll bypass painfully long retrieval times associated with bandwidth limits. Hybrid active archive solutions can provide storage onsite and offsite to seamlessly work with your current applications and processes.
  4. Read the fine print. Check your vendor's SLA and make sure you understand how your data is protected, secure and accessible.
  5. Simple is always better – Occam's razor. Your solution needs the intelligence to ensure that your data never changes or goes missing. An easy-to-use web interface for accessing data is also key.
  6. Leverage what you already own. Gateway technologies in combination with your existing storage can deliver rapid ROI. Look for vendors that can deliver online, nearline and archive in a seamless solution.
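To make the first point concrete, here is a small sketch of how the pieces of a cloud bill add up. The function name and every rate in the example are made-up placeholders, not any provider's actual pricing; plug in the numbers from your own vendor's price list.

```python
def monthly_cloud_cost(stored_gb: float, retrieved_gb: float, requests: int,
                       price_per_gb: float, egress_per_gb: float,
                       price_per_1k_requests: float) -> float:
    """Sum the per-GB storage charge with transfer and access fees."""
    storage = stored_gb * price_per_gb                  # cost per GB stored
    egress = retrieved_gb * egress_per_gb               # fees to get data back
    access = requests / 1000 * price_per_1k_requests    # per-request access fees
    return storage + egress + access


# Placeholder rates for illustration only.
total = monthly_cloud_cost(stored_gb=100_000, retrieved_gb=10_000,
                           requests=1_000_000, price_per_gb=0.01,
                           egress_per_gb=0.09, price_per_1k_requests=0.01)
print(f"Estimated monthly bill: ${total:,.2f}")
```

With these placeholder rates, retrieving just a tenth of the stored data nearly doubles the bill compared with the storage charge alone, which is why the per-GB price tells only part of the story.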

Many active archive solutions have incorporated S3 interfaces, allowing these solutions to serve as a simple target for offloading data from a cloud service such as Amazon Simple Storage Service (S3). Because many of these active archive solutions employ hybrid storage architectures with technologies like LTFS tape, they can be intrinsically protected. Instead of offloading to disk alone, users can leverage a system that will provide automatic data protection at the most economical cost per gigabyte available.

Before you consider “inexpensive” cloud storage for your data preservation, be sure to check out the offerings from active archive solutions. Combining new technology with familiar storage methods can be even more effective than simply scuttling data storage off to a cloud service provider where you might lose access, risk security, or pay more than you should.

The Decision Tree for Archiving Data

by Dave Thomson

Many users understand they have a need to archive data for compliance reasons, to improve data preservation or to reduce storage costs. Beyond these base user requirements, you have your own specific environment that an archive must support. For example:

  • Total capacity
  • New capacity per day
  • Smallest/average/largest file size
  • Average file age and last retrieval dates
  • Estimated retrievals per day (and type of retrievals – single file or file sets)
  • Existing archived data (technology and formats used)
  • Redundancy requirements
  • Plan for archive migration

Follow our decision tree for data archiving

In all circumstances, we follow a decision tree that provides you with the best and most economical solution for your individual circumstances. A variety of data storage technologies are available including tape, disk, object storage and cloud storage.

This blog was excerpted from Around the Storage Block. To read the article in its entirety, click here.

Back to Basics

David Thomson – SVP Sales and Marketing – QStar Technologies

Five years ago, a group of companies came together to form the Active Archive Alliance. That group agreed that the term “archive” was regularly being misused, often to represent retaining backups for long periods of time. We saw archive in a different way: as a process separate from backup that secures unchanging data while keeping it available to the user or application that created it.

Today it seems that although many organizations understand this message, many more do not. I am still perplexed when IT staff fail to understand the significant advantages of using active archive technology. This inspired me to write this blog and to restate the benefits of using active archive, and what it means in 2015.

How much data within an organization is static or unchanging? For many organizations, it is a significant percentage, and there are simple, sometimes free, tools to help users understand how much data we are talking about. 

We do not archive changing or evolving data; this data is secured using RAID, replication, snapshots and backup, all of which are expensive and possibly time-consuming tasks. Most of the time involved is dedicated to ensuring that the processes are working correctly and that, if something fails, data is recoverable and not lost.

Archiving is about securing unchanging data in a different way. As data is ingested into the archive, it is written to multiple places or media. Should one site or medium fail, there should always be a second, and sometimes a third, place from which to access the data. This could be an automatic switch to a second repository or require manual intervention; this choice is left to the organization based on its budget and minimum response times.
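As a rough illustration of the idea, and not a description of any particular product, the sketch below writes each ingested file to several repositories and, on retrieval, switches automatically to the next copy if one is missing or damaged. The function names are hypothetical, the repositories are plain directories standing in for sites or media, and the SHA-256 check is our own assumption about how an archive might detect a bad copy.

```python
import hashlib
from pathlib import Path


def ingest(data: bytes, name: str, repositories: list[Path]) -> str:
    """Write the same bytes to every repository and return their SHA-256 digest."""
    digest = hashlib.sha256(data).hexdigest()
    for repo in repositories:
        repo.mkdir(parents=True, exist_ok=True)
        (repo / name).write_bytes(data)
    return digest


def retrieve(name: str, digest: str, repositories: list[Path]) -> bytes:
    """Return the first copy that can be read and matches the ingest digest."""
    for repo in repositories:
        try:
            data = (repo / name).read_bytes()
        except OSError:
            continue  # this site or medium is unavailable; try the next one
        if hashlib.sha256(data).hexdigest() == digest:
            return data
    raise FileNotFoundError(f"no intact copy of {name!r} in any repository")
```

Organizations that prefer manual intervention over an automatic switch would replace the loop in `retrieve` with an alert to an operator.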

By relocating significant amounts of data that is unchanging into an archive environment, primary data sets of constantly changing information can be more easily and cost effectively protected. Backup windows are reduced, the replicated capacities are reduced and the frequency of snapshots could be increased.

Active archives can be as fast or slow, as expensive or low-cost as an organization needs.  You are not forced to use tape libraries, although many organizations do, due to their low total cost of ownership. Many active archives use SSD, disk, optical and/or cloud to store and secure data. It all depends on the individual requirements of the organization and the static data they are archiving.

If architected correctly, active archives can benefit the entire organization by categorizing data and protecting it using the most economical methods for that data type.

How to Implement an Active Archive for HPC Research

By Eric Polet

The world of high performance computing requires ever-present data accessibility, along with scalable capacity. The data management required for computational and data-intensive methods in a high-performance infrastructure can create unique challenges for those tasked with maintaining data and ensuring its long-term veracity and availability. Research and high performance computing (HPC) sites face the challenge of retaining the ever-growing amount of data being generated by employees and computers. The data’s value expands far beyond what can be gained from it today. For that data to stay useful, it needs to be kept for decades so it can be reexamined when future advancements are achieved.

HPC requires active archive solutions that are, among many things, reliable, scalable, cost-effective, and energy efficient. As data volumes grow, it’s imperative that new solutions can sustain the organization’s anticipated data growth while seamlessly replacing other legacy equipment. The National Computational Infrastructure (NCI), home of the Southern Hemisphere’s fastest supercomputer and Australia’s highest performance research cloud, was facing this data growth problem. NCI’s supercomputer supports more than 5PB of data that must be backed up and archived. NCI was faced with significant forecasted growth and wanted to implement an updated, single archive solution. This goal was achieved with an active archive solution created by Spectra Logic and SGI.

How did they do it?

NCI selected an active archive approach to manage its data, which has proven to operate quite flawlessly. Active archive solutions turn offline archives into visible, accessible extensions of online storage systems, enabling fast and easy access to archived data. “The incorporation of an active archive solution provides a platform for storage growth,” said NCI associate director Allan Williams. “It allows us to keep our primary data online and accessible to users, while also increasing the reliability of our stored data across physical sites.” The organization is able to easily scale its storage solution as its data continues to expand due to NCI’s depth of engagement with research communities and organizations. Some of the key features gained by the implementation of NCI’s active archive solution are:

• Extreme scalability
• Intelligent data management
• High data reliability
• Portable data storage solution
• Low cost per terabyte
• Reduction in energy costs and space
• Performance and uptime

HPC organizations that need a scalable storage solution are faced with a number of difficult decisions on how to store and archive their data. Important factors to consider when selecting an archive solution include scalability, data reliability, and affordability. Active archive’s intelligent data management framework provides organizations file-level access to data at a significantly reduced cost. When NCI introduced its active archive solution, it gained a dense, high-capacity storage solution for its cloud installation with significant economies of scale and data integrity safeguards. By selecting an active archive solution, NCI has created a long-lasting and reliable storage solution for the country’s largest supercomputer.