
Rethinking Data Protection

Can your storage provider guarantee that your data won't be lost?

 

Lose data, lose your job. Since 93 percent of data-intensive businesses that lose their data go out of business, it's not surprising that data loss is such a big career killer. And it's only natural that we have built a massive infrastructure around preserving data at all costs: RAID arrays and tape libraries and snapshot file systems and offsite storage and online storage. So massive, in fact, that it is collapsing under its own weight.

Our storage architectures today reflect a fundamental reality of yesterday, when all data was incredibly valuable. To illustrate, when I started computing in 1973, it was on a system with 32K of memory that cost around $100,000. At these prices, which in those days were considered great bargains, companies could only afford to store massively valuable data. Today, that kind of money buys terabytes of fast memory and petabytes of disk space. And just as the acid test of writing in the physical world is that it is "worth the paper it is printed on," it makes sense in the digital world to store only data that is worth more than the cost of the disk space it is stored on plus all of the other overhead costs. These costs include systems to hold the disks, administrators to manage the systems, and support and backup infrastructure for all of them.

 

On the other hand, to fail to save any data that is more valuable than the cost of storing it is throwing away money. In short, if its value per byte is bigger than the cost per byte of storing it, you keep it. If it is less, it was never economical for you to put it on disk in the first place. The maximum value per byte of information we store, however, is unaffected by these costs. An archive that is worth a million dollars is worth a million dollars whether you put it on a flash drive, or burn it on a CD, or copy it to tape. The result of this ever-decreasing cost of disk space is that the ratio of the most valuable data that we possess to the least valuable data that we choose to store increases just as fast as the cost is decreasing. This is true even if any one specific piece of data is decreasing in value; its place is continually taken by some new piece of information that is just as valuable as the old was.
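
The keep-or-discard rule can be written down in a few lines. This is only a sketch: the function name and the dollar figures are hypothetical, chosen to illustrate the comparison, and the cost per byte is assumed to already include all of the overhead described above.

```python
def worth_storing(value_dollars: float, size_bytes: int, cost_per_byte: float) -> bool:
    """Return True when the data's value per byte exceeds the all-in cost per byte."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return value_dollars / size_bytes > cost_per_byte


# Assumed all-in storage cost: $1 per GB (10**9 bytes). Hypothetical figure.
COST_PER_BYTE = 1e-9

# A $1,000,000 archive occupying 1 GB clears the bar easily.
print(worth_storing(1_000_000, 10**9, COST_PER_BYTE))  # True

# 1 GB of logs worth 50 cents in total does not.
print(worth_storing(0.50, 10**9, COST_PER_BYTE))  # False
```

Halve the assumed cost per byte and the second example flips to "keep," which is exactly how cheaper disks pull lower-value data into storage.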

 

This central reality is a direct consequence of Moore's law, which describes the fourfold increases in transistor density that have occurred in three-year cycles over the last few decades. If our storage cost drops by a factor of four, then the gulf between lowest and highest values, expressed as a ratio, also increases by a factor of four. Our data storage architectures were designed for data a million times more valuable than our least valuable data today. And as disks get cheaper, we spend more total money, not less. Our total capacity increases by at least that factor of four. This additional threefold capacity fills up with data that we chose not to store at the fourfold higher price, suggesting that the bulk of our data generally has value at the lower threshold of our data values. So the bulk of our data, our least valuable data, is protected with an architecture designed for our most valuable data. This is massively inefficient.
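
The "million times" figure follows from compounding the fourfold, three-year cycle over roughly three decades. A short sketch of the arithmetic, in which the function name is hypothetical and the smooth exponential is an assumption (real prices fall in steps):

```python
def value_ratio_growth(years: float,
                       factor_per_cycle: float = 4.0,
                       cycle_years: float = 3.0) -> float:
    """Growth in the ratio of most valuable to least valuable stored data.

    Assumes the top value per byte stays constant while the storage cost,
    and with it the lowest value worth keeping, falls by factor_per_cycle
    every cycle_years.
    """
    return factor_per_cycle ** (years / cycle_years)


print(value_ratio_growth(3))   # 4.0
print(value_ratio_growth(30))  # 1048576.0
```

Ten three-year cycles give 4 to the 10th power, a little over a million.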

 

Either we inadequately protect items of great value, risking catastrophic loss, or we overprotect items of lesser value. This results in wasted money and a loss of valuable information that we might have been able to afford to keep but did not. 

 

In the physical world, we solved these problems a long time ago. For mountains of gold, there was the proverbial Fort Knox, literally protected by an army. For us mere mortals, there is the bank vault. If you can't afford to keep it all there, maybe you have a safe or a locked cabinet. Less valuable things go to the attic, then the garage, or perhaps a shed. Some things you don't mind just leaving in the yard. For the things we can't afford to lose--our house, our car, or our health--we buy insurance. For a fee, someone else agrees to suffer for us the financial consequences of loss and make us whole. In other words, we don't just manage these risks by installing sprinklers, driving carefully, and eating healthfully; we also transfer the risk.

 

What makes that so significant is that it provides an economic mechanism for rationalizing costs. If you could not insure your house, how much would you have to spend and how hard would you have to work to be absolutely sure that it would never burn down? If you spent too little, you might not know that until it's too late. How would you ever know that you were spending too much?

 

So why not buy insurance for data? Traditionally, both insurance that covers property loss and comprehensive general liability (CGL) insurance form the first line of defense for businesses against unexpected financial loss. However, property loss insurance for data is not available anywhere. While business continuity insurance is available and may compensate you for some downtime while you attempt to recover from a disaster, it is just a stop-gap. Some policies may pay for the cost of attempting to recover lost data, but if you are not successful, you could end up with a room full of replacement systems devoid of data.

 

At first sight, it might seem a no-brainer for an insurance company to offer insurance that covers loss for the value of data. A moment's thought, however, exposes the dangers that face an insurance company taking that step. The term used by insurance companies to describe these dangers is moral hazard. The moment an insurance company issues an insurance policy to a restaurant, for example, the probability of a fire occurring instantly rises, even if the restaurant owner is perfectly honest and would never even think of torching his business deliberately. The mere fact that you have insurance lets you relax your guard. You buy insurance so that you can sleep at night, but all the time you spend sleeping is time you are not spending being hyper-vigilant.

 

All of these same moral hazard issues apply to data with a crucial additional facet: Data is easy to copy and easy to hide. There is no way to torch the restaurant while hiding a copy of it somewhere else. For the most part, insurance companies do not regard data as tangible property, and only loss or damage to tangible property can be covered under traditional insurance policies.

 

Some companies have attempted to use CGL insurance to cover their losses when they have been sued by third parties for data loss or have been harmed by downtime. However, here again, most insurance companies do not see data loss as a physical loss or as damage to tangible property, and thus these kinds of losses are not covered under a CGL policy either. In general, the courts have upheld this interpretation, saying that computer data is not physical or tangible property because it cannot be "touched, held, or sensed by the human mind."

Some insurers are now specifically writing data loss exclusions into their CGL policies; others will offer some kinds of liability insurance for data loss for an extra premium but are careful to make many stipulations regarding all of the security measures that must be in place for a claim to be valid. But even if you can meet all of the onerous requirements of a CGL policy covering data liability, buying it to try to protect yourself against data loss is like having only liability insurance for your car; you are protected against the damage claims of others but not against the loss of and damage to your own car.

 

For the value of the data itself, companies have had no alternative but to self-insure. Recently popular additional tools for managing the risk of self-insuring include offsite tape storage and online data storage. For example, Iron Mountain, in addition to cleaning out your closets and hauling away boxes of paper for storage, will happily take your tapes too. More recently, it has been purchasing online storage businesses as well. Online storage solutions geared for business data generally range in price from about $2 per GB to $10 per GB. Symantec offers up to 10GB for just $9.99 per month, but like many cell phone plans, its rate almost doubles if you go over the purchased capacity in any given month.

 

The advantage of storing tapes offsite is the power of independence. Your building could be hit by a hurricane as could the tape warehouse, but if they are far enough apart, chances are they won't both be hit at the same time. That's how independence works. The virtue of tape is that it is relatively portable, but the disadvantages are significant. Tape is a devilishly fragile medium that tends to lose information over time and can be foiled by the normal manufacturing variations between apparently identical drives. A tape written on one drive may be unreadable on another virtually identical drive.
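
To put rough numbers on independence: if two sites fail independently, the chance of losing both is the product of the individual chances. The sketch below uses hypothetical probabilities and a hypothetical function name, and treats "not independent" as the extreme case where one event takes out both sites.

```python
def joint_loss_probability(p_primary: float, p_offsite: float,
                           independent: bool = True) -> float:
    """Probability that the primary copy and the offsite copy are both lost."""
    for p in (p_primary, p_offsite):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    if independent:
        # Independent events: multiply the individual probabilities.
        return p_primary * p_offsite
    # Simplifying assumption: perfectly correlated sites, so the same
    # disaster destroys both and the joint risk is no better than one site.
    return min(p_primary, p_offsite)


# Hypothetical 1 percent annual loss chance at each site.
print(joint_loss_probability(0.01, 0.01))                     # far apart
print(joint_loss_probability(0.01, 0.01, independent=False))  # same complex
```

With these assumed figures, separation cuts the joint risk by a factor of a hundred; putting both copies in one place gives that entire benefit back.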

 

The logistics of handling physical media create additional nightmares. Trucks get stolen, tapes get lost or mislabeled, tapes take too long to come back, and too often when they do get returned, they are not the tapes you wanted.

 

Moreover, the critical independence you were trying to buy may be an illusion. Some companies that suffered losses in the World Trade Center discovered that their offsite storage was actually located in the same building complex.

 

The advantages of online storage may include faster recovery times, greater geographical separation, and significantly greater convenience. You transfer critical files over the Internet to a remote, hopefully well-run and well-protected site. The cost of online storage has been dropping.

 

Carbonite, for example, aims for the consumer market with a flat rate for all you can back up. Consumer-oriented services achieve a low cost structure by extensive use of de-duplication technologies. There is an enormous amount of duplication among consumer PCs all running the same operating system, having the same software, and gathering the same collections of pirated music and pornography. There is no point in having a million copies of each, one for each of your million customers.
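
De-duplication of this kind is commonly done by addressing files by a hash of their content, so identical files from different customers map to one stored copy. The following is a minimal sketch of that general idea, not a description of how Carbonite or any other service actually works; the class and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self) -> None:
        self._blobs = {}  # SHA-256 digest -> file content
        self._index = {}  # (customer, filename) -> digest

    def put(self, customer: str, filename: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self._blobs.setdefault(digest, content)
        self._index[(customer, filename)] = digest
        return digest

    def get(self, customer: str, filename: str) -> bytes:
        return self._blobs[self._index[(customer, filename)]]

    @property
    def unique_blobs(self) -> int:
        return len(self._blobs)


store = DedupStore()
system_file = b"the same operating system file on every PC"
for customer in ("alice", "bob", "carol"):
    store.put(customer, "kernel.dll", system_file)

print(store.unique_blobs)  # 1
```

Three customers, three index entries, one stored copy. Scale that to a million customers and the savings explain the flat rate.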

 

The rub is that this trick won't work if files are encrypted. So your file may be encrypted while it is being uploaded, and it may be encrypted again when it is finally stored, but is it ever unencrypted once it leaves your hands? If so, that is no solution for business data.
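
Why encryption defeats the trick can be shown with a toy cipher. The XOR keystream below is an illustration only and is not secure; the keys and the function name are hypothetical. The point is simply that the same file encrypted under two customers' keys no longer looks like the same file to the storage provider.

```python
import hashlib


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR plaintext with a SHA-256-derived keystream. NOT secure; demo only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip stops at the end of the plaintext, so extra keystream is ignored.
    return bytes(p ^ s for p, s in zip(plaintext, stream))


same_file = b"identical operating system file" * 100
cipher_a = toy_encrypt(b"customer-a-key", same_file)
cipher_b = toy_encrypt(b"customer-b-key", same_file)

# Unencrypted, both customers' copies hash to one digest: one stored blob.
plain_digests = {hashlib.sha256(same_file).hexdigest() for _ in range(2)}
# Encrypted under different keys, the digests differ: two stored blobs.
cipher_digests = {hashlib.sha256(c).hexdigest() for c in (cipher_a, cipher_b)}

print(len(plain_digests), len(cipher_digests))  # 1 2
```

A provider that de-duplicates across customers therefore has to see the content, or something derived from it, in a form that is the same for everyone.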

 

Despite the deficiencies, these approaches are widely used and useful. While data losses can and do occur, your disaster recovery chances are better with these tools than without them. The key defect in employing these approaches, however, becomes visible in the light of the insurance analogy. While they transfer data, they don't transfer risk. This is highlighted by examining a typical contract for an online storage company. They are not responsible for losing your data even if it was the result of their own negligence. You could pay them to store some archive for years; if they could not return it, they might refund a month or two of the years of fees you have been paying.

 

The best approach is to transfer risk and data together. Transferring risk without transferring the data is what an insurance company would go out of business trying to offer. Transferring the data without transferring the risk is what offsite and online storage companies offer, which leaves you holding the bag when data losses do occur. In this approach, you transfer the risk by declaring the value of your data when you send it. You receive back a signed certificate guaranteeing the storage of that particular file for that particular value. Should you be unable to retrieve your file, you get a check for the value you placed on it.
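
One way to picture such a certificate is as a signed record binding a fingerprint of the file to its declared value. The sketch below is purely illustrative and is not the author's actual technology: the key, field names, and functions are hypothetical, and it uses a shared-secret HMAC for brevity where a real provider would more plausibly use a public-key signature that the customer can verify independently.

```python
import hashlib
import hmac
import json

# Hypothetical provider secret, for illustration only.
PROVIDER_KEY = b"hypothetical-provider-secret"


def _signature(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(PROVIDER_KEY, payload, hashlib.sha256).hexdigest()


def issue_certificate(file_bytes: bytes, declared_value_usd: float) -> dict:
    """Bind the file's SHA-256 fingerprint to the value declared for it."""
    body = {
        "file_sha256": hashlib.sha256(file_bytes).hexdigest(),
        "declared_value_usd": declared_value_usd,
    }
    return {**body, "signature": _signature(body)}


def verify_certificate(cert: dict) -> bool:
    """Check that neither the fingerprint nor the value has been altered."""
    body = {k: v for k, v in cert.items() if k != "signature"}
    return hmac.compare_digest(_signature(body), cert.get("signature", ""))


cert = issue_certificate(b"quarterly archive contents", 1_000_000.0)
print(verify_certificate(cert))  # True

# Changing the declared value after the fact invalidates the signature.
tampered = dict(cert, declared_value_usd=2_000_000.0)
print(verify_certificate(tampered))  # False
```

Because the value is fixed at the moment the file is handed over, neither party can revise it after a loss.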

 

You buy as much or as little protection as that particular archive merits, starting at pennies per month, with a charge based on value. This lets you choose the cost of preserving each of your files. An infinitely scalable storage architecture automatically creates a protection level commensurate with the value.

 

In addition to solving the immediate problem of letting you completely offload your data loss worries, this approach lays a new foundation upon which a much leaner, more adaptive IT architecture can be built. Information can now be preserved according to its value, and the overhead of protecting our most valuable data need no longer limit our capacity to embrace ever vaster quantities of newly affordable information.

Andrew Winkler

Dr. Andrew Winkler is the founder of Data Risk Management, Inc. and the creator of the technologies that power its solutions. For more information, go to www.DataRiskMgmt.com.
