When organizations start using big data, they often don’t treat it as a priority for disaster recovery.
Back in January, Lockwood Lyon wrote in Database Journal that there were two common reasons why disaster recovery for big data was given short shrift:
- Because big data is for analytics, it isn’t mission-critical; therefore disaster recovery plans aren’t really necessary.
- Big data is too big for backup and disaster recovery.
However, even if big data isn’t considered mission-critical today, there’s no reason why it won’t be tomorrow. Once people start taking continuous availability of big data for granted, lack of disaster recovery could cause a major outcry.
“Tell them to dump all the big data over there. We’re not sure where it’s going to go permanently.”
A Big Data Criticality Scenario
An increasing number of organizations plan to run big data analytics in dynamic, analytics-driven environments. For example, they may modify online sales tactics according to changing consumer behavior, or perhaps run a major transit system based on minute-to-minute changes in the system’s status. Longer term, organizational strategies may become more dependent on analytics for business outcomes. What happens then, when a disaster hits big data?
What Works for Small Data May Not Work for Big Data
You can’t simply scale up disaster recovery from a small data environment to a big data environment. Best practices for conventional data aren’t necessarily best practices for big data. For example, while regular backups of application data may be sufficient in a traditional data environment, they may not be the best idea for big data. Before creating a disaster response plan for a big data environment, Lyon says you have to consider two main things:
- How long can the data remain unavailable without major wailing and gnashing of teeth?
- To what point must data be recovered, whether a specific date and time or to the most recently completed transaction?
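These two questions correspond to what disaster recovery planners usually call the recovery time objective (RTO) and the recovery point objective (RPO). As a rough illustration, the sketch below checks one dataset against both limits. The class name, field names, dataset name, and all figures are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTarget:
    """Hypothetical recovery requirements for one dataset."""
    name: str
    max_downtime: timedelta   # how long the data may stay unavailable
    max_data_loss: timedelta  # how far back the recovery point may be


def meets_target(target, last_good_copy, failure_time, estimated_restore):
    """Return (downtime_ok, data_loss_ok) for one dataset."""
    # Data written between the last good copy and the failure is lost.
    data_loss = failure_time - last_good_copy
    return (
        estimated_restore <= target.max_downtime,
        data_loss <= target.max_data_loss,
    )


# Illustrative numbers only (assumptions, not from any real system).
target = RecoveryTarget("clickstream", timedelta(hours=4), timedelta(hours=1))
failure = datetime(2024, 1, 15, 12, 0)
last_copy = datetime(2024, 1, 15, 9, 0)

result = meets_target(target, last_copy, failure, timedelta(hours=2))
print(result)  # (True, False): restore is fast enough, but 3 hours of data are lost
```

In this example the restore finishes inside the allowed downtime, but the last good copy is too old, so the backup schedule rather than the restore speed would need to change.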
Disaster recovery isn’t one size fits all.
Determining a Recovery Point for Big Data
With “normal” data, the recovery point is usually defined as the point closest to when the interruption occurred. That is partly true for big data as well, but with big data you also have to decide what form the data must be recovered in. Are you OK with recovering raw big data, or do you need big data that’s been refined into a useful format? Most organizations would prefer the latter.
Which Data Should Be Prioritized for Recovery
Today, targeting all of big data for quick disaster recovery is impractical because of its sheer size. Therefore, organizations need to reach a consensus on which data should be targeted first in a disaster recovery effort. Some organizations have the added constraint of having to meet regulatory requirements, such as HIPAA. Some industries require organizations to keep their big data for a specified amount of time, and many organizations choose to store big data long term so they can develop long-term trend analytics. Regulatory requirements and the organizational need for long-term retention of big data will heavily inform your big data disaster recovery plans.
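One way to make that consensus concrete is to rank datasets and see how many fit into the recovery window. The sketch below is a simple greedy ordering, not a recommended method: it assumes regulated data is restored first, then higher-priority data, and it skips anything that no longer fits the restore budget. All dataset names, sizes, priorities, and the restore rate are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    size_tb: float
    priority: int            # 1 = most important (hypothetical scale)
    regulated: bool = False  # assumed flag for data under a regulatory rule


def plan_recovery(datasets, restore_tb_per_hour, window_hours):
    """Return dataset names to restore, in order, within the time window."""
    budget_tb = restore_tb_per_hour * window_hours
    # Assumed ordering: regulated data first, then by priority, then smallest first.
    ordered = sorted(
        datasets, key=lambda d: (not d.regulated, d.priority, d.size_tb)
    )
    plan, used_tb = [], 0.0
    for d in ordered:
        if used_tb + d.size_tb <= budget_tb:
            plan.append(d.name)
            used_tb += d.size_tb
    return plan


# Illustrative inventory (all values are assumptions).
datasets = [
    Dataset("raw_clickstream", 400.0, 3),
    Dataset("patient_records", 20.0, 2, regulated=True),
    Dataset("sales_aggregates", 5.0, 1),
    Dataset("sensor_archive", 120.0, 2),
]

plan = plan_recovery(datasets, restore_tb_per_hour=10.0, window_hours=15)
print(plan)  # ['patient_records', 'sales_aggregates', 'sensor_archive']
```

With a 150 TB budget, the large raw clickstream data is left for a later, slower recovery pass, which mirrors the idea that not all big data can be targeted for quick recovery.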
Right now, big data is not seen as mission-critical by many organizations, and that can lead them to think that disaster recovery for big data doesn’t matter. But you can bet that the minute a big data collection is taken out by a flood, fire, or other disaster, someone is going to be upset, especially if that big data contained information used to help the organization cut costs, reduce time-to-market, or gain other competitive advantages. In other words, you really can’t ignore big data in your organization’s disaster recovery plan.
IT service management software like Samanage can be an important part of your company’s disaster recovery plan. Because it’s a cloud solution, with a unified interface for both IT service desk and IT asset management functions, it can be up and running quickly after a disaster. Furthermore, its powerful IT asset management features can be indispensable when you’re trying to account for hardware and software after disaster strikes. Samanage gives your organization one less thing to worry about when disaster hits.