From Surf Wiki (app.surf) — the open knowledge base

Continuous data protection


Continuous data protection (CDP), also called continuous backup or real-time backup, is the backup of computer data by automatically saving a copy of every change made to that data, essentially capturing every version of the data that the user saves. In its true form, it allows the user or administrator to restore data to any point in time.

In an ideal case of continuous data protection, the recovery point objective ("the maximum targeted period in which data (transactions) might be lost from an IT service due to a major incident") is zero, even though the recovery time objective ("the targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity") is not zero.

CDP runs as a service that captures changes to data and writes them to a separate storage location. There are multiple methods for capturing continuous live data changes, involving different technologies that serve different needs. True CDP-based solutions can provide fine granularities of restorable objects, ranging from crash-consistent images to logical objects such as files, mailboxes, messages, and database files and logs. This is not necessarily true of near-CDP solutions.

Differences from traditional backup

True continuous data protection is different from traditional backup in that it is not necessary to specify the point in time to recover from until ready to restore. Traditional backups only restore data from the time the backup was made. True continuous data protection, in contrast to "snapshots", has no backup schedules. When data is written to disk, it is also asynchronously written to a second location, either another computer over the network or an appliance. This introduces some overhead to disk-write operations but eliminates the need for scheduled backups.
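The idea of copying every write to a second location, and choosing the restore point only at restore time, can be illustrated with a minimal sketch. The `WriteJournal` class, its method names, and the in-memory storage are assumptions made for this example; they do not describe any particular product, and a real CDP service would capture writes below the application and persist them elsewhere.

```python
from bisect import bisect_right


class WriteJournal:
    """Hypothetical in-memory stand-in for the second storage location."""

    def __init__(self):
        self._times = []    # write timestamps, in ascending order
        self._writes = []   # (offset, data) for each journaled write

    def record(self, timestamp, offset, data):
        # Called once for every write made to the primary copy.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be journaled in time order")
        self._times.append(timestamp)
        self._writes.append((offset, bytes(data)))

    def restore(self, base, point_in_time):
        # Replay every write made at or before the requested moment
        # onto the base image (the state before the first journaled write).
        image = bytearray(base)
        upto = bisect_right(self._times, point_in_time)
        for offset, data in self._writes[:upto]:
            end = offset + len(data)
            if end > len(image):
                image.extend(b"\x00" * (end - len(image)))
            image[offset:end] = data
        return bytes(image)


journal = WriteJournal()
base_image = b"AAAAAAAA"
journal.record(1.0, 0, b"BB")
journal.record(2.5, 4, b"CC")

before_any_write = journal.restore(base_image, 0.5)
between_writes = journal.restore(base_image, 1.7)
after_all_writes = journal.restore(base_image, 3.0)
```

No schedule appears anywhere in the sketch: the restore time is an arbitrary argument, and any moment between two writes yields the state left by the earlier one.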

Because it allows data to be restored to any point in time, "CDP is the gold standard: the most comprehensive and advanced data protection. But 'near CDP' technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots [see the "near-CDP" clarification in the section below] can provide a reasonable near-CDP-level of protection for file shares, letting users directly access data on the file share at regular intervals, say, every half hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need." Near-CDP is essentially incremental backup initiated, separately for each source machine, by a timer instead of a script.

Continuous vs near continuous

In true CDP, "backup write operations are executed at the level of the basic input/output system (BIOS) of the microcomputer in such a manner that normal use of the computer is unaffected", or by equivalent means, which rules it out for ordinary personal backup applications. It is therefore discussed in the "Enterprise client-server backup" article, rather than in the "Backup" article.

Some solutions marketed as continuous data protection may only allow restores at fixed intervals, such as 15 minutes, one hour, or 24 hours, because they automatically take incremental backups at those intervals. Such "near-CDP" (short for near-continuous data protection) schemes are not universally recognized as true continuous data protection, as they do not provide the ability to restore to any point in time. When the interval is shorter than one hour, "near-CDP" solutions (for example, Arq Backup) are typically based on periodic "snapshots"; "to avoid downtime, high-availability systems may instead perform the backup on ... a read-only copy of the data set frozen at a point in time, and allow applications to continue writing to their data".
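The restriction to fixed intervals can be made concrete with a small calculation. The sketch below assumes snapshots taken at exact multiples of the interval, with times given as whole seconds; the function names are hypothetical and chosen only for this example.

```python
def latest_snapshot_before(target_s, interval_s, first_snapshot_s=0):
    """Time of the newest snapshot taken at or before target_s (all in seconds)."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    if target_s < first_snapshot_s:
        raise ValueError("no snapshot exists at or before the target time")
    elapsed = target_s - first_snapshot_s
    return first_snapshot_s + (elapsed // interval_s) * interval_s


def unrecoverable_seconds(target_s, interval_s, first_snapshot_s=0):
    """How much recent history is lost when restoring as close as possible to target_s."""
    return target_s - latest_snapshot_before(target_s, interval_s, first_snapshot_s)


# Snapshots every 15 minutes (900 s); a failure occurs 2000 s after the first snapshot.
restore_point = latest_snapshot_before(2000, 900)
lost = unrecoverable_seconds(2000, 900)
```

Under these assumptions the nearest restore point is the snapshot at 1800 s, so the last 200 s of changes cannot be recovered. The loss is always less than one interval, whereas a scheme that journals every write would have no such gap.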

There is debate in the industry as to whether the granularity of backup must be "every write" to be CDP, or whether a "near-CDP" solution that captures the data every few minutes is good enough. The latter is sometimes called near continuous backup. The debate hinges on the use of the term continuous: whether only the backup process must be continuously automatically scheduled, which is often sufficient to achieve the benefits cited above, or whether the ability to restore from the backup also must be continuous. The Storage Networking Industry Association (SNIA) uses the "every write" definition.

There is a briefer sub-sub-section in the "Backup" article about this, now renamed to "Near-CDP" to avoid confusion.

Differences from RAID, replication or mirroring

Continuous data protection differs from RAID, replication, or mirroring in that these technologies only protect one copy of the data (the most recent). If data becomes corrupted in a way that is not immediately detected, these technologies simply protect the corrupted data with no way to restore an uncorrupted version.

Continuous data protection protects against some effects of data corruption by allowing restoration of a previous, uncorrupted version of the data. Transactions that took place between the corrupting event and the restoration are lost, however. They could be recovered through other means, such as journaling.
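The contrast with a mirror can be sketched as follows. The example assumes a hypothetical version history of `(timestamp, content)` pairs and a caller-supplied validity check; both are illustrative stand-ins, not part of any real CDP interface.

```python
def last_good_version(history, is_valid):
    """Return (newest valid version, number of later versions discarded), or None.

    history is a list of (timestamp, content) pairs in ascending time order.
    """
    for index in range(len(history) - 1, -1, -1):
        if is_valid(history[index][1]):
            return history[index], len(history) - 1 - index
    return None


history = [
    (1.0, b"balance=100"),
    (2.0, b"balance=120"),
    (3.0, b"\x00\x00corrupted"),      # corruption goes unnoticed at first
    (4.0, b"\x00\x00corrupted+fee"),  # later writes build on the bad data
]

# A mirror or replica holds only the most recent state, which is already corrupted.
mirror_copy = history[-1][1]

# A version history allows rolling back to the newest version that passes the check.
result = last_good_version(history, lambda content: content.startswith(b"balance="))
good_version, discarded = result
```

Here the rollback recovers the version from time 2.0, and the two later writes are discarded. Those correspond to the transactions that are lost unless they can be recovered by other means.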

Backup disk size

In some situations, continuous data protection requires less space on backup media (usually disk) than traditional backup. Most continuous data protection solutions save byte-level or block-level differences rather than file-level differences. This means that if one byte of a 100 GB file is modified, only the changed byte or block is backed up. Traditional incremental and differential backups make copies of entire files; however, starting around 2013, enterprise client-server backup applications have implemented a capability for block-level incremental backup, designed for large files such as databases.
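A minimal sketch of block-level differencing shows why a one-byte change costs only one block of backup space. The function names, the dictionary representation, and the tiny block size used in the usage lines are assumptions for illustration; real products track changed blocks as writes occur rather than comparing two full copies.

```python
def changed_blocks(old, new, block_size=4096):
    """Map block index -> new block content for every block that differs."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    changes = {}
    for start in range(0, max(len(old), len(new)), block_size):
        new_block = new[start:start + block_size]
        if old[start:start + block_size] != new_block:
            changes[start // block_size] = new_block
    return changes


def apply_blocks(old, changes, new_length, block_size=4096):
    """Rebuild the new version from the old one plus the saved blocks."""
    image = bytearray(old[:new_length].ljust(new_length, b"\x00"))
    for index, block in changes.items():
        start = index * block_size
        image[start:start + len(block)] = block
    return bytes(image)


old_file = bytes(16)            # 16 zero bytes, treated as four 4-byte blocks
new_file = bytearray(old_file)
new_file[5] = 1                 # modify a single byte
new_file = bytes(new_file)

delta = changed_blocks(old_file, new_file, block_size=4)
rebuilt = apply_blocks(old_file, delta, len(new_file), block_size=4)
```

Only block 1 appears in `delta`, so 4 bytes are stored instead of all 16, and the new version can still be rebuilt from the old copy plus the delta.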

Risks and disadvantages

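When real-time edits, especially in multimedia and CAD design environments, are backed up offsite over the upstream channel of the installation's broadband network, the continuous stream of changes can strain that channel, so bandwidth throttling may be needed to limit its impact.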

References

  1. Pat Hanavan. (2007). "An Overview of Continuous Data Protection". Infosectoday.com.
  2. (23 October 2017). "Data Protection Best Practices". Storage Networking Industry Association.
  3. (4 March 2017). "EMC RecoverPoint for Virtual Machine Overview". WuChiKin.
  4. (21 September 2009). "Symantec Brings RealTime CDP into NetBackup Data Management Fold". DCIG LLC.
  5. (July 2010). "Continuous data protection (CDP) explained: True CDP vs near-CDP". TechTarget.
  6. (March 2017). "Zerto or Veeam?".
  7. (2019). "Agent Related".
  8. (25 May 2013). "FAQ 13. How are [Time Machine] backups scheduled (and can I change that)?". Baligu.com (as mirrored after James Pond died in 2013).
  9. (5 July 2017). "Troubleshooting backing up open/locked files on Windows". Haystack Software LLC.
This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
