
5 Data Resilience Lessons to Keep in Mind After the Massive CrowdStrike Outage

Having a Secure Backup and Disaster Recovery Plan Is the Key to Data Resilience

 

On 19 July 2024, a faulty update to CrowdStrike’s “Falcon Sensor”, its software for real-time threat detection and endpoint protection, crashed an estimated 8.5 million Microsoft Windows devices, causing widespread IT and operational disruptions worldwide. Although this incident was not caused by a cyberattack or malware, it underscores the importance of having a comprehensive and reliable backup and disaster recovery strategy in place to limit disruptions to business operations. In short, it highlighted the need for data resilience.

CrowdStrike Causes Immediate Global Impact

The outage was first detected in Australia, and the “blue screen of death” then spread to Windows devices around the world, significantly disrupting not only individual users but also companies and critical service providers. Reports of disruptions emerged from various sectors, including finance, IT, and manufacturing. By the afternoon, approximately 2,600 flights in the US had been cancelled and over 4,200 flights had been affected globally, with airlines resorting to manual check-ins, according to the Wall Street Journal.

How Long Recovery Time Objectives (RTOs) Impact Business Operations

Following the incident, CrowdStrike provided technical support and released a patch to help restore system operations. However, many of the systems used by organisations could not be recovered automatically via a repair program. In those cases, IT admins had to manually boot every affected device into safe mode and delete the problematic CrowdStrike update files.

Although Microsoft introduced a “process-minimising” solution within the next day that helped delete the faulty files automatically, recovery remained laborious: each device still had to be booted manually into WinPE (Windows Preinstallation Environment) from a USB drive. Downtime of this kind leads to operational disruption, lost productivity, additional costs, increased compliance risks, and ultimately a negative customer experience and a tarnished corporate reputation.
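To make the cost of device-by-device recovery concrete, here is a rough back-of-the-envelope sketch in Python. The function names and every figure in it (2,000 devices, 15 minutes per device, 10 technicians, a 4-hour RTO) are hypothetical values chosen for illustration, not numbers from the incident.

```python
import math


def estimated_recovery_hours(devices: int, minutes_per_device: float, technicians: int) -> float:
    """Estimate wall-clock hours needed to repair every device by hand.

    Assumes each technician repairs one device at a time and that all
    repairs take the same amount of time (a simplification).
    """
    if technicians < 1:
        raise ValueError("at least one technician is required")
    # Work proceeds in rounds: each round, every technician fixes one device.
    rounds = math.ceil(devices / technicians)
    return rounds * minutes_per_device / 60


def meets_rto(devices: int, minutes_per_device: float, technicians: int, rto_hours: float) -> bool:
    """Check whether manual recovery would finish within the recovery time objective."""
    return estimated_recovery_hours(devices, minutes_per_device, technicians) <= rto_hours


# Hypothetical scenario: 2,000 devices, 15 minutes each, 10 technicians.
hours = estimated_recovery_hours(devices=2000, minutes_per_device=15, technicians=10)
print(f"Estimated manual recovery time: {hours:.1f} hours")
print("Meets a 4-hour RTO:", meets_rto(2000, 15, 10, rto_hours=4))
```

Under these assumed figures, manual recovery would take about 50 working hours against a 4-hour target, which is why recovery methods that do not require touching each device matter so much.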

Build a Strong Data Protection Plan to Maintain Business Continuity at All Times

  1. Comprehensive backups. Deploying a backup strategy that regularly covers all data sources and devices, leaving no isolated data unprotected, is crucial for businesses, especially those operating across multiple platforms or tools.
  2. Regular restoration drills. Equipment and system failures are never predictable. Continuously testing the recoverability of backup data is essential for verifying the effectiveness and availability of the organisation’s disaster recovery plans and ensuring data resilience.
  3. Instant VM recovery. Virtualising services and restoring operations as quickly as possible ensures reduced downtime and business continuity.
  4. Cross-platform restoration. In CrowdStrike’s case, only one platform, Windows, was affected. Businesses can minimise the risk of data loss by ensuring that all data, applications, and systems can be recovered and reinstated across multiple environments.
  5. Off-site backup and recovery. In addition to backing up data on-site, keeping an off-site backup mitigates the risks associated with data loss. A company that had deployed an off-site cloud backup before CrowdStrike’s event could have resumed services quickly from that off-site copy.
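As an illustration of the restoration drills in point 2, the sketch below compares a restored directory against its source by SHA-256 checksum. It is a minimal example rather than a description of any particular backup product: the function names `sha256_of` and `verify_restore` are hypothetical, and it assumes both copies are reachable as ordinary directories.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"mismatch: {relative.as_posix()}")
    return problems
```

An empty result means every source file was restored byte for byte. Running a check like this on a schedule, rather than only after an incident, is one way to turn a restoration drill into a routine.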

Backups Are the Key to Data Resilience

Having a secure backup and disaster recovery plan is the key to data resilience and a crucial step for any business pursuing digital transformation. The CrowdStrike incident firmly highlights the importance of establishing a robust backup strategy and testing backups on a regular basis to maintain continuity in the face of unforeseen circumstances.

Tony Lin

Product Marketing Manager, Synology
