By Assis Ngolo in Cyber Security — Jul 26, 2024

Vendor Security Lessons from the CrowdStrike Incident

The CrowdStrike incident underscores the need for reliable IT security vendors after a problematic update caused a global outage. This article examines the incident's impact on sectors like banking and healthcare and discusses strategies for mitigating vendor-related risks.

Photo by Milad Fakurian / Unsplash

The CrowdStrike incident has rocked the cybersecurity world, showing how crucial it is to have reliable vendors in IT security. CrowdStrike, a top endpoint security provider, rolled out a software update that caused a worldwide IT outage. The event exposed the weak spots that appear when you rely on outside security solutions. The fallout spread to many sectors, including banking, healthcare, and aviation, which shows how far-reaching these failures can be in our interconnected digital world.

As companies deal with the aftermath of this event, it's clear that picking reliable IT security providers is more important than ever. This article explores the CrowdStrike update problem, looking at what it means for business security and at ways to reduce vendor-related risks. By studying this case, we aim to share insights on how to choose vendors, vet them, and plan for emergencies. This can help businesses strengthen their cyber defenses in a world where threats keep getting more complex.

Taking Apart the CrowdStrike Update Problem

A Look at the Worldwide IT Breakdown

On July 19, 2024, at 04:09 UTC, CrowdStrike released a sensor configuration update to Windows systems as part of its normal operations. The update was meant to improve protection mechanisms, but it triggered a logic error that led to system crashes and the well-known "Blue Screen of Death" (BSOD) on affected devices. The problem had a wide reach, affecting about 8.5 million Windows devices around the world, which is less than 1% of all Windows devices.

Instant Effects on Businesses

The aftermath of this failed update hit hard and fast. Key industries faced big disruptions:

Aviation: About 5,000 commercial flights were canceled, hitting airports in the US, Europe, Asia, and Oceania.
Banking: Banks like Bradesco, Neon, and Next in Brazil saw their services go down.
Healthcare: Places such as the Hospital de Clínicas de São Paulo had to push back procedures and go back to pen-and-paper records.
Energy: Power companies ran into operational problems.
Technology: Cloud systems serving thousands of businesses took a hit, causing a chain reaction across worldwide infrastructure.

Technical Details of the Faulty Update

The faulty update involved Channel File 291, which is part of CrowdStrike's Falcon platform. The file lives in the C:\Windows\System32\drivers\CrowdStrike\ folder and controls how Falcon evaluates named pipe execution on Windows systems. The update was meant to detect newly observed malicious named pipes used in cyberattacks. But it triggered a logic bug that caused an out-of-bounds memory read, an error the system could not handle gracefully.
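The class of bug described here, reading past the end of the data that was actually supplied, can be illustrated with a small Python sketch. This is not CrowdStrike's code: the function names and the idea of an "expected field count" are assumptions made for illustration. In Python an out-of-range index raises an exception, whereas kernel-mode native code may read stray memory and crash the machine, so the sketch only shows the defensive check, not the failure itself.

```python
def validate_content(fields, expected_count):
    """Return True only if the content supplies exactly the fields the parser expects."""
    return len(fields) == expected_count


def read_field(fields, index, expected_count):
    """Read one field, refusing content whose shape does not match expectations.

    Hypothetical helper: checking the shape first means a malformed update is
    rejected with a clear error instead of being indexed out of bounds.
    """
    if not validate_content(fields, expected_count):
        raise ValueError(f"expected {expected_count} fields, got {len(fields)}")
    return fields[index]
```

The design point is that validation happens before any indexing, so a content file with too few fields is rejected as a whole rather than partially processed.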

How It Affected Windows Systems

The buggy update set off a chain of problems:

Systems with Falcon sensor for Windows version 7.11 and higher crashed if they got the update between 04:09 UTC and 05:27 UTC on July 19, 2024.
Affected systems entered a reboot loop, making them unusable.
This problem didn't just affect on-site systems; it also had an impact on cloud services, including Microsoft Azure virtual machines.
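The exposure criteria above (sensor version 7.11 or higher, update received between 04:09 UTC and 05:27 UTC) can be turned into a quick triage check. The sketch below is a minimal illustration; the function names and the assumption that an inventory records a version string and a UTC timestamp per host are hypothetical.

```python
from datetime import datetime, timezone

# Exposure window and minimum version as stated in the article.
WINDOW_START = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)
MIN_AFFECTED_VERSION = (7, 11)


def parse_version(text):
    """Turn '7.11.18110' into (7, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".")[:2])


def possibly_affected(sensor_version, received_at):
    """Hypothetical triage check: in the window AND on an affected sensor version."""
    in_window = WINDOW_START <= received_at <= WINDOW_END
    return in_window and parse_version(sensor_version) >= MIN_AFFECTED_VERSION
```

Versions are compared as integer tuples because a plain string comparison would rank "7.9" above "7.11".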

How CrowdStrike Tackled and Fixed the Issue

CrowdStrike took these steps to address the crisis:

Rolled back the problematic update at 05:27 UTC, 78 minutes after it was first deployed.
Published a manual fix for affected systems. Users had to boot into Safe Mode and delete the faulty Channel File.
Marked the impacted version of the channel file as "known-bad" in the CrowdStrike Cloud.
Promised to improve its testing. This includes local tests before client deployment, better stability and content interface testing, and a staggered approach for future updates.
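The manual fix in the list above, deleting the faulty Channel File, can be sketched as a small script with a dry-run default. The file name pattern `C-00000291*.sys` is taken from CrowdStrike's public remediation guidance, not from this article, so treat it as an assumption and verify it against the vendor's current instructions before running anything like this. The function names are hypothetical.

```python
from pathlib import Path

# Assumed pattern for Channel File 291; confirm against vendor guidance.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def find_channel_291_files(driver_dir):
    """List files in driver_dir that match the Channel File 291 pattern."""
    return sorted(Path(driver_dir).glob(CHANNEL_FILE_PATTERN))


def remove_channel_291_files(driver_dir, dry_run=True):
    """Return matching files; delete them only when dry_run is False."""
    matches = find_channel_291_files(driver_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Defaulting to a dry run lets an operator review exactly which files would be removed before committing to the change, which matters on machines that are already in a fragile state.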

This event shows how crucial good testing is. It also highlights how software updates can have big effects in our connected digital world.

What This Means for Enterprise Security

The CrowdStrike incident offers important lessons for business security, stressing the need to balance IT management and cybersecurity.

Risks of Automated Update Processes

Automated updates keep systems secure, but they can cause big problems:

Unexpected issues: The CrowdStrike case proves that one bad update can shut down many systems.
Too much trust in automation: Companies might think "set it and forget it," leading them to slack off on watching their systems.
Attackers can take advantage: Malicious actors might exploit weak spots in update systems to spread malware or take over entire networks.

Challenges in Managing Complex IT Ecosystems

Today's businesses struggle with several issues when it comes to managing their IT systems:

Finding the right mix of security and ease of use: When security gets too complicated, it can mess up how work gets done and slow things down.
Keeping talented people: When companies can't hold on to skilled security experts, they might fall behind on updates and putting new measures in place.
Too much complexity: With so many different tools to deal with various security threats, systems have become tricky and hard to handle.

Striking a Balance Between Security and Smooth Operations

It's essential to keep security tight while making sure everything runs smoothly:

Update resistance: When systems are stable, people are often reluctant to install important security fixes, which causes delays.
Testing challenges: Overloaded security teams may not have enough time or tools to test updates well before deploying them.
Need for integrated solutions: Easy-to-use, integrated, and automated security tools can cut down on manual tasks and give teams more confidence when deploying updates.

Ways to Lower Risks from Vendors

Setting Up Tough Update Testing Steps

To cut down on the dangers of vendor updates, companies should set up thorough testing procedures. This means checking updates carefully before deploying them, including local tests and stability checks. A staged rollout, in which an update reaches a small group of machines before the rest, helps catch problems before they affect the whole system.
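A step-by-step rollout of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production deployment tool: the `deploy` and `is_healthy` callbacks, the ring names, and the 1% failure threshold are all hypothetical placeholders for whatever a company's own tooling provides.

```python
def staged_rollout(rings, deploy, is_healthy, max_failure_rate=0.01):
    """Deploy ring by ring and halt at the first ring that looks unhealthy.

    rings: ordered list of (name, hosts) pairs, smallest and safest first.
    deploy / is_healthy: caller-supplied callbacks taking a single host.
    """
    completed = []
    for name, hosts in rings:
        for host in hosts:
            deploy(host)
        failures = sum(1 for host in hosts if not is_healthy(host))
        if hosts and failures / len(hosts) > max_failure_rate:
            # Stop here so later, larger rings never receive the update.
            return {"completed": completed, "halted_at": name}
        completed.append(name)
    return {"completed": completed, "halted_at": None}
```

With rings ordered as, say, lab machines, then a small canary group, then the full fleet, a bad update stops at the canary ring instead of reaching every device at once.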

Improving Incident Response and Recovery Strategies

A strong incident response plan is crucial for reducing the effects of vendor-related incidents. The plan should include clear steps to identify, contain, and mitigate cyber incidents. Regularly testing and updating the plan, along with running simulated exercises, ensures that all team members are ready to handle potential crises well.

Enhancing Vendor Evaluation and Diversifying Security Solutions

Companies need to vet IT security vendors before picking them. This means looking at their credentials, financial stability, and compliance with industry regulations. Using different suppliers can reduce single points of failure and improve resilience. Continuous monitoring tools also help catch and fix problems right away, making sure vendors keep meeting the company's requirements.

Keeping an Eye on How Much Access Vendors Have

Security tools, like the CrowdStrike sensor, should have carefully scoped and monitored access to the system at the kernel level. Companies should classify how critical each level of access is by how deep it goes into the system and how far it reaches across the network. Vendors should disclose and justify the access they need, and have plans to mitigate and recover from any issues their tools might cause at each risk level.
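One way to make this classification concrete is a simple depth-by-reach score. The sketch below is illustrative only: the level names, the numeric weights, and the thresholds for "critical" and "elevated" are assumptions that each company would need to set for itself.

```python
# Hypothetical scales: how deep a tool goes, and how far it reaches.
DEPTH = {"user": 1, "admin": 2, "kernel": 3}
REACH = {"single_host": 1, "department": 2, "fleet": 3}


def access_risk(depth, reach):
    """Classify vendor access by multiplying depth and reach weights."""
    score = DEPTH[depth] * REACH[reach]
    if score >= 6:
        return "critical"
    if score >= 3:
        return "elevated"
    return "standard"
```

Under these assumed weights, a kernel-level agent deployed across the whole fleet scores as "critical", which is the tier where a vendor would be expected to supply the most detailed mitigation and recovery plans.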

Investing in Redundancy and Failover Systems

To keep businesses running during vendor-related problems, companies need redundancy and failover systems in place. This means using redundant servers, power supplies, and network gear. Failover plans cut downtime by moving traffic and services to backup systems when things go wrong. Companies should also consider load balancers and virtualization to make their systems more resilient and use resources better.
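The core of a failover plan, routing to the first backup that is still working, can be sketched as follows. This is a simplified illustration; the `is_up` health check and the backend names are hypothetical, and real load balancers add timeouts, retries, and gradual recovery on top of this idea.

```python
def pick_backend(backends, is_up):
    """Return the first backend that passes its health check, in priority order."""
    for backend in backends:
        if is_up(backend):
            return backend
    raise RuntimeError("no healthy backend available")
```

Listing backends in priority order keeps the primary in use whenever it is healthy and falls through to standby systems only when it is not.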

Conclusion

The CrowdStrike incident shows how crucial it is for companies to stay alert when dealing with IT security vendors. It highlights why businesses need to vet vendors, test updates thoroughly, and have solid plans for handling problems. By taking these steps, companies can better shield themselves from the widespread effects of vendor mistakes and strengthen their overall cybersecurity defenses.

As the digital landscape keeps changing, the relationship between companies and their IT security providers grows more important. Monitoring vendor performance, understanding their security practices, and having a good backup plan are key ways to cut down on risks. In the end, businesses need to be proactive in managing their IT security partnerships. If you want to review your security setup and vendor risk, feel free to get in touch with us for expert advice.