Source

idk if people on tumblr know about this but a cybersecurity company called crowdstrike just committed what is probably the single biggest fuck up in any sector in the past 10 years. it’s monumentally bad. literally the most horror-inducing nightmare scenario for a tech company.

some info: crowdstrike essentially makes antivirus software for enterprises. which means normal laypeople cant really get it, it’s for businesses and organisations and important stuff.

so, on a friday evening (it of course wasnt friday everywhere, but it was friday evening in oceania, which is where it first started causing damage, europe and na being asleep), crowdstrike pushed out an update to their windows users that contained a fatal bug.

before i get into what the bug is, know that friday evening is the worst possible time to do this because people are going home. the weekend is starting. offices dont have people in them. this is just one of many perfectly placed failures in the rube goldberg machine of crowdstrike. there’s a reason friday is called ‘dont push to live friday’ or more to the point ‘dont fuck it up friday’

so, at 3pm on friday, an update comes rolling out to crowdstrike users and is automatically applied. this update immediately causes the computer to hit the blue screen of death. very very bad. but it’s not simply a ‘you need to restart’ crash, because the computer then gets stuck in a boot loop.

this is the worst possible thing because, in a boot loop state, a computer never really gets to a point where it can do anything. like download a fix. so there is nothing crowdstrike can remotely do to remedy this death update. it is now left to the end users.

the problem was identified pretty quickly. you had to boot into safe mode, and one very small file needed to be deleted. or you could just rename the crowdstrike folder to something else so windows never attempts to load it.
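
if you’re curious what that fix actually looked like: here’s a rough sketch in python, just for readability (in practice it was a delete command typed into a safe mode command prompt on every affected machine; the path and the C-00000291 file pattern are the ones that circulated publicly during the outage):

```python
from pathlib import Path

# the faulty update shipped as a "channel file" in the crowdstrike
# driver directory. deleting the matching file lets windows boot again.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")

def remove_bad_channel_file() -> None:
    for f in DRIVER_DIR.glob("C-00000291*.sys"):
        print(f"deleting {f}")
        f.unlink()

if __name__ == "__main__":
    remove_bad_channel_file()
```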

it’s a fairly easy fix in the grand scheme of things, but the issue is that it is affecting enterprises. which can have a looooot of computers. in many different locations. so an IT person would need to manually fix hundreds of computers, sometimes in whole other cities and perhaps even other countries if theyre big enough.

another fuck up: crowdstrike did not stagger the update, which would have let them catch any mistakes before they wreaked havoc. (and also how how HOW do you not catch this before deploying it. this isn’t a code oopsie, this is a complete failure of quality assurance that probably permeates the whole company, to not realise their update was an instant kill). they rolled it out to every one of their clients in the world at the same time.
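
for reference, ‘staggering’ just means something like the sketch below: ship to a tiny slice of machines first, watch the telemetry, and only widen the rollout if nothing catches fire. every name and number here is made up for illustration, not crowdstrike’s actual pipeline:

```python
import random
import time

RINGS = [0.01, 0.10, 0.50, 1.00]  # fraction of the fleet per stage
BAKE_SECONDS = 5                  # would be hours or days in real life
MAX_CRASH_RATE = 0.001            # halt if more than 0.1% of hosts die

def deploy(update: str, fraction: float) -> None:
    print(f"deploying {update} to {fraction:.0%} of the fleet")

def observed_crash_rate() -> float:
    # stand-in for real telemetry: BSOD reports, missing heartbeats,
    # boot-success counters from the hosts already updated
    return random.random() * 0.0005

def rollout(update: str) -> None:
    for ring in RINGS:
        deploy(update, ring)
        time.sleep(BAKE_SECONDS)  # let each ring "bake" before widening
        if observed_crash_rate() > MAX_CRASH_RATE:
            print("crash rate spiked -- halting rollout")
            return
    print("rollout complete")

if __name__ == "__main__":
    rollout("channel-file-update")
```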

and this seems pretty hilarious on the surface. i was havin a good chuckle as eftpos went down in the store i was working at, chaos was definitely ensuing lmao. im in aus, and banking was literally down nationwide.

but then you start hearing about an entire country’s planes being grounded because the airports’ computers are bricked. and hospitals having no computers anymore. emergency call centres crashing. and you realise that, wow. crowdstrike probably just killed people. this is literally the worst thing possible for a company like this to do.

crowdstrike was kinda on the come up too, they were starting to become a big name in the tech world as a new face. but that has definitely vanished now. to fuck up in this many places at once is almost impressive. its hard to even think of a comparable fuckup.

a friday evening simultaneous rollout boot loop is a phrase that haunts IT people in their darkest hours. it’s the monster that drags people down into the swamp. it’s the big bad in the horror movie. it’s the end of the road. and for crowdstrike, that reaper of souls just knocked on their door.

  • Telorand@reddthat.com · 4 months ago

    Fun read. And like they pointed out, a basic smoke test would have identified that something was wrong before pushing the update.

    For how much damage was done, I have to wonder if we’ll see an FTX-style inquiry in the coming months.
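
    A smoke test here wouldn’t even need to be fancy: push the update to a throwaway Windows VM, reboot it, and check that it comes back up. A hypothetical sketch (vmctl is a made-up stand-in for whatever hypervisor CLI you actually have):

    ```python
    import subprocess
    import sys

    def update_survives_reboot(vm: str, update: str, timeout_s: int = 300) -> bool:
        # apply the update, force a reboot, then wait to see if the VM boots
        subprocess.run(["vmctl", "apply-update", vm, update], check=True)
        subprocess.run(["vmctl", "reboot", vm], check=True)
        done = subprocess.run(["vmctl", "wait-for-boot", vm, str(timeout_s)])
        return done.returncode == 0

    if __name__ == "__main__":
        if not update_survives_reboot("win11-canary", sys.argv[1]):
            sys.exit("update bricked the test VM -- do not ship")
    ```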

    • onlinepersona@programming.dev · 4 months ago

      I have to wonder if we’ll see an FTX-style inquiry in the coming months.

      I wouldn’t hold my breath. Companies like that are so involved with governments that they’re probably bullet-proof. Microsoft has had appalling security practices (money over everything), and IIRC hackers got access to virtually every Outlook account, and all Microsoft did was issue a statement along the lines of “we’re sorry and are taking immediate action”. That placated politicians and they moved on.
      Crowdstrike is most likely in the same boat.

      Anti Commercial-AI license

      • edric@lemm.ee · 4 months ago

        Wasn’t the Crowdstrike CTO (or some other C-level) asked to appear before congress to talk about this issue? I think they’re more vulnerable than Microsoft because they’re just a single piece of software that can be replaced by a number of similar endpoint security products, unlike Microsoft, which is more embedded, with a lot of government systems overly reliant on its products (Windows, Azure, etc.).

        • DirigibleProtein@aussie.zone · 4 months ago

          I doubt anything will happen. They’re “too big to fail”. People will still want their products, and we’ll move on and forget. Like Boeing or Microsoft or Sony or Tesla, Crowdstrike will issue a media release that basically says, “Oh whoops, my bad!” and just keep going.

        • Bakkoda@sh.itjust.works · 4 months ago

          Imagine trying to explain AV vs EDR to a current sitting congressional committee member. You would need weeks of prep time to find an analogy dumb enough to make sense.

  • kraynyan · 4 months ago

    They’ll never push an update without testing a reboot first again…

    • JeeBaiChow@lemmy.world · 4 months ago

      They should never have done it in the first place. But the marketing FUD around zero-day vulnerabilities and the speed at which they get exploited convinced their clients to allow a third party to auto-update. But y’know, save a few bucks for the bean counters, and get the signee a good cash-back deal…

  • Optional@lemmy.world (OP) · 4 months ago

    I expect they had contractors for QA testing, laid them off, and are now, what’s the word, “fucked”.