• Eager Eagle@lemmy.world
    English
    arrow-up
    107
    arrow-down
    1
    ·
    2 months ago

    Many people need to shift away from this blaming mindset and think about systems that prevent these things from happening. I doubt anyone at CrowdStrike desired to ground airlines and disrupt emergency systems. No one will prevent incidents like this by finding scapegoats.

    • people_are_cute@lemmy.sdf.org
      arrow-up
      15
      ·
      2 months ago

      That means spending time and money on developing such a system, which means increasing costs in the short term… which is kryptonite for current-day CEOs.

      • Eager Eagle@lemmy.world
        English
        arrow-up
        7
        ·
        edit-2
        2 months ago

        Right. More than money, I say it’s about incentives. You might change the entire C-suite, management, and engineering teams, but if the incentives remain the same (e.g. developers are evaluated by number of commits), the new staff is bound to make the same mistakes.

    • azertyfun@sh.itjust.works
      arrow-up
      13
      ·
      edit-2
      2 months ago

      I strongly believe in no-blame mindsets, but “blame” is not the same as “consequences” and lack of consequences is definitely the biggest driver of corporate apathy. Every incident should trigger a review of systemic and process failures, but in my experience corporate leadership either sucks at this, does not care, or will bury suggestions that involve spending man-hours on a complex solution if the problem lies in that “low likelihood, big impact” corner.
      After all, when the problem happens (again), they’ll likely be able to sweep it under the rug (again) or will have moved on to greener pastures.

      What the author of the article suggests is actually a potential fix; if developers (in a broad sense of the word and including POs and such) were accountable (both responsible and empowered) then they would have the power to say No to shortsighted management decisions (and/or deflect the blame in a way that would actually stick to whoever went against an engineer’s recommendation).

  • Kissaki@programming.dev
    English
    arrow-up
    47
    arrow-down
    1
    ·
    2 months ago

    CrowdStrike ToS, section 8.6 Disclaimer

    […] THE OFFERINGS AND CROWDSTRIKE TOOLS ARE NOT FAULT-TOLERANT AND ARE NOT DESIGNED OR INTENDED FOR USE IN ANY HAZARDOUS ENVIRONMENT REQUIRING FAIL-SAFE PERFORMANCE OR OPERATION. NEITHER THE OFFERINGS NOR CROWDSTRIKE TOOLS ARE FOR USE IN THE OPERATION OF AIRCRAFT NAVIGATION, NUCLEAR FACILITIES, COMMUNICATION SYSTEMS, WEAPONS SYSTEMS, DIRECT OR INDIRECT LIFE-SUPPORT SYSTEMS, AIR TRAFFIC CONTROL, OR ANY APPLICATION OR INSTALLATION WHERE FAILURE COULD RESULT IN DEATH, SEVERE PHYSICAL INJURY, OR PROPERTY DAMAGE. […]

    It’s about safety, but it is truly ironic that it mentions aircraft-related uses twice, as well as communication systems (which is very broad).

    It certainly doesn’t inspire confidence in the overall stability. But it’s also general ToS-speak, and may only be noteworthy now, after the fact.

    • lad@programming.dev
      English
      arrow-up
      6
      ·
      2 months ago

      That’s just covering themselves, like a disclaimer that your software is intended to be used only on the 29th of February. You don’t expect anyone to follow that rule, but you expect the court to rule that the user is at fault.

      Luckily, it doesn’t always work that way, but we will see how it turns out this time.

    • trolololol@lemmy.world
      arrow-up
      1
      ·
      2 months ago

      I’m pretty sure if a client pays for use in any of that they’ll shut up and take the money. Pretty ethical.

  • iAvicenna@lemmy.world
    arrow-up
    44
    arrow-down
    1
    ·
    2 months ago

    Sure, it is the dev who is to blame, and not the clueless managers who evaluate devs based on the number of commits/reviews per day, or the CEOs who think such managers are on top of their game.

      • iAvicenna@lemmy.world
        arrow-up
        12
        arrow-down
        2
        ·
        2 months ago

        I don’t have any information on that. This was more a criticism of where the world seems to be heading.

        • FlorianSimon@sh.itjust.works
          arrow-up
          9
          ·
          2 months ago

          I’ve been working as a professional programmer for many years and have never ever seen this kind of evaluation, not even once. I’m pretty convinced it’s an exception rather than a rule. And I’d add that it’s probably a very rare exception.

          • iAvicenna@lemmy.world
            arrow-up
            7
            ·
            edit-2
            2 months ago

            NGL, I am also only a second-hand witness to it. This particular example may be rare, but there are a lot of others to the same effect: evaluating performance based on the number of lines of code, trying to combine multiple dev responsibilities into a single position, unrealistic deadlines which can usually be met only very superficially, managers looking for opportunities to replace coders with AI and then tasking other devs with checking the AI code, and replacing experienced coders with new graduates because they are willing to work more for less. All of these are some form of quantity over quality and usually end in some sort of crisis.

            • Ephera@lemmy.ml
              arrow-up
              7
              ·
              2 months ago

              Yeah, and at the end of the day, it is just as much a very rare exception that a dev actually gets enough time to complete their work at a level of quality they would take responsibility for.
              Hell, it is standard industry practice to ship things and then start fixing the issues that crop up.

    • jonne@infosec.pub
      arrow-up
      3
      ·
      2 months ago

      Yeah, exactly. You’d think they’d run a test suite before pushing an update, or do a staggered rollout where they only push it to a small sample of machines first. Just blaming one guy because you had an inadequate UAT process is ridiculous.

  • Kissaki@programming.dev
    English
    arrow-up
    20
    ·
    edit-2
    2 months ago

    It’s a systemic, multi-layered problem.

    The simplest, least-effort thing that could have limited the scale of the issues is not installing updates automatically, but waiting four days and triggering the install afterwards if no issues have surfaced.

    Automatically forwarding updates is also forwarding risk. The larger the impact area, the more worthwhile safeguards are.

    Testing/staging or partial successive rollouts could also have mitigated a large number of issues, but require more investment.
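To make the "partial successive rollouts" idea concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how CrowdStrike actually distributes updates: `staged_rollout`, the `install` callback, the ring fractions, and the failure threshold are all hypothetical names and values chosen for the example.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    install: Callable[[str], bool],  # hypothetical: returns True if the host is healthy after the update
    ring_fractions: Sequence[float] = (0.01, 0.10, 1.0),  # assumed cumulative ring sizes
    max_failure_rate: float = 0.0,  # assumed threshold: any failure halts the rollout
) -> List[str]:
    """Push an update ring by ring and stop as soon as a ring looks unhealthy.

    Returns the hosts that received the update (healthy or not).
    """
    attempted: List[str] = []
    start = 0
    for fraction in ring_fractions:
        # Cumulative end index for this ring; each ring gets at least one host.
        end = min(len(hosts), max(start + 1, int(len(hosts) * fraction)))
        ring = hosts[start:end]
        if not ring:
            break
        failures = 0
        for host in ring:
            attempted.append(host)
            if not install(host):
                failures += 1
        if failures / len(ring) > max_failure_rate:
            break  # halt here: later rings never receive the bad update
        start = end
    return attempted
```

With a fleet of 1,000 hosts and an update that breaks every machine, only the first small ring is affected and the remaining hosts are never touched. The trade-off is that the last ring receives the update later than it would with an immediate push.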

    • wizardbeard@lemmy.dbzer0.com
      English
      arrow-up
      8
      ·
      2 months ago

      The update that crashed things was an anti-malware definitions update. CrowdStrike offers no way to delay or stage those (they are downloaded automatically as soon as they are available), and there’s good reason for not wanting to delay definition updates, as it leaves you vulnerable to known malware for longer.

    • Gestrid@lemmy.ca
      English
      arrow-up
      5
      ·
      2 months ago

      Four days for an update to malware definitions is how computers get infected with malware. But you’re right that they should at least do some sort of simple test. “Does the machine boot, and are its files not getting overzealously deleted?”
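A simple test like that could be expressed as a gate on a canary machine. The sketch below is a rough Python illustration under stated assumptions: `CanaryState`, `passes_smoke_test`, and the file names are hypothetical, and how the "booted" flag and file listing get collected from a real machine is left out entirely.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CanaryState:
    """Hypothetical snapshot of a test machine before or after an update."""
    booted: bool            # did the machine come back up?
    files: FrozenSet[str]   # paths present on the machine


def passes_smoke_test(
    before: CanaryState,
    after: CanaryState,
    protected: FrozenSet[str],
) -> bool:
    """Answer the two questions: does it boot, and are protected files still there?"""
    if not after.booted:
        return False
    # Protected files that existed before the update but are gone afterwards.
    missing = (before.files & protected) - after.files
    return not missing


# Example: the update may remove a known-bad file, but not a protected one.
before = CanaryState(True, frozenset({"kernel.sys", "report.docx", "malware.exe"}))
after_ok = CanaryState(True, frozenset({"kernel.sys", "report.docx"}))
after_loop = CanaryState(False, before.files)
protected = frozenset({"kernel.sys", "report.docx"})
```

Here `passes_smoke_test(before, after_ok, protected)` holds, while the boot-looping `after_loop` state fails the gate. A check like this takes minutes rather than days, so it would not carry the malware-exposure cost of a four-day delay.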

      • Kissaki@programming.dev
        English
        arrow-up
        1
        ·
        2 months ago

        One of the fixes was deleting a System32 driver file. Is a Windows driver how they update definitions?

        • Gestrid@lemmy.ca
          English
          arrow-up
          2
          ·
          edit-2
          2 months ago

          The driver was one installed on the computer by the security company. The driver would look for and block threats incoming via the internet or intranet.

          The definitions update included a driver update, and most of the computers the software was used on were configured to automatically restart to install the update. Unfortunately, the faulty driver update caused computers to BSOD and enter a boot loop.

          Because of the boot loop, the driver could only be removed manually by entering Safe Mode. (That’s the thing you saw about deleting that file.) Then the updated driver, the one they released when they discovered the bug, could ideally be installed normally after exiting Safe Mode.

  • Seasm0ke@lemmy.world
    arrow-up
    19
    arrow-down
    1
    ·
    2 months ago

    Reading between the lines, CrowdStrike is certainly going to be sued for damages. Putting a dev on the hook means nobody gets - or pays - anything, so long as one guy’s life gets absolutely ruined. Great system.

  • hamid 🏴@vegantheoryclub.org
    arrow-up
    18
    arrow-down
    4
    ·
    edit-2
    2 months ago

    The CrowdStrike CEO should go to jail. The corporation should get the death sentence.

    Edit: For the downvoters, they for real negligently designed a system that killed people when it failed. The CEO, as an officer of the company, holds liability. If corporations want rights like people, then when they are grossly negligent they should be punished like people. We can’t put them in jail, so they should be forced to divest their assets and be “killed.” This doesn’t even sound radical to me; it sounds like a basic safeguard against corporate overreach.

  • corsicanguppy@lemmy.ca
    English
    arrow-up
    11
    arrow-down
    1
    ·
    2 months ago

    We don’t blame the leopards who ate the guy’s face. We blame the guy who stuck his face near the leopards.

    • Kissaki@programming.dev
      English
      arrow-up
      2
      ·
      2 months ago

      But how do you identify a leopard when you don’t know about animals and it’s wearing a shiny mask?