#042 - These broken machines
Remember: never let the machines run unattended.
You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)

Fun fact: a few years ago, I ran a test issue of a newsletter called These Broken Machines. It was a short, punchy rundown of recent AI-related goofs. I enjoyed the writing process, but ultimately decided against continuing it for two reasons:
For one, I didn't want "dunking on AI nonsense" to be my beat. There's plenty of that to go around. I'll steal a friend's line and describe it as the realm of people who make their entire personality about hating on things. You've seen it before: the performative screeds. The extended Everything Is Oh So Horrible rants. The thinly veiled entertainment of it all. They can have it. There's only so much I can muster for the "wow, will you look at this bullshit?" vibe.
And two, I figured that AI goofs would eventually fall by the wayside and I'd run out of material.
I was clearly wrong on that second point, but the first one still stands.
I say this to make it clear that Complex Machinery was never meant to be a running commentary on AI failures. When I shelved These Broken Machines, I made a conscious decision to take a wider lens, exploring risk-taking in AI and risk/reward tradeoffs as they surface in recent news. The problem is that the news keeps giving me AI failures to talk about, so … that's what I cover.
(At least in Complex Machinery I go into reasons why it happened, and offer some takeaway lessons. So there's that.)
All that to say: today's first segment is about a recent AI failure.
Letting your bot run with scissors
One perk of writing Complex Machinery is that it's not an up-to-the-moment, this-just-happened kind of newsletter. I can take my time with stories, waiting for additional news to come in and round out my thoughts.
That's especially useful when a story smells funny. And this one certainly does. But since I've seen no coverage to contradict what's been said – no additional coverage beyond the initial wave, even – for now I'll write as though I have all of the relevant public facts.
The story is a simple two-parter:
- A company called SaaStr deployed a Replit-branded AI agent to build an app.
- Several days into the project, the bot deleted the production database.
(For those who don't come from a software development background: an app usually passes through different stages before you, the end-user, see it. Production is the Live, In Front Of Real End-Users, And Handling Real-World Business Matters environment.)
People might blame the AI bot here. I don't.
To understand my view, it helps to separate the error from the incident:
The error was the bot reaching the conclusion to delete the database. We can chalk that up to The Random™, the wild animal that exists inside every AI system. When I first came up with that phrase I was thinking of a forest creature with claws. But thanks to a Bluesky post by Alan Au, I sometimes see The Random™ as a zebra:
Fun fact: you cannot domesticate zebras. That said, there are examples of people coercing zebras to pull carriages. Sure, if you throw enough resources at a problem, you can implement bad ideas, but they're still bad ideas. (Zebras are inherently skittish and try to kick the shit out of everything.)
The incident occurred because someone trusted the bot with the keys to the production database. Given what I just wrote about the error, that sounds rather foolish. But it happened. And it happens more often than we realize because it rarely makes headlines. Companies put way too much faith in a genAI bot, expecting it to exercise good judgment. Which the bot can't do, as it doesn't exercise any judgment at all. It's not a sentient being. It's just a big pile of patterns, baked into a jumble of linear algebra.
To top it off, the SaaStr founder wanted another very human behavior out of this non-human entity: an apology. I was discussing this with Erin McKean (founder of Wordnik!) and she summed it up better than I ever could:
He wants an apology? No. You played the slot machine. You put your money in, you pulled the arm, and you lost … The problem is that this machine can also pull the money out of your pocket.
So … yes. If you ever want an apology for your genAI bot's misdeeds, look in the mirror. And if you'd like to avoid bot problems in general, consider my catchphrase: never let the machines run unattended.
Keep fire in the fireplace
My point is that the SaaStr incident was not an AI problem. It was a people problem: poor risk-taking. Turning the bot loose with woefully insufficient risk controls, thereby greeting misfortune with open arms.
When it comes to your company's AI risk/reward tradeoff – this is not professional advice, by the by – one of the easiest risk controls is to limit what the bot can do. You can enforce least-privilege access to limit the scope of any damage, and you can supplement that with a human review of anything that touches production. Applying risk controls lets you reap the rewards of using the bot while capping your exposure to its downsides.
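To make that concrete, here's a minimal sketch of "least privilege plus human review" in Python. Everything in it is hypothetical – it's an illustration of the control, not how any vendor's agent actually works: the bot gets a database handle that will only execute plain reads, and anything destructive lands in a queue for a person to approve instead of running.

```python
# A minimal sketch of "least privilege plus human review" for an AI agent's
# database access. All names here are hypothetical, not any vendor's API.
import re
import sqlite3

READ_ONLY = re.compile(r"^\s*select\b", re.IGNORECASE)

class GuardedConnection:
    """Give the bot a handle that can read freely, but never write on its own."""

    def __init__(self, db_path: str):
        self._conn = sqlite3.connect(db_path)
        self.review_queue = []  # destructive statements wait here for a human

    def run(self, sql: str):
        if READ_ONLY.match(sql):
            return self._conn.execute(sql).fetchall()
        # Anything that isn't a plain read is parked, not executed.
        self.review_queue.append(sql)
        return None

if __name__ == "__main__":
    db = GuardedConnection(":memory:")
    # Set up a toy table outside the guarded path (the human side of the fence).
    db._conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    db._conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    db._conn.commit()

    print(db.run("SELECT * FROM customers"))  # allowed: [(1, 'Ada')]
    db.run("DROP TABLE customers")            # blocked, queued for review instead
    print(db.review_queue)                    # ['DROP TABLE customers']
```

In a real setup you'd push this down into the database itself (a read-only role attached to the bot's credentials) rather than lean on a regex, but the shape is the same: the agent never holds the keys it would need to drop a production table.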
To put this in perspective, I think about a great rant by the late comedian Jeremy Hardy. For the life of me, I can no longer find the exact wording. But it boiled down to: Fire's great. If you keep fire in the fireplace, your house will stay warm in winter. But if you let the fire out of the fireplace, and onto the living room rug, it'll burn the house down.
It helps to remember that we only know about the SaaStr incident because the founder chose to make it public. There's no telling how many more incidents (and near-misses) happen that companies never tell us about – or that they don't even know about themselves. Given the lack of AI literacy in companies, compounded by executives' ill-informed mandates that Everyone Here Must Use AI, I imagine there are a lot more waiting to happen.
And this underscores my wider concern: that we'll get most of our corporate AI safety through an ex-post approach. You know, cleaning up after preventable errors.
It doesn't quite work like that
If the number "996" rings a bell, it's because it made headlines back in 2021 as a punishing tech-industry work culture in China. Firms practicing 996 expected employees to work from 9AM to 9PM, six days a week. The grinding, illegal practice allegedly came to a halt once it hit the news.
(If you're a junior associate in investment banking, you're probably thinking: "A 72-hour work week? Wow, that actually sounds nice." But if you're a junior associate in investment banking, you don't have time to read this newsletter. And you probably missed the 996 stories.)
That was four years ago. Thankfully, we can all see that it was a terrible idea and there's no way American firms would try it.
Right?
Not quite. A recent Wired article covered the rise of 996 in American AI startups:
Companies aren’t having trouble finding willing employees, and some frame it as core to their work culture. Rilla, an AI startup that sells software designed for contractors (like plumbers) to record conversations with prospective clients and coach them on how to negotiate higher rates, says nearly all of its 80-person workforce adheres to the 996 schedule.
[...]
Rilla is up front about its expectations. In current job listings, it explicitly states that workers are expected to log more than 70 hours a week, warning them not to join if they aren’t “excited” about the schedule. Breakfast, lunch, and dinner are provided at the office every day—even on Saturdays.
I could tell these founders that 996 is bad for the long-term health of individual team members, and therefore it's bad for the team as a whole. But they clearly don't care about people, so I won't get into that.
Instead, I'll speak on their level: 996 is simply not how effective AI research works. You can't just throw bodies at the problem. Not just because it's a terrible thing to do, but also because research is an inherently probabilistic exercise. 996 may offer the illusion of research progress, because people are Doing Things™ … but that's it. An illusion.
Given that the practice is so ineffective, maybe the founders are attempting to develop a cult? One element of cult on-boarding is to quietly separate you from your friends – anyone who might say "hey man I think this is a cult" – such that the cult becomes your entire world.
Cult math
Why would these startup founders want a cult, though? My business brain initially pointed straight to reduced costs: people who are willing to give so much of themselves to the company would likely work for less money – either directly, through lower salary, or indirectly, by doing more work than their salary merits. If your employees behave like machines, cranking themselves to burnout, you stand a greater chance of reaching your goals before you run out of cash. Getting People To Unflinchingly, Unquestioningly Do What You Say is a slow-motion arbitrage play.
But after a moment of reflection, I saw something else: Founder Ego™. Because Getting People To Unflinchingly, Unquestioningly Do What You Say is also … attractive to a certain personality type? Especially the personality type that enjoys attention and ordering people around? And what is a cult leader, but a short-lived wannabe deity who couldn't develop mass-market appeal?
To be fair, not all AI startups are trying 996. And even those that try don't always succeed. Going back to that Wired article:
Some founders pitch the schedule as an option for their most devoted employees, creating a two-tiered structure where only some employees are expected to work the extra hours. Ritchie Cartwright, founder of the San Francisco–based telehealth company Fella & Delilah, recently posted about a message he’d sent to employees on LinkedIn, outlining his efforts to shift some of his current staff to a 996 schedule. To entice workers to get on board, Fella & Delilah offered a 25 percent pay increase and a 100 percent increase in equity to willing participants. Just under 10 percent of the staff has signed up, the LinkedIn post claimed.
A 100 percent increase in equity? I witnessed the Dot-Com crash up-close, and I can assure you: getting double the shares that are currently worth as-good-as-zero is, well, not so impressive. And these data scientists clearly did the math.
Cash is king.
After the crash
Is the current genAI hype a bubble in the making, or just an extended bull run? Technically we can't say just yet. But I've been thinking about what might happen should it indeed be a bubble, and if that bubble should burst.
Some bubbles simply collapse and leave financial devastation in their wake. Others leave behind usable infrastructure. Like the 1800s railroad boom, which left mile after mile of usable track. An AI bubble might leave us with … I don't know, maybe surplus power and compute resources?
As I'm working on a longer writeup on this topic, this article about Google's datacenter plans caught my eye:
What sets Google apart in this arms race is its strategy of owning the entire technology pipeline, what Pichai calls a “differentiated, full-stack approach to AI.” This means Google not only designs the world’s most advanced AI models but also controls the physical infrastructure they run on.
[...]
This control over the “full stack” creates a powerful competitive moat. While other companies, even major AI labs, must rent their computing power, Google owns the factory. This is why, as Pichai noted, “nearly all gen AI unicorns use Google Cloud,” and why advanced research labs are specifically choosing Google’s TPUs to train their own models. OpenAI recently said that it expected to use Google’s cloud infrastructure for its popular ChatGPT service.
This sounds like an interesting hedge: if AI continues to be A Thing™, Google makes money selling compute power. Which means they get paid even when a customer's AI dreams fall apart. It's not exactly hockey-stick-growth kind of revenue, but it's steady.
And if AI turns out to be a bubble … well … I'll bet you a nickel that Google is already planning other uses for those datacenters. That sounds like an interesting, though slightly stressful, job for a Google insider. What if you're a Google outsider? Figure out what they have in mind, and you might have a handle on the next big tech boom.
Time will tell.
Next up
The next issue will be a special one.
I won't say what's on tap… but… at least one reader already knows what it is. And another has a good guess.
In other news …
- Oh look, a data leak that included images of government-issued ID. Repeat after me: don't collect data that you can't protect. (New York Times)
- Fashion magazine Vogue ran an ad with an AI-generated model. This raises questions, as well as a few eyebrows. (BBC)
- CNIL, France's data protection watchdog, is not cool with facial recognition for age verification. (Le Monde 🇫🇷)
- This article offers some ideas on how to address LLM vulnerabilities. (WSJ)
- Video game images fool UK age verification tools. (PC Gamer)
- Walmart is consolidating its genAI agents. (WSJ)
The wrap-up
This was an issue of Complex Machinery.
Reading online? You can subscribe to get this newsletter in your inbox every time it is published.
Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, which I write about here.
Disclaimer: This newsletter does not constitute professional advice.