#005 - Broken robots and hidden expenses
You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)
I've been thinking about the list of high-profile AI mishaps, including the one involving the official NYC chatbot. Maybe it's time for me to grab some of that easy chatbot money?
Not building chatbots, mind you. That'd still require effort. Like having to put on a passable "oh how did that happen? Please pay me more to fix it" face when the things inevitably flop. No, it'd be much easier to place bets against upcoming AI projects. These could be run as simple bar bets, prediction markets, whatever.
Am I serious? No. (Maybe.) But if you look at deploying AI solutions from an investment perspective – that is, placing money on a possible future outcome – then it gets easier to see where these companies are stumbling. They're vastly underpricing the risk of using AI, turning their bets into outright gambles.
To see how, we can borrow a page from a trader's handbook:
Pricing out your risk
Trading is a game of balancing risk and reward: you make your money by being right about some future outcome; but you keep that money by managing your risk.
Step 1 in managing that risk is to do a ton of homework on the investment. You want to know all about it so you can understand the payoff characteristics (the return), where things might go wrong, how bad they'll get, and the likelihood of all that happening.
(Meb Faber's chat with Drew Dickson drives that point home. Notice how Dickson keeps asking: How can I be wrong about this? What does the other side of this trade know that I don't? That kind of healthy introspection is how you stay alive in the financial markets.)
Step 2 is to find some alternatives. Where else could you put your money? Maybe there's some less-risky bet, like the so-called "risk-free rate of return" of a super-safe investment? (It should probably be called "rate of return for the lowest-risk alternative" since nothing is truly risk-free … but I get that "risk-free rate" rolls off the tongue more easily.) The risk-free rate may yield less money than something spicier – there's that risk-reward tradeoff again – but it'll be less of a nail-biter and maybe that matters to you.
Step 3 is to compare Steps 1 and 2. Does your intended investment properly compensate you for the risk it carries? Is it head and shoulders above the alternative investment as far as returns, with similar risk? Great, go for it. Does your intended investment show the same expected returns as the alternative, but with a greater chance of failure? Skip it. Go for the alternative.
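If it helps to see Step 3 as a decision rule, here's a minimal Python sketch. Everything in it is my own illustration, not a trader's actual toolkit: the `Bet` class, the `pick` function, and the `margin` and `tolerance` thresholds are hypothetical names and values, and the return and failure figures are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bet:
    name: str
    expected_return: float  # hypothetical estimate from Step 1, e.g. 0.08 for 8%
    failure_chance: float   # hypothetical estimate from Step 1, between 0 and 1


def pick(intended: Bet, alternative: Bet,
         margin: float = 0.05, tolerance: float = 0.01) -> Optional[Bet]:
    """Apply the two Step 3 rules; return None when neither rule applies."""
    similar_risk = abs(intended.failure_chance - alternative.failure_chance) <= tolerance
    similar_return = abs(intended.expected_return - alternative.expected_return) <= tolerance

    # Rule 1: much better returns at similar risk. Go for it.
    if similar_risk and intended.expected_return - alternative.expected_return >= margin:
        return intended

    # Rule 2: same expected returns, greater chance of failure. Take the alternative.
    if similar_return and intended.failure_chance > alternative.failure_chance:
        return alternative

    # Anything else is a judgment call that these two rules don't settle.
    return None


# Made-up numbers, purely for illustration.
safe = Bet("lowest-risk alternative", expected_return=0.04, failure_chance=0.01)
spicy = Bet("spicy bet", expected_return=0.04, failure_chance=0.30)
choice = pick(spicy, safe)
print(choice.name if choice else "judgment call")
```

The sketch returns `None` for the messy middle (say, higher returns and higher risk) because that's where the real homework lives. The code is only as good as the Step 1 estimates you feed it.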
Simple, right?
It should be. But when it comes to AI, it hasn't been.
This time, it's (not) different
Consider the most popular AI use case: chatbots, also known as conversational AI. This is where a bot summarizes documents for an end-user during an interactive question-and-answer session.
This may seem like a great idea for your business. Think of the money you'll save on customer service!
But to be diligent, you start your homework (Step 1) and learn that the bots may not properly summarize the documents in question. They pick up on grammatical patterns, not facts, which leads them to sometimes "hallucinate" answers that sound nice but are completely wrong. With a little more research you learn that the techniques to address hallucination may influence the output, but not completely control it. Hmm. This was not the magic you were promised in the AI brochure.
From there, you look for a lower-risk alternative (Step 2). You see that Plain Old Search has been around for ages. It isn't as flashy as a chatbot but it's guaranteed to not hallucinate. It's about as low-risk as you can get.
As you compare the risk/return tradeoffs in those options (Step 3), you see:
- An AI chatbot has so-so returns and extremely high risk (it may go off the rails).
- Plain Old Search has pretty good returns and near-zero risk (because it's just retrieving documents you've written).
It's clear that Plain Old Search is the better deal! Easy.
But that only works if you actually did your homework in Step 1. If you instead half-assed your way through the analysis, you'll have convinced yourself that the chatbot is high-return and low-risk. Plain Old Search is for suckers and luddites, you say. AI chatbots are where it's at.
And this, dear reader, is why we keep witnessing these chatbot mishaps. Companies fail to size up their risk and wind up releasing their terrible AI investment into the wild.
(Cue DJ Khaled's "Congrats, you played yourself." I suppose "And another one" would also fit here…)
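To put rough numbers on how skipped homework flips the decision, here's a back-of-the-envelope sketch in Python. Every figure is invented for illustration (the savings, the cost of a public flop, and both failure probabilities), and `expected_value` is a hypothetical helper of mine, not a standard formula from any library.

```python
def expected_value(savings: float, failure_cost: float, p_failure: float) -> float:
    """Savings if things go right, minus the cost if they go wrong, weighted by odds."""
    return (1 - p_failure) * savings - p_failure * failure_cost


# All numbers below are hypothetical, chosen only to show the arithmetic.
search = expected_value(savings=50_000, failure_cost=0, p_failure=0.0)

# Same chatbot, same payoff and downside. Only the failure estimate changes.
chatbot_honest = expected_value(savings=100_000, failure_cost=500_000, p_failure=0.20)
chatbot_wishful = expected_value(savings=100_000, failure_cost=500_000, p_failure=0.01)

for label, value in [("Plain Old Search", search),
                     ("Chatbot, honest estimate", chatbot_honest),
                     ("Chatbot, wishful estimate", chatbot_wishful)]:
    print(f"{label}: {value:,.0f}")
```

With these made-up inputs, the honest estimate puts the chatbot in the red while the wishful estimate makes it look like it beats Plain Old Search. The bot didn't change. Only the homework did.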
To the future
Granted, this won't last forever.
The technology and safeguards should improve over time. That would change the comparison to the risk-free alternative of Plain Old Search and make chatbots a genuinely more attractive bet.
But that's just one possible path and it's a ways off. Today, we still have bots that simply aren't up to the job. And we'll see plenty more chatbot failures before this game is over.
If you're on the outside, that means betting against AI chatbots is a pretty good deal. And if you're on the inside, running a company that plans to release a chatbot … well … remind yourself that education and discipline are cornerstones of risk management. They'll spare you plenty of cash and embarrassing headlines.
What if you're reading this in the future, when AI chatbots are safe and stable? When it's too late to place easy bets against the technology? Fear not. Expect a similar story to play out on the next big AI use case. And then the next emerging-tech use case after that. Because nothing drives a bet quite like Corporate FOMO ™.
Falling into a hole
On a related note …
When you build an AI product you need to test it against misuse. That includes a good dose of red-teaming for anything the public will touch.
In the spirit of Show, Don't Tell, here's a recent example:
AMC Theatres released a limited-edition popcorn bucket as a tie-in with the latest Dune movie. The problem? Some people think it looks like a … particular body part.
I thought it was just SNL's writing team but I've since learned that the Dune Popcorn Bucket Looks Like An Orifice view is rather widespread. Yet, somehow, the AMC Theatres corporate HQ had no idea this might happen:
"We would have never imagined the Dune thing. We would have never created it knowing it would be celebrated or mocked," AMC Chief Content Officer Elizabeth Frank told Variety about the overwhelming response to the sandworm bucket.
Hmm.
This does raise the question of how AMC could have red-teamed a pop-culture consumer collectible. Conventional wisdom says to run your idea past a group of adolescent boys to spot innuendo and inappropriate angles.
But that's amateur hour. For a real stress-test, get yourself some comedians. They're older. They're more cynical. And their entire job is to explain why some seemingly pedestrian situation is, in fact, ridiculous.
(I believe it was Russell Peters who noted that normal people just say "oh wow, that's messed up" while comedians will say "that's messed up … how can I make it worse?")
What's the equivalent of comedians for AI chatbots? I don't know. "A random sampling of internet yahoos" would be my first guess. But I'm sure there's an even better idea out there. The person who figures that out, and who creates a business to red-team AI chatbots, will make a mint. You heard it here first.
Recommended reading
Allison Schrager's Bloomberg article "The Introverts Have Taken Over the US Economy" is a fun read.
(If you don't have a Bloomberg subscription, she also discusses the article on The New Bazaar episode "Is the Introvert Economy here to stay?")
The title reads like a treatise on the modern dating scene. It's actually a way for Schrager to do her usual of applying financial concepts to everyday life. Related to my tirade about AI chatbots, both the article and the podcast interview explore what it means to place informed bets as framed by the risk-free rate.
For more of Schrager's work, I recommend her book on risk, An Economist Walks Into a Brothel.
In other news …
- Not every company is diving head-first into genAI. "Generative AI Isn’t Ubiquitous in the Business World -- at Least Not Yet" (WSJ)
- Texas threw some educators for a loop when it announced that an AI system would grade certain exams. French: "Au Texas, une IA va corriger des copies pour économiser le salaire des professeurs" (Les Echos) / English: "How Texas will use AI to grade this year’s STAAR tests" (The Texas Tribune)
- Meta plans to release the latest version of its "Llama" LLM for everyone to use. What's the over/under on the misuse the day after? "Meta confirms that its Llama 3 open source LLM is coming in the next month" (TechCrunch)
- Apparently, models are getting safer already. Is it already too late for my bet against AI chatbot projects? "Google Shows AI Model Is Enterprise-Ready After Gemini Mishaps" (Bloomberg)
The wrap-up
This was an issue of Complex Machinery.
Reading online? You can subscribe to get this newsletter in your inbox every time it is published.
Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, which I write about here.
Disclaimer: This newsletter does not constitute professional advice.