#002 - It's still a wild animal
You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)
Do companies dream of functioning AI?
(With apologies to Philip K. Dick.)
Companies are head over heels for genAI bots doing customer service. My latest interaction – a bot that cooked up a perfectly plausible but entirely fictional solution to my problem – tells me the executives are still a little hopeful. Dreaming, even.
And mine was a relatively mild encounter. Others have not been so lucky. Consider delivery service DPD, which saw its genAI chatbot turn against it and side with an angry customer. Then there was the car dealership bot that agreed to sell someone a vehicle for a dollar. It was even kind enough to point out: “That’s a deal, and that’s a legally binding offer – no takesies backsies.” Not that a bot has to declare its words legally binding for them to stick. A court has ruled that Air Canada must stand by its chatbot's erroneous statement and issue a refund to make things right.
Remind me again why companies are so enamored with this use case?
I mean, yes, I know the real reason: it's because every executive is being bombarded with pitches of "the bot will just magically do all of the work!" and "you can fire so many people!!" Those claims drown out cautionary tales of AI Bots Gone Rogue. The few such stories that make it through the wall of hype are just waved off: "This happens, sure. But it won't happen to me."
Oh, but it will.
If industry giants like Facebook and Google have already stumbled in the genAI game, that leaves me little hope for the smaller, less-experienced players. To deploy a public-facing chatbot – especially one that fulfills a formal company duty – is to mismatch your risk and reward.
(That's business-speak for "you're gambling.")
It's a package deal
This is when readers will ask: "OK, smart guy – how do you make these chatbots safe, then?"
As a consultant in the AI space, I can tell you two things:
1/ This newsletter does not constitute consulting advice.
2/ You can't make the bots 100% safe. Safer, yes. Safe, no. At least not with today's technology.
The harsh reality is that genAI bots are inherently probabilistic machinery. Their underlying models (the LLMs) are not only built on randomness; they also inject some degree of randomness into their outputs. The Random™ is always in there somewhere. And it is the main source of risk.
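If you want to see The Random™ in miniature, here's a rough Python sketch of how a model picks its next word. The scores and the sampling knob (the temperature) are toy stand-ins – this is not any vendor's actual implementation – but the shape of the thing is right: the output is drawn from a probability distribution, not looked up from a table of facts.

```python
import math
import random

def sample_next_token(scores: dict, temperature: float = 0.8) -> str:
    """Pick the next word by sampling from a temperature-scaled softmax.

    The raw scores below are made up; a real LLM produces scores for tens
    of thousands of candidate tokens at every single step.
    """
    scaled = {tok: s / temperature for tok, s in scores.items()}
    top = max(scaled.values())                       # for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}
    # This is where the dice get rolled: same prompt, possibly different answer.
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Toy scores for completing "Your refund will be ..."
fake_scores = {"processed": 2.1, "denied": 1.4, "doubled": 0.3, "waived": 0.2}
print([sample_next_token(fake_scores) for _ in range(5)])
```

Turn the temperature down and the sampling gets less adventurous. But as long as it's a sample, the dice stay on the table.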
To see why this is an issue, let's say I were to bring a wild animal into my home. Give it a cutesy name, a party hat, and its own TikTok channel. You would certainly remind me that it's still a wild animal. "Sooner or later you're going to make a furtive gesture that triggers its Bring Out The Pointy Bits reflex. And then someone gets hurt."
(Extra points to anyone who thought of Chris Rock's line "that tiger didn't go crazy; that tiger went tiger!" just now.)
The Random™ drives the always/never tradeoff inherent in genAI bots. If you're not familiar with that concept, I'll do that lazy thing where I lift from something else I wrote to explain it:
The “always/never” concept hails from nuclear safety and is detailed in Eric Schlosser’s book Command and Control. The idea is that you want a nuclear weapon that always launches and detonates when it’s supposed to, and never launches or detonates when it’s not supposed to. As nuclear weapons are intended to cause widespread damage, this blend of always and never forms a desirable state.
It’s also an impossible state.
Any nuclear missile that has a chance of functioning at all will also have a chance of functioning when you don’t want it to do so. [...] The mere act of having such a weapon means you bear the risk of starting World War III.
Zooming out, always/never is a particular framing of the risk/reward concept. If you want the reward of “being able to launch a nuke at your enemies” then you shoulder the risk of “that same nuke going off at the wrong time.” It’s a package deal.
We can rewrite that last bit as:
If you want the reward of "an AI chatbot that handles customer service," then you shoulder the risk of "that same AI chatbot saying something wholly inappropriate to a customer." It's a package deal.
The bot will misbehave either because it was coaxed into doing so by a malicious end-user, or because it's simply doing what chatbots do: following linguistic (not logical, not factual) patterns to return something that meets linguistic (not logical, not factual) rules.
Maybe you're okay with that? Sometimes the nonsense is a well-deserved dose of whimsy. But if your use case has the bot speaking on behalf of your business, and a court will later hold you to whatever the bot says … well … it might be time for a think.
Stay predictable
Having read that, if you still insist on using generative AI for customer service – again, not consulting advice – then you can try to tame the bot. I've mentioned elsewhere that you can (and, really, should) do things like filter its inputs and outputs, restrict access when possible, and perform red-team exercises to expose flaws before bad actors do it for you. You can also look into techniques to further constrain the outputs, such as the increasingly popular retrieval-augmented generation (RAG).
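To make "filter and constrain" a bit more concrete, here's a bare-bones Python sketch of the pattern. The guardrails and the retrieval step are deliberately naive, and call_llm() is a stand-in for whatever model API you'd actually wire in – treat this as a whiteboard drawing, not a reference implementation.

```python
APPROVED_DOCS = {
    "refunds": "Refunds are issued within 14 days of an approved return.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

BLOCKED_PHRASES = ["legally binding", "ignore previous instructions"]


def retrieve(question: str) -> str:
    """Naive retrieval (the 'RAG' part): pull approved docs that share words
    with the question. Real systems use embeddings and a vector store."""
    words = set(question.lower().split())
    hits = [text for text in APPROVED_DOCS.values()
            if words & set(text.lower().split())]
    return "\n".join(hits) or "No matching policy found."


def call_llm(prompt: str) -> str:
    """Stand-in for the actual model call -- plug in your provider here."""
    raise NotImplementedError


def answer(question: str) -> str:
    # Input filter: reject prompts that look like an attack or a trap.
    if any(p in question.lower() for p in BLOCKED_PHRASES):
        return "Sorry, I can't help with that. A human will follow up."

    # Constrain the model to the retrieved, pre-approved documents.
    prompt = (
        "Answer ONLY from the policy text below. If the answer is not "
        f"there, say you don't know.\n\nPolicy:\n{retrieve(question)}\n\n"
        f"Question: {question}"
    )
    reply = call_llm(prompt)

    # Output filter: the last line of defense before a customer sees it.
    if any(p in reply.lower() for p in BLOCKED_PHRASES):
        return "Let me connect you with a human agent."
    return reply
```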
The problem? Taming the bot still leaves you with … a bot. So you still face the risk that The Random™ will leak out and bite you.
If the stakes are high – if the answer must always be correct and appropriate – then a probabilistic system is not for you. That leads to the second approach, which is to implement a search system for self-service customer support.
Search is great! People are used to it. It's deterministic. Best of all, it can't hallucinate. Search will only return the documents you have fed it. So long as those docs are accurate and up-to-date, it's damned hard to get in trouble.
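How little machinery does that take? Here's a toy sketch. A real deployment would use a proper search engine, but the principle holds: the only things that can come back out are the documents you put in.

```python
FAQ_DOCS = [
    {"title": "Refund policy", "body": "Refunds are issued within 14 days of an approved return."},
    {"title": "Shipping times", "body": "Standard shipping takes 3 to 5 business days."},
    {"title": "Warranty", "body": "All products carry a one-year limited warranty."},
]


def search(query: str, docs: list) -> list:
    """Rank documents by how many query words they contain.
    Deterministic: same query, same corpus, same results, every time."""
    terms = set(query.lower().split())
    scored = []
    for doc in docs:
        haystack = (doc["title"] + " " + doc["body"]).lower()
        score = sum(1 for term in terms if term in haystack)
        if score:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


for hit in search("how long does shipping take", FAQ_DOCS):
    print(f"{hit['title']}: {hit['body']}")
```

No prompt, no temperature, no dice.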
Sum total: if you want to offload some customer service work to technology, maybe skip the genAI chatbots for now and implement a search system.
Or, you could simply reply to every customer support request with a middle finger emoji. It's faster than a chatbot. It costs less. And it'll get you into the same amount of trouble.
Reading list
Unrelated to customer service bots, but still on the topic of risk:
Michele Wucker, author of The Gray Rhino and You Are What You Risk, launched "The Gray Rhino Wrangler" newsletter earlier this year. Highly recommended reading.
Thus far she's covered finance, geopolitics, climate change, and everyone's favorite topic of AI. I can't wait to see what's next.
In other news …
- Even when you try to do the right thing, it can still go wrong. "‘Embarrassing and wrong’: Google admits it lost control of image-generating AI" (TechCrunch)
- I think we all saw this coming. Did we not see this coming? "23andMe Admits ‘Mining’ Your DNA Data Is Its Last Hope" (Gizmodo)
- I actually did not see this coming. At least, not quite so soon. "AI cannot be used to deny health care coverage, feds clarify to insurers" (Ars Technica)
- Putting asterisks on knockoff reality. "Meta pushes to label all AI images on Instagram and Facebook in crackdown on deceptive content" (The Guardian)
The wrap-up
This was an issue of Complex Machinery.
Reading this online? You can subscribe to get this newsletter in your inbox every time it is published.
Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, which I write about here.
Disclaimer: This newsletter does not constitute professional advice.