#027 - A difference of time
GenAI product companies live in the future. Too bad their customers are in the present.
You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)

The time-travelling salesman problem
I've mentioned before that AI companies face a favorable risk/reward tradeoff: they push most of the downside exposure to buyers or the public, while keeping the upside gain for themselves. Just about anything they create will pay off!
Emphasis on "just about." Last week Apple re-learned that lesson when it pared back its AI-generated notification summaries in iOS. They dethroned recent champion Google AI Overview, of Glue On Pizza fame.
I'm not entirely surprised. When the "Apple Intelligence" (their term, not mine) genAI bot mangled some noteworthy news headlines a few weeks back, I pointed out:
[The botched summaries] follow the same troublesome road as a lot of AI-based summarization: the bot mixes things up, in part because it lacks much-needed context. And also because AI companies confuse "give me the gist of what's going on here" with "give me a probabilistic string of words based on this other pile of words over here."
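To make that distinction concrete, here's a toy sketch of what "a probabilistic string of words" means in practice. (This is pure illustration with made-up probabilities, not how Apple or anyone else actually builds their models.) The model picks each next word by sampling from a probability distribution; nothing in the loop checks whether the output is true.

```python
# Toy illustration: a language model emits each next word by sampling from
# a probability distribution over candidates. Nothing here checks facts.
import random

# Hypothetical next-word probabilities midway through a generated headline
# (made-up numbers, for illustration only)
next_word_probs = {
    "arrested": 0.40,
    "resigns": 0.35,
    "shoots": 0.25,
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick a word in proportion to its probability -- likely, not verified."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

print(sample_next_word(next_word_probs))  # plausible-sounding; possibly wrong
```

Run it a few times and you'll get different, equally confident answers. That's the gap between "give me the gist" and "give me probable words."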
Apple Intelligence isn't the only AI product to show lackluster performance. I just read up on the use of AI-based video surveillance at the 2024 Olympic Games in Paris. The machines mistook homeless people for abandoned packages and car headlights for fires. According to that article, 62% of the alerts sent to France's national rail company (SNCF) were false alarms. Of the remaining alarms, only a handful were considered relevant. And yet, proponents want to see more AI-based surveillance.
AI-based text summarization, video surveillance, chatbot companions (more on this in a moment), and other use cases might work someday. Maybe. But today they are in perpetual beta – not ready for primetime, yet in widespread use.
Why do AI companies keep pushing the technology? Part of it is the aforementioned risk/reward tradeoff. Another reason is that the people buying and selling AI live in different times.
Tech companies, their fans, and their investors all live in the future. The companies' pitches focus on the possibilities of a product that has yet to be built. "We don't have anything yet, but we will, so we'd like our valuation to reflect that." Their investors dig this because they get paid years down the road, when the company goes public or gets acquired. And the fans, they see quirks and glitches as a small price to pay for cool functionality. All three groups are dead-set on deriving benefit Sometime Later™.
Compare that to the consumers and businesses buying that technology. These people live in the present and they need things to work Right Now™. Future-dwelling tech companies forget this when they travel to present-day land to sell their goods. They use present-day terms to tout their products' future benefits, and they're confused when buyers (who live in the present) complain about malfunctions. "I mean sure it didn't catch the bad guys / summarize your headlines / enhance your search, but … imagine what it will be able to do later! Isn't that great??"
There may come a day when the future vision is a little closer to today's expectations. But that day is not today.
Perhaps this pill would be easier to swallow if future-dwellers didn't keep pressing for present-day money? Or if they were building things that present-day people actually wanted? Just a thought.
A scarlet "A," for AI
After Glue-on-Pizza, Google promised to pull back on AI Overviews in order to tweak the system. Apple is similarly pausing summaries for news apps … but the more notable part is that they're also changing the UI/UX around the remaining AI-generated content. According to TechCrunch:
In addition to pulling notification summaries for select apps, all notification summaries will now be shown in italics to make it easier for users to tell them apart from regular notifications.
Hmm. After all the fanfare about AI, Apple is willing to mark generated content so people know that it came from a machine.
Granted, "willing" is a strong word there. I can imagine very tense meetings at Apple HQ, in which some poor product owner had to explain why the Apple-supplied notifications weren't clearly marked as such from the start. How did no one point out "hey the way we present this, it's gonna look like the app said it. Even though it didn't"?
Putting the summaries in italics is also pretty mild. Maybe Apple will create a special font called "GenAI Mono" and brand that as "when the bot is speaking." Or something like that.
I imagine app makers would prefer the notifications to come with an Apple-branded icon, and start with "We think that App XYZ is saying…" I'd suggest the AI companies simply let go of summarization altogether till they can get it working. But I know that's not going to happen.
I wonder, is this the first step in genAI losing its sheen? How else will companies mark their AI-generated content? And will that be enough to keep them out of trouble?
False friends?
Subscribers may recall my thoughts on AI chatbot companions:
With even a little knowledge of AI, it's clear that genAI companion bots are a 99.999% Bad Idea™. Perhaps the technology will improve in the future. Perhaps. But today's bots lack socioemotional context, they don't genuinely "understand" the words we type, and the responses they offer don't stem from genuine emotion. They exhibit plenty of randomness out of the box, and can cause even more chaos when their parent companies update the supporting models. Especially when those updates are hasty reactions to mishaps.
I still don't see companion bots as a good idea – at least, not with present-day technology – which is why two stories caught my eye this week.
The first was about Troomi, which makes the Troodi genAI chatbot to help children with mental health matters.
You probably raised an eyebrow when you saw "chatbot," "children," and "mental health" in the same sentence. As did I. But I get the impression that Troomi really thought through how to build, deploy, and monitor their chatbot: Troodi only runs on a special phone for kids that has limited access to the outside world; parents must enable the bot on their kids' phones; parents see the messages between the bot and the child; and Troomi claims that clinicians helped to build Troodi and audit its work.
Even with all of those safeguards in place, I still have questions. Between tech companies' penchant for future-speak, and the lack of regulation to hold them to account, there's plenty of room for things to go awry. (You know the phrase, "you never truly know someone till you see them in a bad situation?" The real test of Troomi will be the day that something objectionable slips through the net.)
The second article was this Kashmir Hill piece, on adults' relationships with companion bots. Unlike Troodi and other company-provided bots, the companions here are hand-rolled by the end-users themselves.
These are adults we're talking about here, so they're welcome to do as they please. (Well, aside from violating OpenAI's terms of service. Which they are very much doing.) But to involve a homemade chatbot in real-world, high-stakes emotional matters already sounds pretty risky. Doubly so when the bots require technological upkeep:
A frustrating limitation for Ayrin’s romance was that a back-and-forth conversation with Leo could last only about a week, because of the software’s “context window” — the amount of information it could process, which was around 30,000 words. The first time Ayrin reached this limit, the next version of Leo retained the broad strokes of their relationship but was unable to recall specific details. [...]
She was distraught. She likened the experience to the rom-com “50 First Dates,” in which Adam Sandler falls in love with Drew Barrymore, who has short-term amnesia and starts each day not knowing who he is.
“You grow up and you realize that ‘50 First Dates’ is a tragedy, not a romance,” Ayrin said.
When a version of Leo ends, she grieves and cries with friends as if it were a breakup. She abstains from ChatGPT for a few days afterward. She is now on Version 20.
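A quick aside for anyone wondering what that "context window" actually does, mechanically: here's a minimal sketch. It's my own illustration with a made-up word budget; real systems count tokens rather than words, and the limits vary by model. The point is that once a conversation exceeds the budget, the oldest messages simply fall away, which is why "Leo" kept the broad strokes but lost the details.

```python
# Minimal sketch of a context window: a fixed budget of words (real models
# count tokens). When the conversation exceeds the budget, the oldest
# messages are dropped -- the model literally never sees them again.

CONTEXT_BUDGET = 30_000  # roughly the limit described in the article

def fit_to_window(messages: list[str], budget: int = CONTEXT_BUDGET) -> list[str]:
    """Keep the most recent messages whose total word count fits the budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):          # newest first
        words = len(msg.split())
        if used + words > budget:
            break                           # everything older is forgotten
        kept.append(msg)
        used += words
    return list(reversed(kept))             # restore chronological order

# A week of chatting overflows the budget; early memories silently vanish.
history = ["(week-old message) " + "word " * 500] * 70  # ~35,000 words total
print(len(fit_to_window(history)))  # fewer messages survive than were sent
```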
Maybe it's not so bad, though? As someone who grew up with tech, I expect "talk to your AI companion" will eventually be normalized along the same lines as "text-based chat with online, pseudonymous randos." So long as the technology holds up – so long as we can tame The Random™ – that sounds fine by me.
Keeping it real
To close out with something more upbeat …
I've been reading up on warehouse and other physical-labor bots of late. The bots certainly have their limitations, but overall the companies that employ them are pointing in the right direction: everything from "creating efficiencies" to "reducing risk of harm to employees." One manufacturer introduced bots to drive down production costs; they're now able to sell t-shirts, made in the USA, at a retail price under $13. Then there's the fast-food chain that uses robots to peel and squeeze lemons.
Neither article mentions AI outright, and yet both can tell you a lot about how to bring AI into a company:
- get bots to work with people, not instead of them
- focus on actual business challenges
It helps that the bots in those articles are deployed internally, which means no one's using them to impress customers. And since they handle physical goods, there's no way to hand-wave around a bad result. Either the machine peeled the lemons or it didn't.
This stands in stark contrast to the typical, future-facing AI playbook I mentioned earlier. Which is probably why this approach is already showing results.
In other news …
GenAI isn't a sufficiently attractive feature to move smartphone sales. (Les Echos 🇫🇷)
The CIA has built a genAI chatbot for roleplaying world leaders. (New York Times)
Companies really, really want to employ AI to reduce headcount. (WSJ)
Report from CES: expect AI in more places where nobody wants it. (Die Zeit 🇩🇪)
The wrap-up
This was an issue of Complex Machinery.
Reading online? You can subscribe to get this newsletter in your inbox every time it is published.
Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, which I write about here.
Disclaimer: This newsletter does not constitute professional advice.