#022 - Does this count?
When the computer says it happened, it happened. Right?
You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)
What follows is the segment – more like "short novel," really – that would have pushed the previous newsletter well over the word limit. My decision to postpone turned out well: it gave me a chance to point readers to someone else's work. You'll see what I mean in a moment.
—
Flawed data is an underappreciated risk in AI systems. Even with the fanciest modeling techniques at your disposal, a small error that worms its way into the training data can trigger a butterfly effect of skewed predictions down the road.
Manual data collection has a well-earned reputation as a source of errors – everything from typos to people misusing fields to poor form design. Then you have upstream data vendors, who either mistakenly or intentionally corrupt the data they provide you. And let's not forget your internal systems. A code bug here or there can quietly scribble problems across your business data for years before anyone notices.
(The irony is that companies turn to vendors or develop internal systems precisely to avoid the problems of manual data collection. There's always a tradeoff. But I digress …)
Today I'm exploring internal data collection problems through the lens of a YouTube experience. When I was preparing this segment for the previous newsletter, Randy Au was writing about "antagonistic" data from external sources. We managed to achieve topic overlap without any coordination between us. The idea of untrustworthy data must have been in the air.
With technology as your witness …
YouTube has recently introduced (or maybe, "I just recently noticed") a feature called Continue Watching. It's a way to, well, continue watching videos that you didn't finish. I scrolled through it and noticed that of the videos listed … several of them I had never watched. Hadn't even seen the thumbnails in my feed.
The videos were all from channels I had watched before. So for a moment I figured YouTube had adopted a very loose definition of "continue." But one of the thumbnails sported a progress bar indicating I had watched a good quarter of a video that I had never clicked. Which tells me that this was, in fact, meant to let me pick up a video where I'd left off.
Except I was being offered the chance to pick up from a place I'd never been.
Hmm.
There was nothing scandalous in there, mind you. But I still wondered how these miscounted views might feed into other systems. Would they drive future video recommendations? Or advertising? Or anything else?
Misplaced faith
On the one hand, I've worked in the tech space long enough to write this off as a simple glitch.
On the other hand, I've worked in tech long enough to know that society holds computerized records in high regard. Consider:
We believe what's been written. We've been conditioned to expect our every click, webpage load, screen tap, and ebook swipe will be saved to a database, turning providers' computers into low-grade systems of record. We then tell ourselves that such an action would only land in a database if it had actually happened. Right? Maybe?
Maybe not.
Consider the UK Post Office's Horizon scandal. Errors in a new accounting system falsely accused subpostmasters of theft, leading to more than 900 wrongful convictions. Those who avoided prison terms still had their lives turned upside-down by the investigations and unwanted attention. All in the name of The Computer Must Be Right.
We accept that we'll only get one side of the story. It's reported that the Horizon problem lasted for so long because Post Office officials chose to cover up known flaws in the system. The understated flip-side of that coin is that the subpostmasters had no way to independently investigate the system that had accused them of wrongdoing. The official word did not match reality, but the courts went with it anyway.
We put faith in what the machine has predicted. As an accounting system, Horizon provided sums, roll-ups, and breakdowns of transactions – the simple arithmetic of business intelligence (BI) tools. Modern-day AI adds a new dimension to We Believe The Computer. People grant those predictions extra reverence because they come out of the magic black box.
That takes me to a story I shared in the previous issue, about hospitals using genAI systems to transcribe doctor/patient interactions. Those transcription errors were easy to spot because they were so wide of the mark. What happens when newer models create still-erroneous yet more-believable transcriptions?
Errors propagate. And compound. Errors from one model will feed into another model, which will in turn feed into other tools, and so on. Those AI-based transcriptions pave the way for a Kafkaesque medical experience, in which The System pushes us into treatments for ailments we don't have while denying treatments for what we actually suffer.
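To put rough numbers on that compounding – and these are invented numbers, purely for illustration – consider what happens when each stage of a chain quietly corrupts even a small fraction of records. The share that makes it through the whole pipeline untouched shrinks faster than you'd expect.

```python
# Toy illustration with made-up error rates -- not measurements of any
# real system. Each stage of a pipeline (say: transcription -> model ->
# downstream tool) silently corrupts a small fraction of records.

def clean_fraction(error_rates):
    """Fraction of records that survive every stage without corruption."""
    clean = 1.0
    for rate in error_rates:
        clean *= (1.0 - rate)
    return clean

# Three stages, each "only" wrong 2-5% of the time.
stages = [0.02, 0.05, 0.03]
print(f"Records untouched after the full chain: {clean_fraction(stages):.1%}")
# -> about 90.3%. Nearly one record in ten now carries an error, even
#    though no single stage looked alarming on its own.
```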
It's all subject to manipulation. Our increased reliance on machines – from plain old BI, to dashboards, to AI models – increases the risk of people tampering with upstream processes to influence outcomes. Last month ByteDance, TikTok's parent company, fired an intern for allegedly poisoning its AI models. Researchers have noted the rise of generative engine optimisation (GEO), a cousin of search engine optimisation (SEO) that attempts to influence what winds up in genAI search summaries. Shady-yet-still-technically-legal GEO helps marketers promote their wares. But if the practice has reached the licit business world, it's guaranteed that bad actors are using it for darker purposes.
For a slightly more altruistic form of data manipulation, some Redditors poisoned Google's AI search results by creating fake restaurant reviews. Their goal? To keep social media influencers and tourists out of their favorite spots.
A healthy sense of doubt
So can we believe the computers when they claim an event has taken place?
I'll give that a no-and-yes.
No, because computers are vulnerable to tampering – beyond the aforementioned data poisoning, people can deliberately embed social bias or modify records after the fact. That's above and beyond the bugs that exist in all software. Data pipelines included. And data display, as well. A simple calculation error can alter figures on the way to your app screen or browser tab.
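Here's a small, entirely made-up example of that last point – the kind of bug that never crashes, never logs an error, and still changes the number on the screen. The metric and figures are mine, chosen to echo the miscounted-views story above.

```python
# Invented example: one wrong denominator quietly changes a dashboard
# figure. Imagine a "watch time per viewer" widget that divides by total
# play events (including phantom, mis-logged plays) instead of unique
# viewers. All names and numbers here are hypothetical.

watch_seconds = 54_000      # total recorded watch time
unique_viewers = 120
play_events = 180           # retries, autoplays, and views that never happened

per_viewer_correct = watch_seconds / unique_viewers  # 450.0 seconds
per_viewer_buggy = watch_seconds / play_events       # 300.0 seconds

print(f"correct: {per_viewer_correct:.0f}s per viewer")
print(f"buggy:   {per_viewer_buggy:.0f}s per viewer")
# The buggy figure understates engagement by a third. Nothing crashes,
# nothing throws an error -- the number just looks plausible on screen.
```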
Yes, because digital record-keeping is the only solution for a world that collects this much data. It has its flaws. But so does every other possible system. We're stuck with computers as systems of record until something better comes along.
The key to surviving this world is to understand the flaws and act accordingly. You know how experienced data practitioners second-guess every analysis and every model test? Especially when the results look too clean? You're welcome to borrow that idea. Consider this permission to express doubt when all we have to go on is "well this is what the computer says."
We can do more than investigate the raw data behind an automated decision or action; we can demand an explanation of how that data came to be, how it was transformed, and how it was stored. If we do that, the black box of digital data collection gets a little more gray.
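What might that look like in practice? Here's a minimal sketch of record-level provenance – the field names and structure are mine, not any standard – where each record carries a note of where it came from and what was done to it.

```python
# Minimal provenance sketch: each record remembers its source and the
# transformations applied to it. Field names are illustrative only.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenancedRecord:
    value: float
    source: str                          # e.g. "pos-terminal-17" (hypothetical)
    collected_at: datetime
    transformations: list[str] = field(default_factory=list)

    def apply(self, name: str, fn):
        """Apply a transformation and record that it happened."""
        self.value = fn(self.value)
        self.transformations.append(name)
        return self

record = ProvenancedRecord(
    value=104.37,
    source="pos-terminal-17",
    collected_at=datetime.now(timezone.utc),
)
record.apply("usd_to_eur", lambda v: v * 0.92)

# When a downstream figure looks suspicious, the trail answers
# "where did this come from, and what touched it along the way?"
print(record.source, record.transformations, round(record.value, 2))
```

Nothing fancy. But it's the difference between "the computer says so" and "here's how the computer came to say so."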
Remember: computers only have an air of authority because we give it to them.
In other news …
I recently published a short post called "You're terrible at AI". I think you'll enjoy it.
Wall Street Journal readers shared how they're using genAI for work and personal matters. And I have to say, I find the list … a little worrying in places? This feels like a lot of trust to place in a chatbot. (WSJ)
Remember that Depeche Mode song, "Personal Jesus"? An AI-backed confessional takes it too far. I'll have more to say about this next time. (The Independent) (There's also a video from German news site DW. 🇩🇪)
Anthropic CEO Dario Amodei calls for mandatory AI model testing. (Bloomberg – via Yahoo! news)
Microsoft makes it easier to switch between AI models. This might not sound like much, but AI infrastructure is a space that deserves far more attention than it currently gets. (Bloomberg)
Amazon searches through the couch cushions to hand Anthropic $4B in investment. (New York Times)
"A bot walks into an online store…" AI meets e-commerce. (Der Spiegel 🇩🇪)
An interview with Marie-Laure Denis, head of France's data regulator CNIL. (Le Monde 🇫🇷)
Need an AI clone of yourself? (MIT Technology Review)
The wrap-up
This was an issue of Complex Machinery.
Reading this online? You can subscribe to get this newsletter in your inbox every time it is published.
Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, which I write about here.
Disclaimer: This newsletter does not constitute professional advice.