Complex Machinery

June 25, 2026

#064 - What does genAI do well?

In search of nails for the world's most popular hammer.

You're reading Complex Machinery, a newsletter about risk, AI, and related topics. (You can also subscribe to get this newsletter in your inbox.)

Three metal nails, one of which is bent, on a rusty surface. (Photo by Varshil Changani on Unsplash)

Web3 was the hottest technology just a few years ago. Individuals and companies alike were scrambling to get their piece of crypto, NFTs, and other blockchain-based products. I'd been exploring the space myself – eventually logging it all in a weekly newsletter – and in 2023 I penned an article on the hunt for web3's killer use case.

Having dethroned web3 shortly thereafter, genAI is now living that same story: it's the hot technology built on vague terminology and a mad dash for money. And just like web3, it's hounded by scams and ethical issues. Not to mention a quest for electricity, hardware, and government support.

I was planning to write about the hunt for genAI's killer use case, that One Big Thing that would drive adoption and leave us wondering how we ever lived without it. But it occurred to me that genAI is missing bread-and-butter use cases, as well. That's hardly a sin on its own. But it becomes a problem when you compare the excitement and investment in genAI to actual, road-tested uses.

What's in a name?

AI has long been a hazy, ill-defined term. For a while it was an umbrella for any kind of advanced data analysis – that mix of so-called "classical" machine learning techniques, home-rolled neural networks, and smaller-scale generative techniques like Markov bots and restricted Boltzmann machines.

When I pondered the next rebranding of the data field, it didn't occur to me that it would keep the name but change the underlying meaning. The newcomer soon erased the previous decade-plus of its predecessor, leading to confusing headlines claiming that venture capitalists were just now excited about AI. "Wait, but, weren't they excited a few years ago? Right before crypto grabbed their attention?"

Even the newer definition of "generative AI" covers a lot of ground. We have interactive chatbots, image and text generators, agents (another term with vague meaning), and whatever else the field has cooked up by the time this article lands. For brevity I'll say that "genAI" is anything backed by an LLM, and that the unadorned "AI" covers both machine learning and the generative-LLM space.

The distinction is important. It's not just about model size. It's about what these technologies do.

Painting lanes

ML/AI models are mostly used for predictions. You hand a model a data record and, based on patterns in its training data, it will hand you a target value. This simple concept has been applied to document classification, price predictions, and a host of other things that people want and need to do.

Generative AI models are predictive models in reverse. You hand a model a target value and it gives you a data record that could have plausibly fit in with the rest of its training data. Once the novelty of generating images wore off, businesses started looking for other ways to use this reverse-ML approach.
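To make the two directions concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the four training records, the `predict` and `generate` names, and the word-counting approach, which is a stand-in for a real model and nowhere near how an LLM works internally. It only shows the shape of the exchange: record in, target out, and then the reverse.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy training data: (data record, target value) pairs.
TRAINING = [
    ("refund my order please", "billing"),
    ("charge on my card twice", "billing"),
    ("app crashes on login", "technical"),
    ("login page shows error", "technical"),
]


def word_counts_by_target(pairs):
    """Count how often each word appears under each target value."""
    counts = defaultdict(Counter)
    for text, target in pairs:
        counts[target].update(text.split())
    return dict(counts)


COUNTS = word_counts_by_target(TRAINING)


def predict(record):
    """Predictive direction: data record in, target value out.

    Scores each target by how often the record's words appeared
    under that target in the training data, then picks the best.
    """
    words = record.split()
    scores = {t: sum(c[w] for w in words) for t, c in COUNTS.items()}
    return max(scores, key=scores.get)


def generate(target, length=4, seed=0):
    """Generative direction: target value in, plausible record out.

    Samples words in proportion to how often they appeared under
    the given target. The result resembles the training data but
    is not guaranteed to make sense.
    """
    rng = random.Random(seed)
    counter = COUNTS[target]
    words = list(counter)
    weights = [counter[w] for w in words]
    return " ".join(rng.choices(words, weights=weights, k=length))
```

Calling `predict("login error")` hands back a target, while `generate("billing")` hands back a four-word record stitched together from words seen under that target. The generated text need not make sense, which is rather the point: it only has to look like it belongs with the training data.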

The quest for use cases started by doubling back onto ML/AI's turf. Which might be one reason why fans of genAI are so eager to erase its predecessors. I remember executives bragging that genAI let them "unlock their unstructured data." Except ... ML/AI could already do that. I know this because I've spent several years getting machines to make sense of text. The move from bag-of-words to neural nets provided a meaningful jump in performance, even when you factored in the added R&D and operational costs. Moving from homegrown neural nets to LLMs has not gone as well.

Generative AI only does a "better" job there in the sense that you get answers back faster. But since you didn't curate the training data and build the model yourself, you don't know what went into the answer. So if speed is your only metric you could skip LLMs altogether – coin flips would work wonders.

Just give me the short version

Summarization birthed the next set of use cases. This involved using genAI's underlying LLMs to boil a larger document into a few short sentences. Usually by skimming over important details. And that's if it even got the gist of the document, which it sometimes did not.

GenAI providers then professed that summarization counted as a search replacement. Their bots would respond to queries not by pointing you to source materials, but by parsing those materials for you and giving you the gist. This simultaneously amplified summarization's weak points and glossed over the fact that genAI was a subpar replacement for actual search – a technology with a long-standing, proven track record.

Companies also got excited about deploying summarization through interactive customer service chatbots. This struck me as cheating your way out of implementing a proper search system, which requires you to actually write and organize your documentation. In turn, that requires you to anticipate what people will want. Which requires doing your homework. While homework isn't as fun as playing with new technology, in this case it leads to more predictable results.

They keep digging

Use cases beyond summarization haven't proven much better.

There's a running (half-)joke that corporate genAI use cases start with the premise that you're an idiot. Which seems harsh, but only somewhat. Many of them sound like they're based on a caricature of someone who exists only in the mind of a desperate product manager.

We can more generously express this as a disconnect between AI providers and the audiences to whom they try to sell. Members of your typical genAI tech team – especially at the executive level – exist too far from their target market to build tools for them.

Consider Volvo, the car manufacturer. They want to put genAI into their newest vehicles. The use case? To let drivers chat with the manual. Quoting Alwin Bakkenes, head of global software engineering:

The AI agent knows exactly what car it’s in and has access to all of Volvo’s manuals and resources, as well as the greater Internet. It knows how to use the car and can explain it. “I want to understand how I share my digital key. I can open up a manual or something, but I can actually just ask, how do I share my digital key to a friend or to a valet? Or how do I charge? How do I open the charge lid? How do I do this, et cetera? And it just knows all of these things. So you can converse around it without going through the thick manual,” he explained.

While that doesn't sound terrible, it does sound like a good reason for Volvo to create a decent FAQ. One that's not based on a probabilistic machine which, despite best efforts, will still goof now and then.

For its next use case, Volvo used genAI to replace a web search and a tape measure:

Bakkenes shared examples of using the AI agent to find out if a particular model of TV was in stock at a nearby store, and whether it would fit in his car.

This was second only to the Google engineer who used Gemini to look up the size of his car's tires:

Since connecting my apps through Personal Intelligence, my daily life has gotten easier. For example, we needed new tires for our 2019 Honda minivan two weeks ago. Standing in line at the shop, I realized I didn't know the tire size. I asked Gemini. These days any chatbot can find these tire specs, but Gemini went further. It suggested different options: one for daily driving and another for all-weather conditions, referencing our family road trips to Oklahoma found in Google Photos.

Not to be outdone, Google is excited to apply genAI agents to retail:

"For years, online shopping has been about keywords, filters, drop-down menus. And scrolling through multiple pages [of search results] until you find what you want," Google chief executive Sundar Pichai, who was joined on stage by Walmart's incoming boss John Furner, told his audience at the show. "Now... AI can do the hard work."

"Hard work"? Really?

I try to draw parallels to the shift from in-store to e-commerce. But it's not the same. Not by a long shot. Having to get in a car and drive to a store during certain hours was a real pain; I can sift through online search results while sitting on my couch with one hand tied behind my back. Plus, browsing is useful for getting ideas. The bots aren't much help here.

Unevenly distributed

To be fair, genAI has a couple of bright spots. The first is AI-based code generation. Experienced software developers with leadership experience are able to treat the bots as interns, submitting specs and cleaning up the outputs. It's quite a time-saver.

But if you step outside of that narrow definition – when the bots are in service to someone less experienced, from the entry-level developer to the hobbyist who has never done this for a living – the gains quickly fall off. That's because a successful software development project involves so much more than cranking out code. Automating the creation of code, then, is of little value without the experience of deploying apps to production, and without a strong body of SDLC best practices to support the effort.

The second bright spot is not a pleasant one. Generative AI now powers all kinds of crime, everything from mild fakery to outright fraud. And that doesn't include a January incident in which Grok briefly turned into a large-scale CSAM-on-demand factory.

What crime and software have in common is that the end-users are focused on what the technology does right now. The dreamers are still stuck on what they want, hope, or wish genAI could do. Which is why they keep forcing it into roles where it is not a good fit.

Why it matters

GenAI is constantly touted as a job-eraser. All automation eats work, so that's not too much of a surprise. Where we get into trouble is when genAI takes on jobs for which it's not quite ready. Rapid, widespread job replacement calls for tools that actually work. We should hold genAI-based tools to the same standard as the people they replace.

Purveyors and fans of genAI are quick to wave off these concerns. When pressed, some will acknowledge that it's not quite there yet, but then say that we just need to wait until the technology catches up. Fair enough. But if we need the tech to catch up, then it's not ready to do the job today.

It's also worth considering how much money has been poured into this technology. It started off with venture capitalists throwing cash at companies building foundation models. It's since grown to companies taking on massive financial debt to build datacenters. Those installations are driving up local electricity costs and otherwise causing frustration. Communities are pushing back. As they should.

Combined, all of this ups the ante. Unless genAI finds a sufficiently large killer use case, or a wide array of mid-level use cases, the combined financial and social debt is large enough to get ugly. And those datacenters won't simply evaporate.

Where to next?

People constantly point out that we're in the "picks and shovels" phase of genAI. And I would agree. But when you compare the money and effort that has gone into the picks and shovels (building AI capacity, developing foundation models) to the tangible benefit companies get from using AI, there's little there.

Far ahead of code and crime, genAI's most successful use case is "creating excitement." That'll hold vendors and the stock market for a while. But to borrow an old phrase, markets are frothy in the short term yet smooth out over the long run. Generative AI needs to find more and better use cases. Fast.

How will we find them? My usual take is to borrow ideas from other fields.

The first field is web3. Going back to that article I wrote about web3's killer app, the fashion industry stood out. Companies in that sector were exploring in a way that felt eager, honest, and realistic all at once. I'll note that the established brands were careful to run web3 projects as experiments that lived off to the side. This caution protected the mainline business in the event the experiment fell apart.

A more subtle lesson from web3 is to build genuine interest. Say what you will about blockchain and its offspring, but people actually wanted it. NFTs became mockable precisely because they were popular. Crypto absorbed so much money because people saw some potential benefit. (Even if that benefit was dangerously close to the kind of gambling that stems from financial nihilism.) Generative AI is mostly foisted upon us by a small group of tech companies and their die-hard fans, many of whom seem to be using it for the sake of using it. It's no wonder we push back.

The second field is finance. The computerization of Wall Street taught us that the computers didn't win just because they were computers, but because computers happened to do the Wall Street job very well. That's a far cry from the way companies have cut headcount purely in the hope that genAI would be a fitting replacement. And the thing is, it usually hasn't been.

Years of computerized trading also taught traders how to develop risk controls, which opened them up to technology's upsides while avoiding its downsides. Adopting this mindset for genAI would increase the number of use cases, because we'd feel safer letting the bots take on meaningful responsibility.

Underscoring all of this is AI literacy. Executives who understand what genAI can and cannot do will be in a much stronger position to objectively evaluate the technology. Mix that with discipline, and the field will see fewer deployments but a greater percentage of successful deployments. That's how we win over the long haul.

For genAI's die-hard fans, I understand it can be hard to think about things like risk controls and discipline at the moment. But the best time to adopt a practical approach was yesterday. Today it can still work. We may not have a chance if we wait till tomorrow.

I'm reminded of something I wrote in Rebranding Data back in 2021. The data field has remained afloat in part because it keeps renaming itself before a correction hits. How many more name-changes do we get?

In other news …

For more links to recent news, and with a slightly broader scope, I encourage you to check out my other newsletter. It's a weekly, curated drop of what I've been reading.

The wrap-up

This was an issue of Complex Machinery.

Reading online? You can subscribe to get this newsletter in your inbox every time it is published.

Who’s behind Complex Machinery? I'm Q McCallum. I think a lot about AI and risk, and even wrote a book on it.

Disclaimer: This newsletter does not constitute professional advice.
