Writing about AI is broken
We need to stop obsessing over new models and the big companies that produce them
Earlier this week, a colleague sent me an article about AI from an industry-specific publication. I could not shake the feeling I had read it before, several times. The article itself really does not matter. There was nothing inherently wrong with it, but it epitomized a kind of formulaic writing about AI that is starting to wear on me. It reminds me of watching a summer blockbuster or a holiday movie and noticing that not only do they share the same plot structure, but their heroes and villains, their conflicts and MacGuffins, are all barely customized.
It amounts to a kind of AI washing, but for journalism. I see this especially in industry-specific publications, where the authors and editors are experts in their sector but have no particular expertise in AI or machine learning. Like everyone else, though, they are now obligated to publish about AI whether or not they have anything new to say about it. In most cases, I think they end up telling the same kinds of stories that they are used to telling, but now with more AI.
This will probably be controversial, but I think we are spending too much time worrying about specific AI models and the small number of very wealthy companies with the computing power and financial resources to train them. Without singling out any specific articles, I am referring to the larger metanarrative that has fixated on questions of equity, access, and representation in huge models trained and operated by huge platform companies like Google, Meta, or OpenAI.
That metanarrative basically adopts the tropes and characters from the past decade and assumes that AI will just exacerbate all of the existing inequalities and injustices, rewarding entrenched players and further disenfranchising everyone else. I get it: Google, Meta, Amazon, and the other giant platforms mediate much of the world to us, and they do it all with minimal regulation. No one wants to let those big baddies do it again!
Ironically, by fixating on the villains we missed last time, we are effectively repeating the mistake that allowed us to miss their rise to prominence in the first place. Yes, we need to tell careful, critical stories about how existing power structures and inequalities will continue into this new era. I just think we need to learn to tell other stories, too. Otherwise, we are fighting the last war. That is how we got here. Let me explain.
When Facebook began in 2004, we were barely beyond the dot-com bubble. Making money with a website had barely ceased to be a joke, but people knew the internet would inevitably matter. The millennium had begun with the culmination of a long-fought antitrust suit meant to break up Microsoft. The DOJ had successfully demonstrated that the company had used its effective monopoly over PC operating systems to force people to use Internet Explorer. Well, not force exactly. But they did put an icon for Internet Explorer on the desktop of every computer running Windows. That aggressive power play had basically won the browser wars, destroying the seemingly untouchable incumbent Netscape Navigator.
In the face of such evil, who was going to worry about a free website that let college kids poke each other and share photos? Google was also free and just so good at what it did. Plus, they promised not to be evil! Then they gave us Gmail! Anyone who remembers clunky webmail with tiny storage limits will likely remember what it was like to get an invitation to join Gmail. Those companies were giving away cool services for free. How could they be villains?!
People were so obsessed with the untouchable tech giants slugging it out over the future of the internet by fighting over the web browser that they missed the rise of the web platform, which would go on to become the internet for so many people. Platforms like Facebook, YouTube, Google, or Twitter came to mediate the internet to people. They also mediated those people back to the internet by letting anyone create a free and easy web presence.1
Over the past decade or so, we have collectively learned the lesson that, when the product is free, we are the product. That mantra is now so commonplace I cannot actually source its origin at the moment. In the world we did not then know we were making, social media platforms created a new kind of oil. This oil, as I wrote last time, was valuable because it was ubiquitous rather than rare. Those platforms created it from engagement that people freely gave to them, but it took a massive amount of computing power to refine all those data points for all those people into a new fuel for online commerce. Using machine learning models that optimized for engagement of any kind, those platforms have driven systemic injustices, radicalized people with hate, and fueled online abuse and real-world violence.
As we collectively look at AI, recognizing its inevitable significance, no one wants to repeat that mistake. This time we will see them coming!
As a result, we have told stories that rely on the cast of characters and the tropes that emerged from social media and the era of big data. Breathless reporting has fixated on the sheer magnitude of each new model and of the dataset on which it was trained. Think pieces obsess over the cost-prohibitive nature of training the models, focusing on the giants slugging it out in the AI model wars the way they once did in the browser wars.
In all of this, it is not even always clear what should worry us. Should we be opting out of training datasets to protect our privacy and avoid exploitation, or should we be worried that those datasets will underrepresent vulnerable groups, thus perpetuating systemic biases? Should we worry about a few big companies controlling the dominant models, or should we worry that open source models are freely and widely available for anyone to use for anything, including bad things? All of those are real questions motivated by legitimate concerns, to be sure. So was the power Microsoft illegally used to eliminate its competition in the browser wars.
They just are not the only problems, and they might not be the problems we will later wish we had anticipated. Unfortunately, I think they are simply the only problems we can make sense of within our current metanarrative about the threats of technology.
All of that relies on the faulty assumption that size will be all that matters going forward. There is no evidence to support that assumption. In fact, a mounting body of evidence suggests that small, focused models have more potential than gigantic general models.
Plus, there are good reasons to believe we will not get more data, at least of certain types. We have already passed the peak of human writing. There will never be substantially more human-written text than there is right now, at least not text we are absolutely certain was entirely written by humans. From now on, much of the internet will be AI-generated. That is why Meta reportedly considered buying the publisher Simon & Schuster for its back catalog, to pick just one example.
If the goal is to produce models that write like humans, then those models have to be trained on the genuine article. If the training set is contaminated with AI-generated texts, then the resulting model will become less capable of writing like humans. Not all models need to write like humans, though. That fact changes the game dramatically.
Meta used “synthetic data” to train its capable new open source model Llama 2 to write computer code. That means researchers used AI models to generate a corpus of computer code and then used that machine-generated code to train the model how to write even more code. That works because researchers were not optimizing the model to write code like humans do; they were optimizing the model to write good code.
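To make that idea concrete, here is a minimal sketch of what such a synthetic-data loop can look like. This is not Meta's actual pipeline: generate_candidate is a hypothetical stub standing in for a call to an existing code model, and a real pipeline would judge candidates with unit tests and then fine-tune a model on the filtered corpus.

```python
import ast


def generate_candidate(prompt: str) -> str:
    """Hypothetical stand-in for a call to an existing code-generation model."""
    return "def add(a, b):\n    return a + b\n"


def is_valid_python(source: str) -> bool:
    """Keep only candidates that at least parse as Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def build_synthetic_corpus(prompts, per_prompt=4):
    """Generate machine-written code and keep the pieces that pass a correctness check."""
    corpus = []
    for prompt in prompts:
        for _ in range(per_prompt):
            candidate = generate_candidate(prompt)
            # The filter cares whether the code works,
            # not whether it reads like a human wrote it.
            if is_valid_python(candidate):
                corpus.append({"prompt": prompt, "completion": candidate})
    return corpus


if __name__ == "__main__":
    synthetic = build_synthetic_corpus(["Write a function that adds two numbers."])
    print(f"{len(synthetic)} machine-generated examples ready for fine-tuning")
```

The point is simply that the corpus grows from machine output judged by correctness, so it never runs out the way human prose does.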
From these couple of examples, hopefully you can already see why smaller, task-specific models will soon outpace massive models trained on everything. Hopefully you can also see why we should expect to see models get much better at replacing humans in the situations where humans are currently writing for computers than in roles where humans write like humans for other humans. Data for training models to write Python, SQL, or Go will grow as AI models get better and better at generating those things; meanwhile, the available data for training models to write like I am writing here has already passed the peak of its production.
Right now, we are interfacing directly with the raw technology. Chatting with GPT-4o is like strapping a rocket to your back in hopes of flying to the moon. OK, maybe it is more like building a minimalist car around the newest and biggest engine in pursuit of the land speed record. The bigger the engine, the farther and faster it goes.

To date, the story has largely been about how much more powerful each engine was compared to the previous generation. There is no reason to expect that to be the story that continues to matter. At some point, the engines are good enough. People like to be able to steer. They like stereos and air conditioning. They care about fuel economy and reliability. The contest will be between car manufacturers who can build a fleet of practical vehicles around engines tuned to their purpose. How often do you think about the engine in your car? Probably only when it malfunctions, unless you are a total gear-head.
We should be watching the companies that are applying models—big or small, general or hyper-specific—to new problems in software applications that let people offload their existing work. Those might be new startups or entrenched companies like Apple, which recently announced that generative AI would be invisibly involved in nearly everything an iPhone can do. Much of that will rely on models small enough to run on-device!
The big, established companies with lots of money are going to keep doing what big, established companies with lots of money do. They are going to keep using the resources available to them from their past success to compete with other established players by brute force, trying to build bigger and more powerful models.
But it is worth noting that those entrenched players are already playing catch-up. They are using their massive cash and computing power to try to close the distance between them and OpenAI, an unexpected newcomer that blew past them using technology invented at Google. We immediately added OpenAI to the pantheon rather than questioning whether a new era had dawned.
I think most people writing about AI right now are fighting the last war. In that war, the size of the data—and the compute power to use it—reigned supreme. Those elements are still present, of course. It took big data and huge compute power to create the models that power all forms of generative AI. It took a big company giving us access to a large, capable, general model through a chatbot for a hundred million people to recognize in just a matter of weeks that an LLM could be applied to much of the work they had to do every day. That got our attention, but it does not necessarily deserve to keep it.
But that new technology changed the rules of engagement. This is a new kind of war and we need to learn to tell a new kind of story about it. We need to look for a new cast of characters, for new sources of action and conflict.
We could start by looking less at the size of the engines coming out every month and the companies producing those behemoths, whether anyone needs them or not. We should be watching for the companies that are packaging engines of the proper size into efficient, dependable vehicles that lots of people need.
For now, I will leave that to future posts.
I am deeply indebted here to Margaret O’Mara’s recent book The Code. If you found this section interesting, infuriating, or both, you should read it. You should read it anyway. Everyone should. Margaret Pugh O’Mara, The Code: Silicon Valley and the Remaking of America (New York: Penguin Press, 2019).