Generative AI has made impressive technical progress over the last few years. This isn’t to say that it’s anything like the revolution that its biggest boosters claim it to be. The transformer models underlying modern Large Language Models (LLMs) and generative AI systems such as AI image generators are impressive achievements of science and engineering.
One specific issue I have with their most prominent cheerleaders is the notion that what these AI systems produce is “art”. These systems cannot create art; they can only replicate aesthetics. This may seem like a strange distinction, but it’s a critical one.
For one thing, art is, fundamentally, a communicative act. Even though we’re all supposed to be good postmodernists who believe in the death of the author, art still presupposes an artist and an audience. Art communicates human emotion and experience, which LLMs are incapable of having. They may render the experiences of the people prompting them, but even this mediated experience won’t be a reflection of what was meant to be expressed. The most imperfect human author is at least grasping at their own experience.
The best case for AI art is the “prompter as artist” idea: the idea that, via repetition and prompt updates and requested tweaks, a prompter may create a “good enough” aesthetic object via curation. Even this best case equates AI “art”, in essence, to pushing “play” on a recording of a mashup of other people’s work.
The deeper reason why generative AI cannot make art is that it fundamentally seeks an average aesthetic style. This is why there exists a quickly-recognized “LLM voice” which differs only slightly between models. Generative AI systems work based on predicting the next most likely token in a series of tokens. (Token here means, essentially, a quantum of expression. The next word or pixel or frame.) It is, quite literally, doing the same fundamental thing that the autocomplete on your smartphone keyboard is doing. It’s just doing it better and for a wider range of possible media.
It decides on this next token using its “weights”: large internal matrices of numbers that, taken together, assign a probability to each possible next token. It derives these weights from training over a large corpus (the size of the Internet, roughly) of text and images. The model is then fine-tuned and shaped by humans. This reliance on a large body of human data and the fact that it’s predicting the next token based on probability means that it stylistically regresses to the mean. It is a distillation of human artifacts and produces the “most likely” such artifact based on a prompt.
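To make the “most likely next token” idea concrete, here is a toy sketch in Python. It is not how a real LLM works inside: real systems use neural networks and usually sample with some randomness, while this one just counts which word follows which in a tiny corpus I made up for illustration. Every name in it (`train_bigram_counts`, `most_likely_next`, `generate`, `TOY_CORPUS`) is hypothetical. The only point is to show how always picking the likeliest continuation drifts toward the most common phrasing.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if it has none."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_tokens=5):
    """Greedily extend `start` by always taking the likeliest next word."""
    out = [start]
    for _ in range(max_tokens):
        nxt = most_likely_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Made-up corpus: one unusual sentence among several ordinary ones.
TOY_CORPUS = [
    "the night was dark",
    "the night was dark",
    "the night was cold",
    "the night was a bruise of violet",
]
COUNTS = train_bigram_counts(TOY_CORPUS)

print(generate(COUNTS, "the"))
```

In this toy setup the one unusual sentence in the corpus never surfaces, because “dark” follows “was” more often than anything else does. That is the regression to the mean described above, in miniature.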
The author Theodore Sturgeon, during an interview, was asked why he wrote science fiction when “90% of it is crap”. He responded that 90% of everything is crap. Aesthetically, the average of crap is crap. An AI is a median-locator trained over a large body of mostly uninteresting stuff.
AI gets a little more interesting when it’s provided with aesthetic clues. If you tell it to write a KFC ad in the style of Thomas Pynchon, it can quickly produce a halfway decent pastiche. Hang on a sec, I’m going to go ask an AI (I won’t tell you which) to write that for me.
Okay, it made a pretty boring, generic screenplay for some kind of KFC-as-cosmic-conspiracy ad. The only decent line in it was “They tell you it’s ‘finger lickin’ good.’ But whose fingers? And what are they licking?” Even that is only funny accidentally, because it has confused object for subject in the slogan. This turn of phrase isn’t even particularly Pynchonian. It’s like it mapped “Pynchon” to “weird paranoia” and then spat out something a halfway talented 16-year-old goth kid would put together for an essay late the night before it was due.
In short, it created an aesthetic object. Something that has the hallmarks of a certain genre or kind of art. Rather than use those elements as a framework to express something, it just assembles them like the chitin of an empty carapace. These shells are not at all interesting. They won’t teach us anything or push art forward in any way. They’re artistically useless. Or, if they trick people into thinking that they are expressing something, they might be worse than useless.
In the coming months and years we’ll see a lot of efforts to force AI into the realm of artistic expression. Not just into what Murakami calls “shoveling cultural snow” (copywriting, ads, stock photography), but expressive art that is meant to connect humans to one another. Treat these efforts with the utmost disdain. Do not accept cobbled-together, average aesthetic objects in the place of actual art.
I know that I’m not the only one making this point. Many people are highlighting this distinction from different angles. I think it’s worth examining in this light exactly because hollow aesthetics are intimately wrapped up in the rise of fascism. I’ll write more about this in the future, but fascists do not have a morality. They have a mythology and they have an aesthetic. They make all of their decisions for either mythological or aesthetic reasons. This is why fascists love modern AI. To them, it represents the victory of aesthetics over art. The final triumph of surface-level appearances over human meaning. After all, if art still matters, then humans must still matter. A fascist cannot accept human worth.
So resistance to the notion that AI generated objects are “art” is, I think, one avenue of resistance to fascism. A small one, to be sure, but an important one.
I’ll close by quoting from Umberto Eco’s “Ur-Fascism”:
“…the early Italian Futurists were nationalist; they favored Italian participation in the First World War for aesthetic reasons; they celebrated speed, violence, and risk, all of which somehow seemed to connect with the fascist cult of youth. While fascism identified itself with the Roman Empire and rediscovered rural traditions, Marinetti … proclaimed that a car was more beautiful than the Victory of Samothrace, and wanted to kill even the moonlight…”