AI Keeps Putting Katanas in Ancient Rome

And It’s Not Random

By L. M. Hawkes · HawkesAdventures.com

Four months ago I started building a catalog of cinematic historical illustrations for a series of interactive gamebooks set in ancient Roman gladiator culture. The illustrations are AI-generated through Midjourney. Historical accuracy is non-negotiable.

Those two facts turned out to be in significant tension.

Left to its own devices, Midjourney – and generative AI image tools generally – produces imagery that looks historical until you look carefully. Then you start noticing things. A legionary carrying a katana. A Roman forum ringed with Gothic pointed arches. An interior scene lit by Victorian gas-bracket lanterns with glass panels. Pipe runs along the upper walls of a barracks that should have been built in 50 AD.

Over the course of auditing 1,106 images – yes, eleven hundred – I documented exactly what goes wrong, how consistently it goes wrong, and what you can actually do about it. This is the first article in a series covering the complete workflow, including the exact prompts and the infrastructure behind the catalog.

But before the solutions, you need to understand the problem. And the most important thing to understand about AI historical failures is this:

They are not random. They are patterned, predictable, and systematic – which means they are fixable.

The Katana Problem

This is where the series gets its name, because nothing illustrates the core failure more vividly.

A Roman legionary in full lorica segmentata, standing in the Forum, holding a katana.

It happens. It happens more than once. It happened to me eleven times across 1,106 images, and every single occurrence followed the same logic: the model associated warrior and sword and reached for whatever sword archetype dominates its training data. For reasons that probably reflect the sheer volume of Japanese-inspired content in that training data, the answer is frequently a katana.

Prompting explicitly for gladius, for Roman short sword, for period-accurate weaponry reduces it. It does not eliminate it. The model has a default and it will return to that default unless you build specific, persistent constraints against it.
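One way to make those constraints persistent rather than retyped ad hoc is to assemble every prompt from fixed constraint blocks in code. A minimal sketch, assuming Midjourney's `--no` parameter for negative terms; the specific constraint vocabulary here is illustrative, not a tested list:

```python
# Illustrative sketch: every scene prompt gets the same period-accuracy
# blocks appended, so the model's defaults are countered on every run.

# Positive constraints: name the correct artifacts explicitly.
PERIOD_CONSTRAINTS = (
    "gladius short sword worn at the hip, lorica segmentata, "
    "open-faced galea helmet, round Roman arches, oil lamps and torches"
)

# Negative constraints: terms the model keeps reaching for by default,
# passed through Midjourney's --no parameter.
NEGATIVE_TERMS = [
    "katana", "longsword", "sword on back",
    "gothic arch", "plate armor", "gas lantern",
]

def build_prompt(scene: str) -> str:
    """Combine a scene description with the persistent constraint blocks."""
    return f"{scene}, {PERIOD_CONSTRAINTS} --no {', '.join(NEGATIVE_TERMS)}"

print(build_prompt("Roman legionary standing guard in the Forum, 50 AD"))
```

The point is not the particular terms but the mechanism: the constraints live in one place and travel with every prompt, so a lapse of attention on image 900 cannot reopen the door to katanas.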

The katana is the most jarring example, but it points to a broader pattern: AI does not know the difference between ancient warrior and medieval warrior or fantasy warrior or feudal Japanese warrior unless you force it to. And even then, it forgets.

The Sword on the Back

Related to the katana problem, and arguably more pervasive: swords worn on the back rather than the hip.

Roman soldiers wore their gladius on the hip or thigh. This is documented, consistent, and not ambiguous. AI defaults to the dramatic over-the-shoulder carry – the drawn-from-behind-the-head position that looks good in fantasy art and action films and is historically wrong for virtually every ancient culture.

The back-carry is a cinematic invention. It persists in AI output because it persists in the training data – in fantasy illustration, in Hollywood imagery, in video game character design. The model has learned that dramatic warrior with sword means sword on back, and it takes explicit, targeted negative prompting to break that association.

This is a small detail. It is also an immediate tell to anyone who knows Roman history. And the community of people who know Roman history – and care about it – is larger, more vocal, and more active than most content creators realize.

Why the Failures Are Patterned

Understanding why AI produces these specific failures consistently is more useful than cataloging the failures themselves.

The model’s training data contains far more post-Roman content than Roman content. Medieval Europe, Renaissance Italy, and Victorian England generated enormous volumes of imagery – paintings, engravings, illustrations, photographs of architecture, museum collections – that dwarf the documentary record of ancient Rome. When the model reaches for warrior, fortress, interior scene, or dramatic lighting, it draws from the richest available pool. That pool skews late.

The result is a set of predictable anachronistic contaminations:

  • Weapons default to whatever sword archetype dominates the training data – frequently katanas, sometimes longswords or rapiers, almost never the historically correct gladius
  • Armor defaults to a “generic warrior” template that skews medieval – enclosed visors, Gothic pauldrons, articulated gauntlets – rather than the open-faced Roman galea and lorica segmentata
  • Architecture defaults to pointed Gothic arches, which appear everywhere in the training data, rather than the round Roman arch that is architecturally correct and historically specific
  • Lighting defaults to a “dramatic interior” template drawn from 19th-century references – Victorian gas-bracket lanterns with glass panels, visible pipe runs – rather than oil lamps and torch sconces
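Because each default is predictable, its correction can be tabulated once and reused. A hypothetical sketch of how the four defaults above might map to standing corrections; the vocabulary is illustrative, and the real terms should come from your own audit:

```python
# Illustrative mapping: each predictable default paired with the positive
# terms to assert and the negative terms to deny.
CORRECTIONS = {
    "weapons": {
        "assert": "gladius, Roman short sword worn at the right hip",
        "deny":   ["katana", "longsword", "rapier"],
    },
    "armor": {
        "assert": "lorica segmentata, open-faced galea helmet",
        "deny":   ["enclosed visor", "gothic pauldrons", "articulated gauntlets"],
    },
    "architecture": {
        "assert": "round Roman arches, stone and timber construction",
        "deny":   ["pointed gothic arch", "flying buttress"],
    },
    "lighting": {
        "assert": "oil lamps, torch sconces, natural daylight",
        "deny":   ["gas lantern", "glass lamp panels", "visible pipe runs"],
    },
}

def constraint_suffix(categories: dict = CORRECTIONS) -> str:
    """Fold every category into one reusable prompt suffix."""
    asserts = ", ".join(c["assert"] for c in categories.values())
    denies = ", ".join(t for c in categories.values() for t in c["deny"])
    return f"{asserts} --no {denies}"
```

Structuring the corrections as data rather than prose also means the table can grow as new failure categories are documented, without rewriting any prompt.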

None of this is the model malfunctioning. It is the model doing exactly what it was trained to do: producing imagery that matches the statistical weight of its training data. The problem is that statistical weight does not map to historical accuracy.

The Stakes

You might reasonably ask: does this actually matter? Who notices a Gothic arch in the background of a Roman interior scene?

The answer, based on months of monitoring the relevant communities: a lot of people notice, and they say so publicly.

In February 2026, a thread on r/aiArt titled “What makes these images of Ancient Rome historically inaccurate?” attracted immediate, detailed community responses cataloging specific failures. A University of Bordeaux art history professor formally documented anachronisms in widely shared AI Rome videos in December 2025. Threads on r/ancientrome and dedicated gladiator history forums flag equipment mixing, armor anachronisms, and architectural errors in AI-generated content within hours of posting.

The community of people who care about Roman historical accuracy is not niche. It is active, literate, and increasingly frustrated by AI-generated content that looks historical without being historical. One highly upvoted comment put it plainly: AI shows what people think Rome looked like, not what it actually looked like.

That frustration is a gap. The work I’ve been doing is designed to close it.

What Comes Next

Across 1,106 images, I documented nine specific failure categories – weapons and armor are two of them. The remaining seven cover gladiator body type, architectural accuracy, lighting sources, materials and construction, gladiator equipment mixing, civilization drift, and social class collapse. Each one follows the same pattern: a predictable AI default that is historically wrong, consistently generated, and specifically correctable.

The next article in this series covers the full taxonomy – everything AI gets wrong about ancient Rome, with the community evidence to prove it isn’t just my opinion.

After that: the three-stage correction pipeline I built to fix it, the full prompt architecture (with an annotated version here and the complete production-ready documents available at HawkesAdventures.com), and the database infrastructure that turned a generation workflow into a structured commercial catalog.

The AI isn’t broken. It just needs to be told, specifically and repeatedly, that Rome and feudal Japan are separated by thousands of miles and roughly a thousand years.

It will listen. Eventually.

L. M. Hawkes writes cinematic, historically grounded interactive gamebooks drawing from the warrior traditions of Rome, Greece, Japan, the Viking Age, and the great battles of antiquity. The Vault of Ages Art Pack Configurator – a curated catalog of historically accurate cinematic illustration – is available at HawkesAdventures.com under personal and commercial licenses.

This is Part 1 of a 6-part series.


Coming next week, Part 2: The White Marble Lie – the full taxonomy of what AI gets wrong about ancient Rome.

Tags: Artificial Intelligence · History · Midjourney · Game Design · Historical Fiction · Ancient Rome · Worldbuilding · Prompt Engineering
