The Full Prompts

The Annotated Audit and Corrective Re-Prompting Architecture That Powers the Vault of Ages

Ancient Roman writing on a scroll with a bone stylus in the light of an oil lamp

By L. M. Hawkes · HawkesAdventures.com

The previous three articles in this series established the problem – nine systematic failure categories in AI-generated Roman imagery – and the three-stage pipeline built to correct them. This article delivers the prompt architecture itself.

What follows is the annotated version: enough to understand the design logic and implement the pipeline for your own catalog. The complete production-ready documents – the full Roman Historical Consistency Ruleset, the structured issue handoff format, the canonical failure tag vocabulary, and the corrective re-prompting template with all switches – are available as a free resource at HawkesAdventures.com.

If you haven’t read Articles 1 through 3, the short version is this: AI historical failures are patterned and systematic, which means they respond to structured, layered prompting rather than to longer or louder instructions. These prompts are the product of four months of iteration across 1,106 images. They are not theoretical. They were built by running them.

Why Prompt Architecture Matters

A prompt is not a wish. It is an instruction set, and instruction sets have architecture – structure that determines how reliably the instructions are followed and how gracefully the system handles edge cases.

Most AI image prompts are flat lists: a subject, a style, a few adjectives, some negative terms. That structure works for creative work where variation is acceptable. It does not work for accuracy-constrained work, where each known failure has to be excluded explicitly.

The audit prompt and the corrective re-prompting prompt described below are layered documents. Each layer has a distinct function. Understanding why each layer exists is as important as understanding what it says – because when you adapt these for a different historical period, you need to know which layers are period-specific and which are structural constants.

The Audit Prompt – Annotated

The audit prompt is submitted to Claude with a batch of up to 15 generated images attached. It runs every image through a structured evaluation sequence before any image qualifies for the catalog.

The prompt has five functional layers.

Layer 1: The Visual Anchoring Pass

Before any historical evaluation, the evaluator lists the primary visual elements actually present in each image – people, clothing and armor, weapons and equipment, architecture, environment, lighting.

Why this layer exists: It prevents evaluation of things that aren’t there. An AI evaluator can hallucinate violations in elements it assumes are present but cannot actually see. Forcing an explicit inventory of visible elements before evaluation begins grounds every subsequent judgment in observable evidence. You cannot flag what you cannot see.

What to keep when adapting: This layer is period-agnostic. It runs identically regardless of the historical setting.
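As a rough illustration of what the anchoring pass produces, here is a minimal Python sketch of an inventory record. The class name, field names, and image id are hypothetical and are not taken from the production prompt. Only the six element categories come from the description above.

```python
from dataclasses import dataclass, field

# Element categories as listed in the Layer 1 description.
INVENTORY_CATEGORIES = (
    "people",
    "clothing and armor",
    "weapons and equipment",
    "architecture",
    "environment",
    "lighting",
)


@dataclass
class VisualInventory:
    """Elements actually visible in one image (hypothetical structure)."""
    image_id: str
    elements: dict = field(default_factory=dict)

    def add(self, category: str, description: str) -> None:
        # Reject categories outside the fixed inventory list.
        if category not in INVENTORY_CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.elements.setdefault(category, []).append(description)

    def is_listed(self, category: str) -> bool:
        # A category can be evaluated only if something in it was inventoried.
        return bool(self.elements.get(category))


inventory = VisualInventory(image_id="img_0001")  # hypothetical id
inventory.add("lighting", "oil lamp on wall bracket")
inventory.add("architecture", "round stone arch")

# Only inventoried categories move on to historical evaluation.
evaluable = [c for c in INVENTORY_CATEGORIES if inventory.is_listed(c)]
print(evaluable)
```

The point of the sketch is the ordering: the inventory is filled first, and only categories that appear in it are eligible for any later check.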

Layer 2: The Evidence Confirmation Pass

Every suspected issue must be confirmed against clearly visible image evidence before it can be assigned as a failure. Suspicion is not sufficient. If a potential problem cannot be confirmed from visible pixels, it is marked uncertain rather than flagged.

Why this layer exists: It prevents false positives. Without this constraint, an evaluator will flag shadows as anachronistic lighting, obscured footwear as modern shoes, and blurred background architecture as Gothic arches. False positives waste correction cycles and erode trust in the evaluation system. This layer enforces the discipline: visible evidence overrides suspicion. Suspicion without visible evidence produces no rejection.

What to keep when adapting: This layer is structural, not period-specific. Keep it verbatim.

Layer 3: The Negative Knowledge Gate

The evaluator must distinguish between three states: visible violation, uncertain visibility, and not visible or not assessable. Elements that cannot be confidently assessed from visible pixels – footwear cropped out of frame, armor hidden by shadow, architecture partially obscured – are not flagged.

Why this layer exists: It prevents penalizing images for what they don’t show. An image with no visible footwear has not failed the footwear check – it simply has no footwear to evaluate. This distinction matters enormously at scale. Without it, the false positive rate rises to the point where the audit becomes an obstacle rather than a filter.

What to keep when adapting: Structural constant. Keep verbatim.
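Layers 2 and 3 together amount to a small decision rule, which can be sketched in Python as follows. This is an illustration under assumptions, not the production handoff format: the `Visibility` enum, the `Finding` record, and both function names are hypothetical. The three states and the example findings are the ones named above.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    VISIBLE_VIOLATION = "visible violation"
    UNCERTAIN = "uncertain visibility"
    NOT_ASSESSABLE = "not visible or not assessable"


@dataclass
class Finding:
    category: str        # e.g. "footwear"
    state: Visibility
    evidence: str = ""   # what is visibly in the image, if anything


def is_confirmed_failure(finding: Finding) -> bool:
    # A failure needs a visible violation AND stated visible evidence.
    return (
        finding.state is Visibility.VISIBLE_VIOLATION
        and bool(finding.evidence.strip())
    )


def image_rejected(findings: list) -> bool:
    # Uncertain and not-assessable findings never cause a rejection.
    return any(is_confirmed_failure(f) for f in findings)


findings = [
    Finding("footwear", Visibility.NOT_ASSESSABLE),  # cropped out of frame
    Finding("armor", Visibility.UNCERTAIN, "hidden by shadow"),
    Finding("architecture", Visibility.VISIBLE_VIOLATION,
            "pointed arch in background"),
]
print(image_rejected(findings))
```

In this sketch, dropping the architecture finding leaves an image that is not rejected, even though two of its categories could not be assessed.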

Layer 4: The Historical Consistency Ruleset

This is the period-specific core of the audit. For ancient Rome, it covers:

Lighting: Open flame only – torches, oil lamps, braziers. No electric lighting, no glass-paneled fixtures, no pipe runs or conduit.
Clothing: Roman garments only. No zippers, elastic fabrics, or modern tailoring.
Footwear: Roman caligae or bare feet only.
Armor: Roman armor types only. No medieval plate, no enclosed visors, no Gothic pauldrons.
Weapons: Gladius or spatha only, worn on the hip. No katanas, longswords, rapiers, or non-Roman blades.
Architecture: Round Roman arches only. No Gothic pointed arches, no medieval fortifications, no Renaissance facades. No flat transparent glass.
Symbols: No crosses, no Christian iconography, no heraldry, no Arabic numerals.
Materials: No plastic, chrome finishes, synthetic fabrics, or modern machined hardware.
Hairstyle: Roman male styles only. No modern fades, undercuts, or gel-styled hair.

Why this layer exists: It is the accuracy standard against which every image is evaluated. Every rule corresponds to a documented failure category from the audit history.

What to adapt: This entire layer is period-specific. For Viking Age content, the ruleset changes entirely – different armor types, different architectural forms, different lighting constraints, different weapon categories. The structure of the layer stays the same; the content is replaced.
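One way to keep the structure fixed while swapping the content is to hold the ruleset as data and render it into prompt text. The Python sketch below assumes a simple allow/exclude shape per category. That shape, the variable names, and the `render_ruleset` helper are illustrative assumptions, and only four of the nine Roman categories are shown.

```python
# Abbreviated ruleset: category -> what is allowed and what is excluded.
ROMAN_RULESET = {
    "Lighting": {
        "allow": ["torches", "oil lamps", "braziers"],
        "exclude": ["electric lighting", "glass-paneled fixtures",
                    "pipe runs or conduit"],
    },
    "Footwear": {
        "allow": ["Roman caligae", "bare feet"],
        "exclude": [],
    },
    "Architecture": {
        "allow": ["round Roman arches"],
        "exclude": ["Gothic pointed arches", "medieval fortifications",
                    "Renaissance facades", "flat transparent glass"],
    },
    "Symbols": {
        "allow": [],
        "exclude": ["crosses", "Christian iconography", "heraldry",
                    "Arabic numerals"],
    },
}


def render_ruleset(ruleset: dict) -> str:
    """Turn a ruleset dict into one line of prompt text per category."""
    lines = []
    for category, rule in ruleset.items():
        parts = []
        if rule["allow"]:
            parts.append("only " + ", ".join(rule["allow"]))
        if rule["exclude"]:
            parts.append(", ".join(f"no {item}" for item in rule["exclude"]))
        lines.append(f"{category}: " + "; ".join(parts))
    return "\n".join(lines)


audit_text = render_ruleset(ROMAN_RULESET)
print(audit_text)
```

Adapting to another period then means replacing the dictionary, not the rendering function.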

Layer 5: The Showcase-Worthiness Rating

Images that pass all historical checks receive a 1–5 star rating based on composition, atmospheric conviction, and visual impact. The rating criteria are calibrated:

Five stars: Exceptional. Cinematic composition, convincing atmosphere, no visible artifacts. Could function as cover art or promotional material.
Four stars: Strong. Clear subject, convincing Roman atmosphere, minor imperfections that don’t distract. Suitable for catalog publication.
Three stars: Usable but not standout. Historically accurate, compositionally ordinary.
Two stars: Weak composition or noticeable flaws. Technically passes but visually mediocre.
One star: Barely usable. Technically passes historical checks, visually ineffective.

Only images that pass historical evaluation receive a rating. Rejected images are not rated.

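Why this layer exists: Accuracy is the threshold. Quality is the filter. An image that is historically accurate but compositionally weak serves the catalog no better than an inaccurate one. Rating only the images that have already passed keeps the two judgments separate: historical evaluation decides whether an image qualifies, and the star rating ranks the images that do.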
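The rule that rejected images are never rated can be sketched as a small gate in Python. The function name and signature are hypothetical. The star labels are the ones listed above.

```python
from typing import Optional

# Labels from the five rating levels described above.
STAR_LABELS = {
    5: "Exceptional",
    4: "Strong",
    3: "Usable but not standout",
    2: "Weak composition or noticeable flaws",
    1: "Barely usable",
}


def assign_rating(passed_historical: bool, stars: Optional[int]) -> Optional[int]:
    """Return a 1-5 rating for passing images and None for rejected ones."""
    if not passed_historical:
        # Rejected images are not rated, whatever their visual quality.
        return None
    if stars not in STAR_LABELS:
        raise ValueError("rating must be an integer from 1 to 5")
    return stars


print(assign_rating(True, 4))    # a passing image keeps its rating
print(assign_rating(False, 5))   # a rejected image gets no rating
```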

The Corrective Re-Prompting Prompt – Annotated

The corrective re-prompting prompt takes three inputs – the original scene description, the audit output for that image, and the period-accuracy criteria – and outputs a refined Midjourney prompt with targeted negative specifications for exactly what went wrong.

The prompt has four functional components.

Component 1: The Style and Atmosphere Frame

Every corrective prompt opens with the same style and atmosphere instruction: photographic realism, a bronze and ochre palette, and gritty atmospheric lighting. It also carries the same Midjourney switches (--ar 2:3 --chaos 5 --v 6 --q 2).

Why this component exists: It anchors the output in the visual language established across the entire catalog. Consistency in style and atmosphere is what makes a multi-hundred-image catalog read as a coherent body of work rather than a collection of unrelated images.

What to keep when adapting: The palette and atmosphere frame is specific to the Vault of Ages Roman catalog. For a Viking Age catalog, the palette would shift – cooler tones, different atmospheric qualities. The structural role of this component stays the same; the specific values change.

Component 2: The Accuracy Criteria Block

The corrective prompt carries the same core accuracy criteria as the audit – the period-specific ruleset translated from evaluation criteria into generative constraints. What the audit checks for, the corrective prompt explicitly excludes.

Why this component exists: It ensures that corrections don’t introduce new failures while fixing the documented ones. A prompt that fixes the Gothic arch but drops the lighting constraints will fix the architecture and re-introduce the Victorian lanterns.

Component 3: The Scene Text

The original scene description, unchanged. The correction addresses what went wrong with the execution – not with the subject matter.

Component 4: The Targeted Corrections

This is where the audit output translates into specific prompt language. Each documented failure becomes a precise description of the correct element, paired with explicit exclusions of what went wrong:

Victorian bracket lanterns with glass panels → “Roman oil lamp on simple iron bracket, no glass panels, no pipe runs, no conduit, warm flickering flame only, no Victorian or post-Roman lighting elements”
Sword worn on back → “gladius worn on hip at right side, not carried on back, hip scabbard only”
Gothic arch in background → “round Roman arch only, no pointed arches, no Gothic structural elements, semicircular vault”
Enclosed visor helmet → “open-face Roman galea with cheek guards, no enclosed visor, no full-face coverage”

The specificity is the mechanism. Vague exclusions tell the model what category to avoid. Precise visual descriptions tell the model exactly what the failure looks like and what the correct version looks like instead. The model responds to the second kind of instruction far more reliably than the first.
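The four components can be assembled mechanically. The Python sketch below is one possible arrangement, not the production template: the failure tag names are hypothetical (the canonical tag vocabulary is not reproduced in this article), the accuracy criteria are abbreviated, and the sample scene text is invented for illustration. The correction strings are quoted from the examples above. The switches are appended last because Midjourney reads parameters from the end of a prompt.

```python
STYLE_FRAME = (
    "photographic realism, bronze and ochre palette, "
    "gritty atmospheric lighting"
)
SWITCHES = "--ar 2:3 --chaos 5 --v 6 --q 2"

# Abbreviated stand-in for the full accuracy criteria block.
ACCURACY_CRITERIA = (
    "open flame lighting only, Roman garments only, "
    "round Roman arches only"
)

# Hypothetical tag names mapped to the correction text quoted above.
CORRECTIONS = {
    "sword_on_back": (
        "gladius worn on hip at right side, not carried on back, "
        "hip scabbard only"
    ),
    "gothic_arch": (
        "round Roman arch only, no pointed arches, "
        "no Gothic structural elements, semicircular vault"
    ),
    "enclosed_visor": (
        "open-face Roman galea with cheek guards, no enclosed visor, "
        "no full-face coverage"
    ),
}


def build_corrective_prompt(scene_text: str, failure_tags: list) -> str:
    """Join style frame, criteria, unchanged scene text, and corrections."""
    unknown = [tag for tag in failure_tags if tag not in CORRECTIONS]
    if unknown:
        raise KeyError(f"no correction defined for: {unknown}")
    corrections = [CORRECTIONS[tag] for tag in failure_tags]
    parts = [STYLE_FRAME, ACCURACY_CRITERIA, scene_text, *corrections]
    return ", ".join(parts) + " " + SWITCHES


scene = "a gladiator waiting in a torchlit corridor beneath the arena"
prompt = build_corrective_prompt(scene, ["gothic_arch", "sword_on_back"])
print(prompt)
```

Note that the scene text passes through untouched, and that only the failures the audit actually documented contribute correction text.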

Getting the Full Documents

The annotated versions above are sufficient to implement the pipeline. They contain the structural logic and the period-specific rules for ancient Rome.

The complete production-ready documents – with the full structured issue handoff format, the canonical failure tag vocabulary (sixteen tags covering every documented failure category), the complete evidence confirmation procedures, and the corrective re-prompting template formatted for direct use – are available as a free resource at HawkesAdventures.com.

These documents are the product of four months of iteration across 1,106 images. Use them, adapt them for your own historical period, and build the pipeline before you generate your first image – not after you’ve generated a thousand of them. That is the single most useful thing I can tell you from the other side of this process.

What the Pipeline Produces

The Vault of Ages catalog – cinematic, historically accurate illustration of ancient Roman gladiator culture – is available at HawkesAdventures.com under personal and commercial licenses. Commercial licenses permit derivative works: tabletop RPG supplements, game modules, interactive fiction, campaign settings, digital and print publications.

The pipeline that built it works for any historical period where accuracy matters. The next article in this series covers the infrastructure behind the catalog – the metadata architecture that turns a generation workflow into a structured, searchable, commercially viable image archive.

L. M. Hawkes writes cinematic, historically grounded interactive gamebooks drawing from the warrior traditions of Rome, Greece, Japan, the Viking Age, and the great battles of antiquity. The Vault of Ages Art Pack Configurator – a curated catalog of historically accurate cinematic illustration – is available at HawkesAdventures.com under personal and commercial licenses.

This is Part 4 of a 6-part series.

Previously, Part 3: The Three-Stage Correction Pipeline

Coming next week, Part 5: The Database Behind the Art – the metadata infrastructure that turns a generation workflow into a structured commercial archive.

Tags: Artificial Intelligence · Midjourney · Prompt Engineering · History · Ancient Rome · Game Design · Historical Fiction · Workflow
