# Asset Pipeline AI lives in **production**, not in the runtime. The shipped game is fully self-contained, fully offline, no workstation dependency, no API calls, no surprise model regressions in prod. ## What gets pre-rendered The asset library is **atomic components**, not finished items. Items get composed at runtime from layered sprites + shader passes — see [07-item-cards.md](07-item-cards.md). Many effects (glow, foil, holo, corruption, heat haze, frost, holy rays, first-reveal sweep) are pure shaders and need no PNG at all. | Asset | Approach | |---|---| | Minion base sprites (face/hair/skin) | Layered sprite library | | Class costume layers | Layered sprites by class + tier | | Gear overlay sprites (per slot) | Layered sprite library, hundreds of variants | | Item base silhouettes | ~30–60 per item category (sword, dagger, vial, ring, etc.) | | Material tints | ~20 per applicable material; applied via shader or pre-baked overlay | | Component overlays | Pommel gems, hilt wraps, stoppers, labels, engravings — authored once, recombined | | Effect auras | Per enchantment / effect type | | Quality frames + flourishes | 5 quality bands + Legendary specials | | Maker stamps / faction sigils | Per engineer + faction | | Bespoke Legendary art | Hand-finished art for truly unique named items only | | Ingredient & raw material sprites | One per ingredient | | Patron portraits | One per patron + faction set dressing | | Zone backdrops | One per zone, possibly day/night/weather variants | | UI, icons, frames, particles | Standard authored assets | | Audio (music + SFX) | Pre-bundled | ## How AI fits in The workstation (RTX 6000 Ada) runs local diffusion + audio models during **production**. Two distinct concerns: ### Conversion pipeline (solved, scripted) This is what produces consistent palette / resolution / "honest pixel art" output regardless of the source generator's style drift. For static assets (items, portraits, backdrops, UI elements): 1. Generate high-res candidate (Flux, z-image, or comparable diffusion model) 2. Downsample to target pixel resolution 3. Apply Floyd-Steinberg dither against the master 256-color palette 4. Optional hand-clean of ambiguous pixels 5. Commit For animated sprites (creatures, NPCs, characters in motion): 1. Generate the still character in Flux (locks the design) 2. Animate the still into a video clip with Wan 2.2 (or comparable) 3. Keyframe extract at 4 fps 4. Crop and convert each frame through the palette-quantize step 5. Commit as sprite sheet This is fully programmatic — see `samples/` for a real output of this pipeline (the skeleton attack frame). ### Prompting discipline (ongoing, not a script) Style consistency, character consistency, mood, composition, perspective, framing — these are *prompting* concerns, not pipeline concerns. The conversion pipeline cannot rescue a wrong-feeling prompt; it can only normalize color and resolution. This means: - Lock prompt templates per asset type (a "workshop interior backdrop" prompt template; an "enemy creature attack pose" template; a "minion bust portrait" template) - Use LoRAs / IP-Adapter / character reference for character consistency across animations and gear updates - Curate aggressively at generation time — bad source compositions stay bad after dithering - Maintain a prompt library / prompt cookbook alongside the asset library **The prompting discipline is the ongoing art direction work. The pipeline just guarantees the floor.** ### Audio Local audio gen (musicgen or similar) for ambient beds and SFX, following the audio direction in [10-tone.md](10-tone.md). Same conceptual split: prompting/curation matters, downstream encoding/normalization is automatable. No part of any pipeline runs on the player's device or talks to a server during play. ## Bundle strategy Composition collapses asset volume substantially compared to authoring finished items. - Rough order of magnitude (post-composition shift): ~60 item bases × ~10KB + ~20 material tints + ~150 component overlays × ~5KB + ~40 effect auras + ~300 minion portrait elements + 50 zones + UI/audio. Order of tens to low hundreds of MB rather than several hundred. - Likely approach: **base bundle + on-demand asset packs per region/discipline** so first launch is fast and the player downloads more as they unlock. - Texture atlasing for the sprite-heavy parts (component overlays, gear overlays, item bases). - Bespoke Legendary art lives in its own pack, downloaded as Legendaries are first encountered. ## Minimum viable content for v1 Don't try to ship the full catalog at launch. A playable v1 is more like: - 1–2 regions with ~5 zones total - 3 stations (Forge, Alchemy Table, Loom — covers most fantasy crafting cliches) - ~150 catalog items - ~20 gear overlays per slot - ~40 minion portrait permutations - 1–2 patron arcs Then expand by content patch. ## Open questions - How much hand-finishing per AI-generated asset before it ships? (Quality bar matters more than volume.) - Style consistency across thousands of assets — do we lock to one diffusion model + one LoRA stack and never change, or accept some drift? - Audio: original score or licensed loops?