tgame/docs/06-asset-pipeline.md
Parley Hatch 2abfe4abd1 Initial commit: design docs
Working title 'tgame' is provisional. Top-level samples/ and
docs/samples/ are gitignored; visual/art pipeline lives outside
this repo.
2026-05-17 11:16:07 -06:00

93 lines
5.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Asset Pipeline
AI lives in **production**, not in the runtime. The shipped game is fully self-contained, fully offline, no workstation dependency, no API calls, no surprise model regressions in prod.
## What gets pre-rendered
The asset library is **atomic components**, not finished items. Items get composed at runtime from layered sprites + shader passes — see [07-item-cards.md](07-item-cards.md). Many effects (glow, foil, holo, corruption, heat haze, frost, holy rays, first-reveal sweep) are pure shaders and need no PNG at all.
| Asset | Approach |
|---|---|
| Minion base sprites (face/hair/skin) | Layered sprite library |
| Class costume layers | Layered sprites by class + tier |
| Gear overlay sprites (per slot) | Layered sprite library, hundreds of variants |
| Item base silhouettes | ~3060 per item category (sword, dagger, vial, ring, etc.) |
| Material tints | ~20 per applicable material; applied via shader or pre-baked overlay |
| Component overlays | Pommel gems, hilt wraps, stoppers, labels, engravings — authored once, recombined |
| Effect auras | Per enchantment / effect type |
| Quality frames + flourishes | 5 quality bands + Legendary specials |
| Maker stamps / faction sigils | Per engineer + faction |
| Bespoke Legendary art | Hand-finished art for truly unique named items only |
| Ingredient & raw material sprites | One per ingredient |
| Patron portraits | One per patron + faction set dressing |
| Zone backdrops | One per zone, possibly day/night/weather variants |
| UI, icons, frames, particles | Standard authored assets |
| Audio (music + SFX) | Pre-bundled |
## How AI fits in
The workstation (RTX 6000 Ada) runs local diffusion + audio models during **production**. Two distinct concerns:
### Conversion pipeline (solved, scripted)
This is what produces consistent palette / resolution / "honest pixel art" output regardless of the source generator's style drift.
For static assets (items, portraits, backdrops, UI elements):
1. Generate high-res candidate (Flux, z-image, or comparable diffusion model)
2. Downsample to target pixel resolution
3. Apply Floyd-Steinberg dither against the master 256-color palette
4. Optional hand-clean of ambiguous pixels
5. Commit
For animated sprites (creatures, NPCs, characters in motion):
1. Generate the still character in Flux (locks the design)
2. Animate the still into a video clip with Wan 2.2 (or comparable)
3. Keyframe extract at 4 fps
4. Crop and convert each frame through the palette-quantize step
5. Commit as sprite sheet
This is fully programmatic — see `samples/` for a real output of this pipeline (the skeleton attack frame).
### Prompting discipline (ongoing, not a script)
Style consistency, character consistency, mood, composition, perspective, framing — these are *prompting* concerns, not pipeline concerns. The conversion pipeline cannot rescue a wrong-feeling prompt; it can only normalize color and resolution.
This means:
- Lock prompt templates per asset type (a "workshop interior backdrop" prompt template; an "enemy creature attack pose" template; a "minion bust portrait" template)
- Use LoRAs / IP-Adapter / character reference for character consistency across animations and gear updates
- Curate aggressively at generation time — bad source compositions stay bad after dithering
- Maintain a prompt library / prompt cookbook alongside the asset library
**The prompting discipline is the ongoing art direction work. The pipeline just guarantees the floor.**
### Audio
Local audio gen (musicgen or similar) for ambient beds and SFX, following the audio direction in [10-tone.md](10-tone.md). Same conceptual split: prompting/curation matters, downstream encoding/normalization is automatable.
No part of any pipeline runs on the player's device or talks to a server during play.
## Bundle strategy
Composition collapses asset volume substantially compared to authoring finished items.
- Rough order of magnitude (post-composition shift): ~60 item bases × ~10KB + ~20 material tints + ~150 component overlays × ~5KB + ~40 effect auras + ~300 minion portrait elements + 50 zones + UI/audio. Order of tens to low hundreds of MB rather than several hundred.
- Likely approach: **base bundle + on-demand asset packs per region/discipline** so first launch is fast and the player downloads more as they unlock.
- Texture atlasing for the sprite-heavy parts (component overlays, gear overlays, item bases).
- Bespoke Legendary art lives in its own pack, downloaded as Legendaries are first encountered.
## Minimum viable content for v1
Don't try to ship the full catalog at launch. A playable v1 is more like:
- 12 regions with ~5 zones total
- 3 stations (Forge, Alchemy Table, Loom — covers most fantasy crafting cliches)
- ~150 catalog items
- ~20 gear overlays per slot
- ~40 minion portrait permutations
- 12 patron arcs
Then expand by content patch.
## Open questions
- How much hand-finishing per AI-generated asset before it ships? (Quality bar matters more than volume.)
- Style consistency across thousands of assets — do we lock to one diffusion model + one LoRA stack and never change, or accept some drift?
- Audio: original score or licensed loops?