Working title 'tgame' is provisional. Top-level samples/ and docs/samples/ are gitignored; visual/art pipeline lives outside this repo.
# Asset Pipeline

AI lives in **production**, not in the runtime. The shipped game is fully self-contained and fully offline: no workstation dependency, no API calls, no surprise model regressions in prod.
## What gets pre-rendered

The asset library is **atomic components**, not finished items. Items get composed at runtime from layered sprites + shader passes; see [07-item-cards.md](07-item-cards.md). Many effects (glow, foil, holo, corruption, heat haze, frost, holy rays, first-reveal sweep) are pure shaders and need no PNG at all.

| Asset | Approach |
|---|---|
| Minion base sprites (face/hair/skin) | Layered sprite library |
| Class costume layers | Layered sprites by class + tier |
| Gear overlay sprites (per slot) | Layered sprite library, hundreds of variants |
| Item base silhouettes | ~30–60 per item category (sword, dagger, vial, ring, etc.) |
| Material tints | ~20 per applicable material; applied via shader or pre-baked overlay |
| Component overlays | Pommel gems, hilt wraps, stoppers, labels, engravings; authored once, recombined |
| Effect auras | Per enchantment / effect type |
| Quality frames + flourishes | 5 quality bands + Legendary specials |
| Maker stamps / faction sigils | Per engineer + faction |
| Bespoke Legendary art | Hand-finished art for truly unique named items only |
| Ingredient & raw material sprites | One per ingredient |
| Patron portraits | One per patron + faction set dressing |
| Zone backdrops | One per zone, possibly day/night/weather variants |
| UI, icons, frames, particles | Standard authored assets |
| Audio (music + SFX) | Pre-bundled |
## How AI fits in

The workstation (RTX 6000 Ada) runs local diffusion + audio models during **production**. Two distinct concerns:
### Conversion pipeline (solved, scripted)

This is what produces consistent palette / resolution / "honest pixel art" output regardless of the source generator's style drift.

For static assets (items, portraits, backdrops, UI elements):
1. Generate high-res candidate (Flux, z-image, or comparable diffusion model)
2. Downsample to target pixel resolution
3. Apply Floyd-Steinberg dither against the master 256-color palette
4. Optional hand-clean of ambiguous pixels
5. Commit
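
Steps 2 and 3 of the static-asset list can be sketched in a few lines. This is a minimal, dependency-free illustration, not the project's actual script: it assumes an image is a nested list of RGB tuples, uses a plain box-filter downsample, and swaps in a two-color stand-in palette for the master 256-color palette. The names `downsample`, `nearest`, and `dither` are hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]


def downsample(img: Image, factor: int) -> Image:
    """Box-filter downsample: average each factor x factor block of pixels."""
    h, w = len(img) // factor, len(img[0]) // factor
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [img[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(tuple(sum(p[c] for p in block) // len(block) for c in range(3)))
        out.append(row)
    return out


def nearest(color: Sequence[float], palette: List[RGB]) -> RGB:
    """Palette entry with the smallest squared RGB distance to `color`."""
    return min(palette, key=lambda p: sum((color[c] - p[c]) ** 2 for c in range(3)))


def dither(img: Image, palette: List[RGB]) -> Image:
    """Floyd-Steinberg error diffusion against a fixed palette."""
    h, w = len(img), len(img[0])
    work = [[[float(v) for v in px] for px in row] for row in img]
    out: Image = [[(0, 0, 0)] * w for _ in range(h)]
    # Classic weights: right 7/16, down-left 3/16, down 5/16, down-right 1/16.
    spread = ((1, 0, 7 / 16), (-1, 1, 3 / 16), (0, 1, 5 / 16), (1, 1, 1 / 16))
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = nearest(old, palette)
            out[y][x] = new
            err = [old[c] - new[c] for c in range(3)]
            for dx, dy, weight in spread:
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and ny < h:
                    for c in range(3):
                        work[ny][nx][c] += err[c] * weight
    return out


# Stand-in palette and a flat mid-gray 8x8 "candidate" (both illustrative only).
PALETTE: List[RGB] = [(0, 0, 0), (255, 255, 255)]
candidate: Image = [[(128, 128, 128)] * 8 for _ in range(8)]
sprite = dither(downsample(candidate, 2), PALETTE)
```

Every output pixel is a palette entry by construction, which is the property the pipeline relies on to absorb the source generator's color drift. A production script would more likely lean on an imaging library than on nested lists; the sketch only shows the shape of the two steps.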
For animated sprites (creatures, NPCs, characters in motion):
1. Generate the still character in Flux (locks the design)
2. Animate the still into a video clip with Wan 2.2 (or comparable)
3. Keyframe extract at 4 fps
4. Crop and convert each frame through the palette-quantize step
5. Commit as sprite sheet

This is fully programmatic; see `samples/` for a real output of this pipeline (the skeleton attack frame).
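
The bookkeeping behind steps 3 and 5 of the animated list (which source frames to keep, and where each lands on the sheet) might look roughly like the sketch below. The function names, the 16 fps source rate, and the 64×64 frame size are assumptions for illustration; the document does not specify them.

```python
from typing import List, Tuple


def keyframe_indices(total_frames: int, source_fps: float,
                     target_fps: float = 4.0) -> List[int]:
    """Indices of the source frames to keep when sampling at target_fps."""
    if total_frames <= 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count and frame rates must be positive")
    if target_fps > source_fps:
        raise ValueError("cannot sample faster than the source clip")
    step = source_fps / target_fps  # source frames per kept frame
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


def sheet_layout(frame_count: int, frame_w: int, frame_h: int,
                 columns: int) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Sheet size in pixels plus the top-left offset of each frame, row-major."""
    rows = -(-frame_count // columns)  # ceiling division
    offsets = [((i % columns) * frame_w, (i // columns) * frame_h)
               for i in range(frame_count)]
    return (columns * frame_w, rows * frame_h), offsets


# Hypothetical clip: 32 frames at 16 fps, packed four frames per row.
kept = keyframe_indices(total_frames=32, source_fps=16.0)
size, offsets = sheet_layout(len(kept), frame_w=64, frame_h=64, columns=4)
```

Each kept frame would then go through the same crop and palette-quantize step as a static asset before being pasted at its offset.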
### Prompting discipline (ongoing, not a script)

Style consistency, character consistency, mood, composition, perspective, framing: these are *prompting* concerns, not pipeline concerns. The conversion pipeline cannot rescue a wrong-feeling prompt; it can only normalize color and resolution.

This means:
- Lock prompt templates per asset type (a "workshop interior backdrop" prompt template; an "enemy creature attack pose" template; a "minion bust portrait" template)
- Use LoRAs / IP-Adapter / character reference for character consistency across animations and gear updates
- Curate aggressively at generation time: bad source compositions stay bad after dithering
- Maintain a prompt library / prompt cookbook alongside the asset library

**The prompting discipline is the ongoing art direction work. The pipeline just guarantees the floor.**
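
One way to make "locked" templates concrete is a small registry keyed by asset type, where only named fields can vary per asset. The sketch below is an assumption about how such a cookbook could be stored, not the project's actual tooling; the template wording, the `STYLE_SUFFIX` text, and the `render` helper are all hypothetical.

```python
from string import Template
from typing import Dict

# Hypothetical shared style clause appended to every prompt.
STYLE_SUFFIX = "flat lighting, limited palette, no text"

# One locked template per asset type; the wording here is illustrative only.
TEMPLATES: Dict[str, Template] = {
    "workshop_backdrop": Template(
        "workshop interior backdrop, $station, wide shot, " + STYLE_SUFFIX),
    "creature_attack_pose": Template(
        "enemy creature attack pose, $subject, side view, " + STYLE_SUFFIX),
    "minion_bust": Template(
        "minion bust portrait, $subject, three-quarter view, " + STYLE_SUFFIX),
}


def render(asset_type: str, **fields: str) -> str:
    """Fill a locked template; unknown types or missing fields raise KeyError."""
    return TEMPLATES[asset_type].substitute(**fields)


prompt = render("creature_attack_pose", subject="skeleton")
```

Because `Template.substitute` raises on a missing placeholder, a half-filled prompt fails loudly instead of reaching the generator, which keeps per-asset variation confined to the named fields.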
### Audio

Local audio gen (musicgen or similar) for ambient beds and SFX, following the audio direction in [10-tone.md](10-tone.md). Same conceptual split: prompting/curation matters, downstream encoding/normalization is automatable.

No part of any pipeline runs on the player's device or talks to a server during play.
## Bundle strategy

Composition collapses asset volume substantially compared to authoring finished items.
- Rough order of magnitude (post-composition shift): ~60 item bases × ~10 KB + ~20 material tints + ~150 component overlays × ~5 KB + ~40 effect auras + ~300 minion portrait elements + 50 zones + UI/audio. That lands in the tens to low hundreds of MB, rather than several hundred MB.
- Likely approach: **base bundle + on-demand asset packs per region/discipline** so first launch is fast and the player downloads more as they unlock.
- Texture atlasing for the sprite-heavy parts (component overlays, gear overlays, item bases).
- Bespoke Legendary art lives in its own pack, downloaded as Legendaries are first encountered.
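
The rough estimate above only attaches a per-asset size to two of its terms. A small tally like the following keeps the sized and unsized parts separate, so the budget can be filled in as real numbers arrive. Counts and sizes are copied from the estimate; `None` marks categories where no per-asset size is given, and the `summarize` helper is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (count, approx KB each). None = the estimate gives no per-asset size yet.
ESTIMATE: Dict[str, Tuple[int, Optional[int]]] = {
    "item bases": (60, 10),
    "material tints": (20, None),
    "component overlays": (150, 5),
    "effect auras": (40, None),
    "minion portrait elements": (300, None),
    "zones": (50, None),
}


def summarize(estimate: Dict[str, Tuple[int, Optional[int]]]) -> Tuple[int, List[str]]:
    """Return the subtotal (KB) of sized categories and the names still unsized."""
    known_kb = sum(count * kb for count, kb in estimate.values() if kb is not None)
    unsized = [name for name, (_, kb) in estimate.items() if kb is None]
    return known_kb, unsized


known_kb, unsized = summarize(ESTIMATE)
print(f"sized so far: {known_kb} KB (~{known_kb / 1024:.1f} MB)")
print("still unsized:", ", ".join(unsized))
```

The two sized terms come to 1350 KB, so nearly all of the headline figure has to come from the categories without a per-asset size, plus UI and audio, which the tally does not model.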
## Minimum viable content for v1

Don't try to ship the full catalog at launch. A playable v1 is more like:
- 1–2 regions with ~5 zones total
- 3 stations (Forge, Alchemy Table, Loom; covers most fantasy crafting cliches)
- ~150 catalog items
- ~20 gear overlays per slot
- ~40 minion portrait permutations
- 1–2 patron arcs

Then expand by content patch.
## Open questions

- How much hand-finishing per AI-generated asset before it ships? (Quality bar matters more than volume.)
- Style consistency across thousands of assets: do we lock to one diffusion model + one LoRA stack and never change, or accept some drift?
- Audio: original score or licensed loops?