tgame/docs/15-audio.md
Parley Hatch 2abfe4abd1 Initial commit: design docs
Working title 'tgame' is provisional. Top-level samples/ and
docs/samples/ are gitignored; visual/art pipeline lives outside
this repo.
2026-05-17 11:16:07 -06:00

Audio

Audio is the weakest area of solo production capacity, so the plan here is pragmatic and refinable — a workable starting point that can ship and improve over time, not a finished design.

Direction

Music: Kelethin-style mellow fantasy ambient. Slow, modal, woodwind-forward with string accompaniment. Long compositions, lots of breathing room, gets out of the way. Reference set:

  • EverQuest Kelethin / Greater Faydark / Felwithe — the canonical "wistful fantasy ambient" sound, slow flowing melodies that loop without feeling looped
  • Diablo II Tristram — solo acoustic guitar atmospheric, the gold-standard town-hub theme
  • Machinarium — minimal piano and wordless vocal, very atmospheric
  • Bastion — plucked strings, folk percussion, atmospheric
  • Planescape: Torment / Baldur's Gate — orchestral fantasy, often subdued

SFX: clean, satisfying, era-appropriate. Hammer rings, page turns, glass clinks, soft chimes, low rumbles. Nothing cartoonish, nothing synthetic-sounding. Sample packs as the foundation, layered + tweaked + rotated so nothing becomes a Wilhelm scream.

Music production approach

  1. Generate MIDI compositions with AI tools (ACE / similar) using prompt parameters: mood = "mellow fantasy ambient," tempo = 60-80 BPM, instrumentation = woodwinds + strings + harp, length = 3-5 min, key/mode = Aeolian / Dorian / Phrygian for fantasy modal feel
  2. Curate the best
  3. Optional MuseScore pass for hand cleanup
  4. Render through a high-quality soundfont (FluidR3_GM, GeneralUser GS, or a specialty fantasy soundbank) via FluidSynth
  5. Export as OGG Vorbis at 96-128 kbps mono or low-bitrate stereo
  6. Keep MIDI source files committed alongside audio so re-instrumentation is possible later
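Steps 4 and 5 can be scripted. The sketch below is one possible batch helper, not a settled pipeline: it assumes the `fluidsynth` and `ffmpeg` binaries are installed and that ffmpeg was built with libvorbis. The function names, the 112 kbps default, and the mono default are illustrative assumptions.

```python
import subprocess
from pathlib import Path


def fluidsynth_cmd(midi: Path, soundfont: Path, wav: Path, rate: int = 44100) -> list[str]:
    """Build the command for an offline FluidSynth render to a WAV file."""
    return [
        "fluidsynth",
        "-n",              # do not open a MIDI input driver
        "-i",              # do not start the interactive shell
        "-F", str(wav),    # render to a file instead of the audio device
        "-r", str(rate),   # output sample rate
        str(soundfont),
        str(midi),
    ]


def vorbis_cmd(wav: Path, ogg: Path, kbps: int = 112, mono: bool = True) -> list[str]:
    """Build the ffmpeg command that encodes the WAV as OGG Vorbis."""
    cmd = ["ffmpeg", "-y", "-i", str(wav), "-c:a", "libvorbis", "-b:a", f"{kbps}k"]
    if mono:
        cmd += ["-ac", "1"]  # downmix to one channel
    return cmd + [str(ogg)]


def render_track(midi: Path, soundfont: Path, out_dir: Path, kbps: int = 112) -> Path:
    """Render one MIDI file to OGG and delete the intermediate WAV."""
    out_dir.mkdir(parents=True, exist_ok=True)
    wav = out_dir / (midi.stem + ".wav")
    ogg = out_dir / (midi.stem + ".ogg")
    subprocess.run(fluidsynth_cmd(midi, soundfont, wav), check=True)
    subprocess.run(vorbis_cmd(wav, ogg, kbps), check=True)
    wav.unlink()
    return ogg
```

Because the MIDI sources stay in the repo (step 6), swapping the soundfont later only means re-running a helper like `render_track` over the same files.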

Why MIDI-source over direct audio gen:

  • Tiny working files (tens of KB vs. multiple MB)
  • Procedurally controllable (transpose, tempo shift, re-instrument at runtime if you want)
  • Better quality results from AI tools than direct audio gen for instrumental music
  • Vanilla MIDI sounds dated, but a good soundfont gets it much closer to real instruments

Anti-repetition strategies

The Kelethin trick is that the music never feels like it's looping at you. Apply these:

  • Long compositions — minimum 3 min per loop, ideally 5+
  • Multiple variants per location — 2-3 themes for each major surface, crossfade between them randomly
  • Time-of-day mixes — if Guild Hall has morning/evening backdrop variants, the music has matching mixes (lighter morning instrumentation, denser evening)
  • Silence is allowed — music doesn't play 100% of the time. 30-60 sec gaps with just ambient SFX (hearth crackle, wind) are restorative, not awkward
  • Same key signature across related themes — when crossfading between Guild Hall and Workshop themes, key continuity makes transitions feel natural
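A minimal sketch of how the variant rotation and the allowed silences could be scheduled. `ThemeScheduler`, `Cue`, and `gap_chance` (the probability that a silence precedes the next theme) are hypothetical names and values, not engine API; the 30-60 second gap range comes from the list above.

```python
import random
from dataclasses import dataclass


@dataclass
class Cue:
    theme: str          # which theme variant to play next
    gap_before: float   # seconds of ambient-only silence before it starts


class ThemeScheduler:
    """Picks the next theme for one location without immediate repeats."""

    def __init__(self, themes, gap_range=(30.0, 60.0), gap_chance=0.5, seed=None):
        if len(themes) < 2:
            raise ValueError("need at least two themes to avoid repeats")
        self.themes = list(themes)
        self.gap_range = gap_range
        self.gap_chance = gap_chance      # assumed value, tune by ear
        self.rng = random.Random(seed)
        self.last = None

    def next_cue(self) -> Cue:
        # Exclude whatever just played so the same theme never runs twice in a row.
        choices = [t for t in self.themes if t != self.last]
        theme = self.rng.choice(choices)
        self.last = theme
        # Sometimes leave a stretch with only ambient SFX before the music returns.
        if self.rng.random() < self.gap_chance:
            gap = self.rng.uniform(*self.gap_range)
        else:
            gap = 0.0
        return Cue(theme, gap)
```

A gap of zero would be handled as a crossfade by whatever plays the cue; the scheduler itself only decides order and spacing.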

Music inventory (working set for v1)

| Surface | Theme |
| --- | --- |
| Guild Hall (home) | 2-3 mellow themes, primary ambient — player hears these most |
| Forge | Workshop variant, slightly more rhythmic to match hammer ambience |
| Alchemy Table | Wispy, sustained chords, glass-bell accents |
| Library | Quieter, sparser piano + cello |
| Patron Court | Slightly more formal/regal but still mellow |
| Each region (~10) | One zone theme per region, all in related modes for tonal coherence |
| Special stings | Short (5-10 sec) musical moments for: rank up, Legendary craft, Patron completion |

SFX production approach

Sourcing

  • Sonniss GDC bundles (free, ~30GB+/year) — covers most generic SFX needs
  • Targeted purchases from Sound Effect Studio, GameDev Market, Boom Library for specific gaps
  • Layer simple sounds for complexity — a "craft completion" can be hammer + chime + soft applause = unique signature, even though each component is a stock sound
  • Pitch/tempo shift in DAW to create variants — same source sound, six different timbres
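For the pitch-shift variants, the relationship between semitones and playback rate is fixed by equal temperament: each semitone multiplies frequency by 2^(1/12). The sketch below spreads six variants across an assumed range of plus or minus 2 semitones; the function names and the spread are illustrative. Plain resampling at these rates changes duration along with pitch, whereas a DAW can shift the two independently.

```python
def playback_rate(semitones: float) -> float:
    """Resampling rate that shifts pitch by the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def variant_rates(count: int = 6, spread: float = 2.0) -> list[float]:
    """Rates for `count` variants spaced evenly from -spread to +spread semitones."""
    if count < 2:
        return [1.0]
    step = 2.0 * spread / (count - 1)
    return [playback_rate(-spread + i * step) for i in range(count)]
```

With an even count, none of the variants sits exactly at the original pitch, which helps the set avoid an obvious "default" sound.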

Anti-Wilhelm discipline

  • Every frequently-played UI sound has 4-6 variants in a rotation pool
  • The audio engine picks one at random per trigger, never repeats consecutively
  • Workshop ambient loops are slow-evolving (no obvious 2-second loop point)
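The rotation-pool rule (random pick, never the same variant twice in a row) can be sketched as follows. `VariantPool` is a hypothetical helper, not an engine API, and the variant names in the test are placeholders.

```python
import random


class VariantPool:
    """Random rotation over sound variants with no consecutive repeats."""

    def __init__(self, variants, seed=None):
        self._variants = list(variants)
        if len(self._variants) < 2:
            raise ValueError("a pool needs at least two variants")
        self._rng = random.Random(seed)
        self._last = None  # index of the variant returned by the previous pick

    def pick(self):
        n = len(self._variants)
        if self._last is None:
            i = self._rng.randrange(n)
        else:
            # Draw from the n-1 other slots, then skip over the previous index.
            i = self._rng.randrange(n - 1)
            if i >= self._last:
                i += 1
        self._last = i
        return self._variants[i]
```

Drawing from `n - 1` slots keeps every other variant equally likely, so the pool does not settle into an audible alternating pattern.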

SFX inventory (working set for v1)

UI

  • Tap (3-4 variants — soft thunk, light click, small chime)
  • Swipe / tab change (2-3 variants)
  • Modal open / close
  • Notification chime (3 tonal variants — info / positive / warning)
  • Badge appears on Guild Hall scene (subtle ping)

Workshop ambience (looping under music)

  • Forge: bellows, hammer rings, occasional metal clang
  • Alchemy: bubbling, glass clink, drip
  • Loom: rhythmic click-shuttle
  • Carpenter: saw, plane, chisel
  • Library: page turn, quill scratch, candle flicker

Crafting (per 13-reveal-choreography.md)

  • Crude reveal: soft thump
  • Common reveal: hammer crunch
  • Fine reveal: hammer + metallic chime
  • Masterwork reveal: hum → rising chime → metallic clang → soft choir flourish
  • Legendary reveal: low rumble → slow rising swell → crack of light → choir + organ swell
  • Critical-success break: rising "wait — no" tone before upgrading
  • New Discovery: parchment whoosh + book shut

Expedition

  • Door open / close (Guild Hall front door)
  • Soft footsteps fade-out (departing party)
  • Soft footsteps fade-in (returning party)
  • Coin drop (loot received)
  • Map page flip (world map open)

Ambient (per backdrop)

  • Hearth crackle
  • Wind (soft, looping, with subtle variation)
  • Rain (for relevant zones)
  • Distant tavern murmur (for town-adjacent zones)
  • Insect chitter (for forest zones)
  • Cave drip (for subterranean zones)

Mobile audio engineering

  • Format: OGG Vorbis throughout (typically smaller than MP3 at comparable quality, patent- and royalty-free, supported by the major game engines)
  • Music: 96-128 kbps, mono or low-bitrate stereo
  • SFX: 64-96 kbps, mono (short one-shot effects gain little from stereo, and many phone speakers are effectively mono; saves space)
  • Master normalization: all assets normalized to a consistent loudness target so nothing pops at full volume
  • Bus mixing: separate buses for Music / Ambient / SFX / UI with independent volume sliders in settings (accessibility requirement, also good for player preference)
  • No spatial audio needed — 2D card game, no 3D positioning required. Slight stereo positioning of workshop sounds (forge slightly left, alchemy slightly right) is a nice touch but optional.
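As a rough illustration of the normalization step, the sketch below computes a per-asset gain from RMS level. It is a simplification: a production pass would more likely measure integrated loudness (LUFS) with a dedicated tool. The -18 dBFS target and the 0.98 peak ceiling are placeholder assumptions, since no loudness target has been chosen yet.

```python
import math


def rms_dbfs(samples) -> float:
    """RMS level of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_sq)


def normalization_gain(samples, target_dbfs: float = -18.0, peak_ceiling: float = 0.98) -> float:
    """Linear gain that moves RMS to the target, capped so the peak never clips."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return 1.0  # nothing to normalize
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    peak = max(abs(s) for s in samples)
    return min(gain, peak_ceiling / peak)
```

Multiplying every sample by the returned gain brings the asset to the shared target, or as close as the peak ceiling allows for very spiky sounds.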

Accessibility

  • Independent volume sliders per bus (Music / Ambient / SFX / UI)
  • "Reduce ambient loops" setting (some players hate constant hammers / bubbling under everything)
  • Music-off / SFX-only mode
  • Visual feedback for all important audio cues (the reveal choreography is already visually complete — audio enhances, never replaces)
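One way the settings above could resolve to a per-bus gain is sketched below. `AudioSettings` and its field names are hypothetical, and treating "Reduce ambient loops" as an attenuation (here to 30%) rather than as playing fewer loops is an assumption.

```python
from dataclasses import dataclass, field

BUSES = ("music", "ambient", "sfx", "ui")


@dataclass
class AudioSettings:
    volumes: dict = field(default_factory=lambda: {b: 1.0 for b in BUSES})
    master: float = 1.0
    music_off: bool = False              # music-off / SFX-only mode
    reduce_ambient: bool = False         # "Reduce ambient loops" toggle
    reduced_ambient_scale: float = 0.3   # assumed attenuation when the toggle is on

    def gain(self, bus: str) -> float:
        """Effective linear gain for a bus, clamped to [0, 1]."""
        if bus not in self.volumes:
            raise KeyError(bus)
        if bus == "music" and self.music_off:
            return 0.0
        g = self.master * self.volumes[bus]
        if bus == "ambient" and self.reduce_ambient:
            g *= self.reduced_ambient_scale
        return max(0.0, min(1.0, g))
```

Keeping the toggles separate from the slider values means turning music back on restores the player's previous volume instead of resetting it.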

Open questions

  • Which AI MIDI tool ends up best for this style — ACE, MuseNet-style, or something newer
  • Soundfont selection — do we ship with FluidR3_GM (free, decent) or invest in a fantasy-specific soundbank
  • How many distinct music tracks for v1 — probably 8-12 minimum (Hall + a few workshops + ~3 zone themes + stings)
  • Total audio budget on disk — likely 30-80 MB for music + SFX combined, plenty for mobile
  • Voice acting — none planned (matches our "no LLM in critical path" discipline and saves significant scope), but worth confirming
  • Haptic feedback on mobile — paired with audio cues (vibration patterns per quality band) — meaningful enhancement or unnecessary?
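The disk-budget question can be sanity-checked with nominal-bitrate arithmetic. This is a rough estimate only: Vorbis is variable-bitrate, container overhead is ignored, and SFX are left out because their count is not fixed yet. The track counts, lengths, and bitrates come from the ranges in this document; the function names are illustrative.

```python
def ogg_size_mb(seconds: float, kbps: float) -> float:
    """Approximate size in MB (10^6 bytes) of audio at a nominal bitrate."""
    return seconds * kbps * 1000 / 8 / 1_000_000


def music_budget_mb(tracks: int, minutes: float, kbps: float) -> float:
    """Total size of `tracks` pieces, each `minutes` long, at `kbps`."""
    return tracks * ogg_size_mb(minutes * 60, kbps)


low = music_budget_mb(tracks=8, minutes=3, kbps=96)     # about 17.3 MB
high = music_budget_mb(tracks=12, minutes=5, kbps=128)  # about 57.6 MB
```

Both ends leave room under the 30-80 MB estimate for SFX and for extra themes if the per-region zone music grows beyond the v1 set.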