# Audio

Audio is the weakest area of solo production capacity, so the plan here is **pragmatic and refinable**: a workable starting point that can ship and improve over time, not a finished design.

## Direction

**Music: Kelethin-style mellow fantasy ambient.** Slow, modal, woodwind-forward with string accompaniment. Long compositions with lots of breathing room that stay out of the way.

Reference set:

- **EverQuest Kelethin / Greater Faydark / Felwithe**: the canonical "wistful fantasy ambient" sound, slow flowing melodies that loop without feeling looped
- **Diablo II Tristram**: atmospheric solo acoustic guitar, the gold-standard town-hub theme
- **Machinarium**: minimal piano and wordless vocal, very atmospheric
- **Bastion**: plucked strings, folk percussion, atmospheric
- **Planescape: Torment / Baldur's Gate**: orchestral fantasy, often subdued

**SFX: clean, satisfying, era-appropriate.** Hammer rings, page turns, glass clinks, soft chimes, low rumbles. Nothing cartoonish, nothing synthetic-sounding. Sample packs form the foundation, layered, tweaked, and rotated so that nothing becomes a Wilhelm scream.

## Music production approach

### MIDI generation pipeline (recommended)

1. Generate MIDI compositions with AI tools (ACE / similar) using prompt parameters: mood = "mellow fantasy ambient," tempo = 60-80 BPM, instrumentation = woodwinds + strings + harp, length = 3-5 min, key/mode = Aeolian / Dorian / Phrygian for a fantasy modal feel
2. Curate the best results
3. Optional MuseScore pass for hand cleanup
4. Render through a high-quality soundfont (FluidR3_GM, GeneralUser GS, or a specialty fantasy soundbank) via FluidSynth
5. Export as OGG Vorbis at 96-128 kbps, mono or low-bitrate stereo
6. Keep MIDI source files committed alongside the audio so re-instrumentation is possible later

**Why MIDI-source over direct audio gen:**

- Tiny working files (tens of KB vs. multiple MB)
- Procedurally controllable (transpose, tempo shift, re-instrument at runtime if you want)
- Better-quality results from AI tools than direct audio gen for instrumental music
- Vanilla MIDI sounds dated, but a good soundfont makes it sound like real instruments

### Anti-repetition strategies

The Kelethin trick is that the music never feels like it's looping at you. Apply these:

- **Long compositions**: minimum 3 min per loop, ideally 5+
- **Multiple variants per location**: 2-3 themes for each major surface, with random crossfades between them
- **Time-of-day mixes**: if Guild Hall has morning/evening backdrop variants, the music has matching mixes (lighter morning instrumentation, denser evening)
- **Silence is allowed**: music doesn't play 100% of the time. 30-60 sec gaps with just ambient SFX (hearth crackle, wind) are restorative, not awkward
- **Same key signature across related themes**: when crossfading between Guild Hall and Workshop themes, key continuity makes transitions feel natural

## Music inventory (working set for v1)

| Surface | Theme |
|---|---|
| Guild Hall (home) | 2-3 mellow themes, primary ambient; the player hears these most |
| Forge | Workshop variant, slightly more rhythmic to match hammer ambience |
| Alchemy Table | Wispy, sustained chords, glass-bell accents |
| Library | Quieter, sparser piano + cello |
| Patron Court | Slightly more formal/regal but still mellow |
| Each region (~10) | One zone theme per region, all in related modes for tonal coherence |
| Special stings | Short (5-10 sec) musical moments for: rank up, Legendary craft, Patron completion |

## SFX production approach

### Sourcing

- **Sonniss GDC bundles** (free, ~30 GB+/year): covers most generic SFX needs
- Targeted purchases from Sound Effect Studio, GameDev Market, Boom Library for specific gaps
- **Layer simple sounds for complexity**: a "craft completion" can be hammer + chime + soft applause, which gives a unique signature even though each component is a stock sound
- **Pitch/tempo shift** in a DAW to create variants: one source sound, six different timbres

### Anti-Wilhelm discipline

- Every frequently played UI sound has **4-6 variants** in a rotation pool
- The audio engine picks one at random per trigger and never plays the same variant twice in a row
- Workshop ambient loops are slow-evolving (no obvious 2-second loop point)

## SFX inventory (working set for v1)

### UI

- Tap (3-4 variants: soft thunk, light click, small chime)
- Swipe / tab change (2-3 variants)
- Modal open / close
- Notification chime (3 tonal variants: info / positive / warning)
- Badge appears on Guild Hall scene (subtle ping)

### Workshop ambience (looping under music)

- Forge: bellows, hammer rings, occasional metal clang
- Alchemy: bubbling, glass clink, drip
- Loom: rhythmic click-shuttle
- Carpenter: saw, plane, chisel
- Library: page turn, quill scratch, candle flicker

### Crafting (per [13-reveal-choreography.md](13-reveal-choreography.md))

- Crude reveal: soft thump
- Common reveal: hammer crunch
- Fine reveal: hammer + metallic chime
- Masterwork reveal: hum → rising chime → metallic clang → soft choir flourish
- Legendary reveal: low rumble → slow rising swell → *crack* of light → choir + organ swell
- Critical-success break: rising "wait, no" tone before upgrading
- New Discovery: parchment whoosh + book shut

### Expedition

- Door open / close (Guild Hall front door)
- Soft footsteps fade-out (departing party)
- Soft footsteps fade-in (returning party)
- Coin drop (loot received)
- Map page flip (world map open)

### Ambient (per backdrop)

- Hearth crackle
- Wind (soft, looping, with subtle variation)
- Rain (for relevant zones)
- Distant tavern murmur (for town-adjacent zones)
- Insect chitter (for forest zones)
- Cave drip (for subterranean zones)

## Mobile audio engineering

- **Format**: OGG Vorbis throughout (smaller than MP3 at comparable quality, no licensing concerns, supported natively by most game engines)
- **Music**: 96-128 kbps, mono or low-bitrate stereo
- **SFX**: 64-96 kbps, mono (many phone speakers are effectively mono, and mono saves space)
- **Master normalization**: all assets normalized to a consistent loudness target so nothing pops at full volume
- **Bus mixing**: separate buses for Music / Ambient / SFX / UI with independent volume sliders in settings (an accessibility requirement, and also good for player preference)
- **No spatial audio needed**: this is a 2D card game, so no 3D positioning is required. Slight stereo positioning of workshop sounds (forge slightly left, alchemy slightly right) is a nice touch but optional.

## Accessibility

- Independent volume sliders per bus (Music / Ambient / SFX / UI)
- "Reduce ambient loops" setting (some players hate constant hammers / bubbling under everything)
- Music-off / SFX-only mode
- Visual feedback for all important audio cues (the reveal choreography is already visually complete; audio enhances, never replaces)

## Open questions

- Which AI MIDI tool ends up best for this style: ACE, MuseNet-style, or something newer
- Soundfont selection: do we ship with FluidR3_GM (free, decent) or invest in a fantasy-specific soundbank
- How many distinct music tracks for v1: probably 8-12 minimum (Hall + a few workshops + ~3 zone themes + stings)
- Total audio budget on disk: likely 30-80 MB for music + SFX combined, plenty for mobile
- Voice acting: none planned (matches our "no LLM in critical path" discipline and saves significant scope), but worth confirming
- Haptic feedback on mobile, paired with audio cues (vibration patterns per quality band): meaningful enhancement or unnecessary?
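
The 30-80 MB disk budget in the open questions can be sanity-checked with simple bitrate arithmetic. The sketch below is a rough estimate only: the track counts, durations, and SFX counts are hypothetical placeholders, and Vorbis is variable-bitrate, so real files will deviate from the nominal figure.

```python
def ogg_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Approximate encoded size in megabytes (10^6 bytes) at an average bitrate."""
    return bitrate_kbps * 1000 / 8 * seconds / 1_000_000


# Hypothetical v1 music set: 12 tracks of 4 minutes each at 112 kbps.
music_mb = 12 * ogg_size_mb(112, 4 * 60)

# Hypothetical SFX set: 150 one-shots of 1.5 s plus 10 ambient loops of 60 s,
# all at 80 kbps mono.
sfx_mb = 150 * ogg_size_mb(80, 1.5) + 10 * ogg_size_mb(80, 60)

total_mb = music_mb + sfx_mb
print(f"music ~{music_mb:.0f} MB, sfx ~{sfx_mb:.0f} MB, total ~{total_mb:.0f} MB")
```

Under these assumptions the music comes to about 40 MB and the total to just under 50 MB, which sits inside the 30-80 MB range. Music dominates, so track count and track length are the levers to pull if the budget tightens.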
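
The rotation rule under "Anti-Wilhelm discipline" (pick a variant at random per trigger, never the same one twice in a row) could be sketched as follows. This is a minimal illustration, not tied to any particular audio engine; the `VariantPool` class and the file names are hypothetical.

```python
import random


class VariantPool:
    """Rotation pool that never returns the same variant twice in a row."""

    def __init__(self, variants, rng=None):
        if len(variants) < 2:
            raise ValueError("need at least two variants to avoid repeats")
        self.variants = list(variants)
        self.rng = rng or random.Random()
        self.last = None

    def pick(self):
        # Choose uniformly among every variant except the one played last time.
        candidates = [v for v in self.variants if v != self.last]
        self.last = self.rng.choice(candidates)
        return self.last


# Hypothetical asset names for the UI tap sound.
tap = VariantPool(["tap_thunk.ogg", "tap_click.ogg", "tap_chime.ogg", "tap_soft.ogg"])
sequence = [tap.pick() for _ in range(20)]
```

Excluding only the previous pick keeps the selection close to uniform while guaranteeing no back-to-back repeats. A stricter variant could keep a short history instead of a single `last` value, at the cost of more predictable rotation in small pools.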
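
Steps 4 and 5 of the MIDI generation pipeline (render via FluidSynth, export as OGG Vorbis) might be scripted along these lines. Treat it as a sketch under assumptions: the source does not name an encoder, so `ffmpeg` with `libvorbis` is a stand-in, and the paths, the `render_commands` helper, and the 112 kbps default are all hypothetical. The function only builds the command lines; it does not run them.

```python
from pathlib import Path


def render_commands(midi, soundfont, out_dir="build/audio", bitrate_kbps=112, channels=2):
    """Return two argument lists: MIDI -> WAV (FluidSynth), then WAV -> OGG (ffmpeg)."""
    midi = Path(midi)
    wav = Path(out_dir) / (midi.stem + ".wav")
    ogg = wav.with_suffix(".ogg")
    # fluidsynth: -n no MIDI input device, -i no interactive shell,
    # -F render to a file as fast as possible, -r sample rate in Hz.
    synth = ["fluidsynth", "-ni", "-F", str(wav), "-r", "44100",
             str(soundfont), str(midi)]
    # ffmpeg (assumed encoder): libvorbis at the target bitrate and channel count.
    encode = ["ffmpeg", "-y", "-i", str(wav), "-c:a", "libvorbis",
              "-b:a", f"{bitrate_kbps}k", "-ac", str(channels), str(ogg)]
    return synth, encode


# Hypothetical paths for one Guild Hall theme.
synth_cmd, encode_cmd = render_commands(
    "music/guild_hall_01.mid", "soundfonts/FluidR3_GM.sf2"
)
```

To actually render, pass each list to `subprocess.run(cmd, check=True)` in order, after creating the output directory. Keeping command construction separate from execution makes it easy to batch over every committed MIDI file and to swap the soundfont later without touching the sources.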