Procedural BGM and SFX with the Web Audio API — No Sample Files
A hands-on guide to building game BGM and SFX purely with the Web Audio API — from basic Oscillator / Envelope / Filter chains to traps like iOS Safari unlock.
Why Not External Samples?
When I first decided to add sound to the canvas games, the obvious choice was .mp3 or .wav files. I went a different direction for a few reasons.
- Bundle size: a single short SFX can run to hundreds of KB. Ten of them reach megabytes quickly.
- Licensing: even free assets need their source and license tracked (CC0 waives attribution, but you still have to record where each file came from).
- Variation: a sample plays exactly as recorded. It's hard to give each situation a subtly different feel.
- Consistent tone: samples collected from different sources vary wildly in timbre and volume.
So I chose to synthesize everything with the Web Audio API. The result: a site with zero audio files in its bundle.
This post captures the practical patterns I accumulated along the way.
Web Audio Basics — Three Nodes Is Enough
Surprisingly, most complex synthesis comes down to three nodes.
| Node | Role |
|---|---|
| OscillatorNode | Generate waveforms by frequency (sine / square / sawtooth / triangle) |
| GainNode | Volume control + time-domain control (envelope) |
| BiquadFilterNode | Frequency filter (lowpass / highpass / bandpass / peaking) |
Wire them: Oscillator → Filter → Gain → destination.
A simple beep:
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const gain = ctx.createGain();
osc.frequency.value = 440; // A4
osc.type = 'sine';
osc.connect(gain);
gain.connect(ctx.destination);
osc.start();
osc.stop(ctx.currentTime + 0.2); // stop after 200ms
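The beep above leaves out the filter. As a rough sketch of the full Oscillator → Filter → Gain → destination chain, something like the following should work. The helper name playFilteredBeep, the 1200 Hz cutoff, and the small "Like" interfaces are assumptions made for illustration; the interfaces exist only so the sketch type-checks without DOM typings, and a real AudioContext should satisfy them.

```typescript
// Minimal structural types (assumed for this sketch) so it type-checks
// outside a browser; real Web Audio objects expose these members.
interface ParamLike { value: number }
interface NodeLike { connect(destination: any): unknown }
interface OscLike extends NodeLike {
  type: string;
  frequency: ParamLike;
  start(when?: number): void;
  stop(when?: number): void;
}
interface FilterLike extends NodeLike { type: string; frequency: ParamLike }
interface GainLike extends NodeLike { gain: ParamLike }
interface CtxLike {
  currentTime: number;
  destination: unknown;
  createOscillator(): OscLike;
  createBiquadFilter(): FilterLike;
  createGain(): GainLike;
}

// Hypothetical helper: Oscillator -> Filter -> Gain -> destination.
function playFilteredBeep(ctx: CtxLike, freq = 440, cutoff = 1200, seconds = 0.2): void {
  const osc = ctx.createOscillator();
  const filter = ctx.createBiquadFilter();
  const gain = ctx.createGain();

  osc.type = 'square';             // harmonically rich, so the filter is audible
  osc.frequency.value = freq;
  filter.type = 'lowpass';         // attenuate harmonics above the cutoff
  filter.frequency.value = cutoff;
  gain.gain.value = 0.2;           // keep the level well below full scale

  osc.connect(filter);
  filter.connect(gain);
  gain.connect(ctx.destination);

  osc.start();
  osc.stop(ctx.currentTime + seconds);
}
```

In a browser, calling playFilteredBeep(new AudioContext()) from a click handler should give a slightly muffled square-wave beep. It still starts and stops abruptly, which is the problem the next section deals with.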
Envelope — Shaping Sound Smoothly
Just calling start/stop produces a harsh click. Modulate GainNode's volume over time to implement ADSR (Attack-Decay-Sustain-Release) for smoothness.
const now = ctx.currentTime;
gain.gain.setValueAtTime(0, now);
gain.gain.linearRampToValueAtTime(0.3, now + 0.02); // Attack 20ms
gain.gain.linearRampToValueAtTime(0.15, now + 0.05); // Decay
gain.gain.setValueAtTime(0.15, now + 0.2); // Sustain
gain.gain.linearRampToValueAtTime(0, now + 0.3); // Release
That alone turns a harsh "beep" into a musical "ding".
⚠️ Trap: Consecutive linearRamp calls can jump from the previous ramp's endpoint. Always reset before scheduling new ramps: cancelScheduledValues(now); setValueAtTime(currentValue, now).
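To make the reset step concrete, here is a minimal sketch of an envelope helper that always cancels and re-pins the gain before scheduling. The name scheduleEnvelope and the option names are my own for illustration. It also assumes that reading .value mid-ramp reflects the current automated value, which is worth verifying in the browsers you target.

```typescript
// The subset of AudioParam this sketch relies on (a real AudioParam fits).
interface EnvelopeParam {
  value: number;
  cancelScheduledValues(time: number): unknown;
  setValueAtTime(value: number, time: number): unknown;
  linearRampToValueAtTime(value: number, time: number): unknown;
}

// Hypothetical option names; times are in seconds, levels are gain values.
interface EnvelopeOptions {
  attack: number;
  decay: number;
  sustainTime: number;
  release: number;
  peak: number;
  sustainLevel: number;
}

// Schedules attack, decay, sustain, and release starting at `now`.
// Returns the time at which the envelope reaches zero.
function scheduleEnvelope(param: EnvelopeParam, now: number, o: EnvelopeOptions): number {
  // Reset: drop pending events, then pin the current value as the ramp origin.
  param.cancelScheduledValues(now);
  param.setValueAtTime(param.value, now);

  const attackEnd = now + o.attack;
  const decayEnd = attackEnd + o.decay;
  const sustainEnd = decayEnd + o.sustainTime;
  const releaseEnd = sustainEnd + o.release;

  param.linearRampToValueAtTime(o.peak, attackEnd);          // Attack
  param.linearRampToValueAtTime(o.sustainLevel, decayEnd);   // Decay
  param.setValueAtTime(o.sustainLevel, sustainEnd);          // Sustain (hold)
  param.linearRampToValueAtTime(0, releaseEnd);              // Release
  return releaseEnd;
}
```

A fresh GainNode defaults to a gain of 1, so set gain.gain.value = 0 before the first trigger; otherwise the attack ramps down from full volume. The returned time is a convenient argument for osc.stop(), since the oscillator then stops only after the gain has reached zero.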
Real Layering — Building Engine Rumble
Thick sounds like a rocket engine are made by layering multiple voices. The Rocket Workshop engine sound in this project uses a 4-layer structure.
| Layer | Source | Filter |
|---|---|---|
| Main roar | White noise | LPF 200Hz + peaking 50Hz |
| Exhaust crackle | White noise | BPF 450Hz |
| Sub-bass | Sine 20Hz | — |
| Pulse | Triangle 35Hz | — |
Each layer mixes through its own GainNode and merges at a master bus. Making each layer's gain respond differently to the throttle value creates a realistic sense of engine output.
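Two pieces of this are easy to show in isolation: filling a buffer with white noise, and mapping one throttle value to four layer gains. The sketch below is only an illustration. The curve shapes and constants in layerGains are invented for the example and are not the project's actual values, and both function names are hypothetical.

```typescript
// Fill a mono channel with white noise: uniform random samples in [-1, 1).
// `rand` is injectable so the function can be tested deterministically.
function fillWhiteNoise(data: Float32Array, rand: () => number = Math.random): void {
  for (let i = 0; i < data.length; i++) {
    data[i] = rand() * 2 - 1;
  }
}

type EngineLayer = 'roar' | 'crackle' | 'sub' | 'pulse';

// Map throttle (0..1) to a gain per layer. Each layer follows a different
// curve so the mix changes character, not just loudness, as throttle rises.
// All constants here are assumptions chosen for illustration.
function layerGains(throttle: number): Record<EngineLayer, number> {
  const t = Math.min(1, Math.max(0, throttle)); // clamp out-of-range input
  return {
    roar: 0.1 + 0.5 * t,          // always present, grows linearly
    crackle: 0.4 * t * t,         // only prominent near full throttle
    sub: 0.3 * Math.sqrt(t),      // comes in early, then levels off
    pulse: 0.05 + 0.2 * (1 - t),  // idle chug that fades as thrust rises
  };
}
```

In a browser, the noise would typically feed a looping AudioBufferSourceNode: create a buffer with ctx.createBuffer(1, ctx.sampleRate * 2, ctx.sampleRate), pass buffer.getChannelData(0) to fillWhiteNoise, and set loop = true on the source. When the throttle changes, applying each value with gainNode.gain.setTargetAtTime(value, ctx.currentTime, 0.05) smooths the transition instead of stepping the gain.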
Separating BGM and SFX
An early mistake. I initially wired everything into one master gain, then came the requirement: "the mute button should only silence BGM". Separation became necessary.
// Two-bus structure
let _bgmGain: GainNode | null = null;
let _sfxGain: GainNode | null = null;
function init(ctx: AudioContext) {
  _bgmGain = ctx.createGain();
  _sfxGain = ctx.createGain();
  _bgmGain.connect(ctx.destination);
  _sfxGain.connect(ctx.destination);
}
// Mute button adjusts BGM only
setBGMVolume(muted ? 0 : 1);
// SFX always plays
Game-feedback SFX that goes silent makes the game feel broken, even when the player has hit "mute". Other games I used as references behave the same way.
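The snippet above calls setBGMVolume without showing it. One way it could be written is as a small factory bound to a bus, as sketched here. The name makeBusVolumeSetter, the injected clock, and the 50 ms fade are assumptions; the short fade is there because stepping a gain straight to 0 can itself click.

```typescript
// The subset of AudioParam used here (a real GainNode's `gain` fits).
interface RampParam {
  value: number;
  cancelScheduledValues(time: number): unknown;
  setValueAtTime(value: number, time: number): unknown;
  linearRampToValueAtTime(value: number, time: number): unknown;
}
interface Bus { gain: RampParam }

// Hypothetical factory: returns a volume setter for one bus.
// `clock` supplies the current audio time, e.g. () => ctx.currentTime.
function makeBusVolumeSetter(bus: Bus, clock: () => number, fadeSeconds = 0.05) {
  return (volume: number): void => {
    const target = Math.min(1, Math.max(0, volume)); // clamp to 0..1
    const now = clock();
    // Reset pending automation, then fade from the current level.
    bus.gain.cancelScheduledValues(now);
    bus.gain.setValueAtTime(bus.gain.value, now);
    bus.gain.linearRampToValueAtTime(target, now + fadeSeconds);
  };
}
```

With the two-bus setup, this would be wired once after init, for example const setBGMVolume = makeBusVolumeSetter(_bgmGain, () => ctx.currentTime). The SFX bus never gets a setter bound to the mute button, so it keeps playing.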
The Module-Singleton Pattern
For safe init/cleanup across components, a module-level singleton is convenient.
// audio/engineRumble.ts
let _ctx: AudioContext | null = null;
let _timer: ReturnType<typeof setTimeout> | null = null;
let _active = false;
export function startEngineRumble() {
  if (_active) return;
  _ctx = new AudioContext();
  _active = true;
  // ... build node graph
}
export function stopEngineRumble() {
  if (!_active) return;
  // ... clean up
  _ctx = null;
  _active = false;
}
Expose the start/stop pair; keep internal state in module scope. That keeps the context's lifecycle explicit: it is created on start and released on stop.
iOS Safari AudioContext Unlock
iOS Safari only allows audio playback when resume() is called directly inside a user gesture callback. Chains like onClick → setState → useEffect → start run after React's batched flush, which severs the gesture context.
Fix pattern: inject a dummy context unlock at the start of the onClick handler.
onClick={(e) => {
  // iOS Safari: dummy unlock activates the user-gesture counter
  try {
    const AC = window.AudioContext || (window as any).webkitAudioContext;
    if (AC) {
      const u = new AC();
      // Swallow a rejected resume() so it cannot surface as an unhandled rejection
      void u.resume().catch(() => {}).finally(() => u.close());
    }
  } catch {}
  // Your normal start logic
  setState('playing');
}}
The dummy ctx isn't actually used, but the call itself activates iOS's user-gesture counter, letting the real AudioContext resume afterward.
Node Cleanup — stop + disconnect + null
Calling osc.stop() alone leaves the node connected and still referenced from module scope, so the garbage collector can't reclaim it. Over time, that leaks memory.
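function cleanup() {
  // 1. Stop the source and cancel pending automation
  try {
    _osc?.stop();
  } catch {
    // stop() throws InvalidStateError if start() was never called
  }
  if (_ctx) _gain?.gain.cancelScheduledValues(_ctx.currentTime);
  // 2. Disconnect
  _osc?.disconnect();
  _gain?.disconnect();
  // 3. Null references
  _osc = null;
  _gain = null;
  // 4. Close context too (close() returns a promise)
  void _ctx?.close();
  _ctx = null;
}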
All four steps matter. stop alone is half done.
Cave Reverb Routing — An Unexpected Volume Explosion
When adding a "cave-like" spatial feel via a simple DelayNode reverb, I initially made this mistake:
// ❌ Wrong — connecting raw oscillator directly to delay
osc.connect(caveDelay);
Symptom: the volume exploded and osc.stop() produced a loud click.
Cause: a raw oscillator without an envelope runs at amplitude ~1.0. When stop cuts the waveform abruptly, the discontinuity plays back as a click.
// ✅ Fix — always route through an envelope
osc.connect(env);
env.connect(caveDelay);
Rule: never wire a raw oscillator directly into Delay / Filter / Reverb. Always insert a Gain (envelope) in between.
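As a sketch of what the corrected routing might look like end to end, the helper below puts the envelope gain in front of both the dry path and the delay. The feedback loop, the 0.9 feedback cap, the default times, and the name buildCaveSend are all assumptions; the post only says the effect is a simple DelayNode reverb. The small interfaces again exist only so the sketch type-checks without DOM typings.

```typescript
interface ParamLike { value: number }
interface NodeLike { connect(destination: any): unknown }
interface GainLike extends NodeLike { gain: ParamLike }
interface DelayLike extends NodeLike { delayTime: ParamLike }
interface CaveCtx<G extends GainLike, D extends DelayLike> {
  destination: unknown;
  createGain(): G;
  createDelay(maxDelayTime?: number): D;
}

// Hypothetical helper: source -> env -> (dry + delayed echo) -> destination.
// The source never touches the delay directly; everything passes through env.
function buildCaveSend<G extends GainLike, D extends DelayLike>(
  ctx: CaveCtx<G, D>,
  source: NodeLike,
  delaySeconds = 0.18,
  feedback = 0.35,
  wet = 0.4,
) {
  const env = ctx.createGain();
  env.gain.value = 0; // silent until an envelope is scheduled on env.gain

  const delay = ctx.createDelay(1);
  delay.delayTime.value = delaySeconds;

  const fb = ctx.createGain();
  // Feedback must stay below 1, or each echo is louder than the last.
  fb.gain.value = Math.min(Math.max(feedback, 0), 0.9);

  const wetGain = ctx.createGain();
  wetGain.gain.value = wet;

  source.connect(env);
  env.connect(ctx.destination); // dry path
  env.connect(delay);           // send into the echo
  delay.connect(fb);
  fb.connect(delay);            // feedback loop (legal because it contains a delay)
  delay.connect(wetGain);
  wetGain.connect(ctx.destination);

  return { env, delay, fb, wetGain };
}
```

Because env starts at zero and is the only way into the delay, the delay line never sees a full-amplitude waveform or the abrupt cutoff at osc.stop(), as long as the release ramp on env.gain finishes before the oscillator stops.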
Adoption Checklist
If you're starting with Web Audio synthesis:
- Understand the Oscillator + Filter + Gain three-node structure
- Never start/stop without an envelope (ADSR): clicks guaranteed
- Before consecutive linearRampToValueAtTime calls, reset with cancelScheduledValues
- Separate BGM / SFX buses
- Module singleton exposing paired start/stop
- iOS Safari dummy-unlock trick
- Cleanup = stop + disconnect + null + close (4 steps)
- Raw oscillators must never connect directly to filter / reverb
Retrospective
Web Audio synthesis feels like "why do it this way?" when you start, but once it clicks, the freedom makes it hard to go back to samples. Engine tone shifts in real time with the throttle, each situation gets a subtly different feedback SFX, and you never add a byte of audio to the bundle.
Deep music theory isn't needed. A sense of "higher frequency = brighter, lower = heavier" is enough for most game SFX. The rest is approximating a desired feel with oscillator combinations.
If you're tired of managing external samples or worried about bundle size, it's worth dipping a toe into synthesis at least once.