| name | description |
|---|---|
| demoscene-webgl-demo | Build audio-synchronized visual demos in a single HTML file using WebGL2/GLSL raymarching. Covers SDF-based 3D scenes, multi-pass rendering with bloom, phase-based timeline choreography, and real-time audio-reactive visuals via Web Audio API. Use when asked to create demoscene demos, WebGL raymarching, audio-reactive graphics, music visualizers, or choreographed visual experiences. |
Single index.html containing all shaders, WebGL setup, audio integration, and
overlays. Self-contained by design — no build step, no dependencies.
- Scene pass — Raymarching with SDFs, lighting, volumetrics, materials
- Bright extract — Threshold filter isolating bloom-worthy pixels
- Gaussian blur — Two-pass separable blur on the bright extract
- Composite — Combines scene + bloom, applies post-effects (glitch, fade, color grading)
Each pass renders to its own framebuffer/texture. Final composite goes to screen.
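For orientation, the bright-extract pass can be a single threshold shader. A minimal GLSL sketch (the uniform and varying names here are assumptions, not the skill's actual interface):

```glsl
// Bright extract: keep only pixels above a luminance threshold
uniform sampler2D sceneTex;  // output of the scene pass
uniform float threshold;     // e.g. 0.8
in vec2 uv;
out vec4 fragColor;

void main() {
  vec3 c = texture(sceneTex, uv).rgb;
  float lum = dot(c, vec3(0.2126, 0.7152, 0.0722)); // Rec. 709 luma
  fragColor = vec4(c * step(threshold, lum), 1.0);
}
```

The blur passes then operate only on this texture, which is what keeps bloom cheap: most pixels are black after the extract.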
This is the most important part of making a demo. The process turns any audio file into a precise choreography map — timestamps, phases, energy curves — before writing any shader code. Bad choreography makes a technically impressive demo feel lifeless.
Before anything else, build a complete picture of the track's energy and structure.
```sh
# Get track duration
ffprobe -v error -show_entries format=duration -of csv=p=0 track.wav

# Generate a waveform image — the single most useful visual reference
# Width of 3000px means ~1px per 100ms for a 5min track
ffmpeg -i track.wav -filter_complex "showwavespic=s=3000x200:colors=white" -frames:v 1 waveform.png

# Generate RMS energy over time (one value per audio frame)
ffmpeg -i track.wav -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level -f null - 2>&1 | grep RMS_level > energy.txt

# Generate a spectrogram — shows frequency content over time
# Useful for distinguishing builds (rising freq) from drops (full spectrum)
ffmpeg -i track.wav -lavfi showspectrumpic=s=3000x400 spectrogram.png
```

Open waveform.png and spectrogram.png side by side. The waveform shows you when energy changes. The spectrogram shows you what kind of energy changes (bass drop vs. hi-hat entrance vs. pad swell).
```js
async function analyzeTrack(file) {
  const arrayBuffer = await file.arrayBuffer();
  const audioCtx = new AudioContext();
  const buffer = await audioCtx.decodeAudioData(arrayBuffer);
  const samples = buffer.getChannelData(0); // mono or left channel
  const sampleRate = buffer.sampleRate;
  const hopSize = Math.floor(sampleRate * 0.1); // 100ms windows
  const events = [];
  for (let i = 0; i < samples.length; i += hopSize) {
    const end = Math.min(i + hopSize, samples.length);
    let rms = 0, peak = 0, zeroCrossings = 0;
    for (let j = i; j < end; j++) {
      const s = Math.abs(samples[j]);
      rms += samples[j] * samples[j];
      if (s > peak) peak = s;
      if (j > i && (samples[j] >= 0) !== (samples[j - 1] >= 0)) zeroCrossings++;
    }
    rms = Math.sqrt(rms / (end - i));
    const brightness = zeroCrossings / (end - i); // proxy for spectral centroid
    events.push({
      time: i / sampleRate,
      rms,        // overall loudness
      peak,       // transient detection
      brightness  // high = hi-hats/cymbals, low = bass/pads
    });
  }
  audioCtx.close();
  return { events, duration: buffer.duration, sampleRate };
}
```

```sh
pip install librosa numpy
```

```python
import librosa
import numpy as np

y, sr = librosa.load('track.wav', sr=22050, mono=True)
duration = librosa.get_duration(y=y, sr=sr)

# Onset detection — finds percussive hits
onset_frames = librosa.onset.onset_detect(y=y, sr=sr, units='frames')
onset_times = librosa.frames_to_time(onset_frames, sr=sr)

# Beat tracking
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

# RMS energy over time
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=512)[0]
rms_times = librosa.frames_to_time(range(len(rms)), sr=sr, hop_length=512)

# Spectral centroid (brightness)
centroid = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=512)[0]

# Chromagram (harmonic content — useful for detecting key changes)
chroma = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=512)

print(f"Duration: {duration:.1f}s")
print(f"BPM: {tempo:.1f}")
print(f"Beats: {len(beat_times)}")
print(f"Onsets: {len(onset_times)}")
print(f"\nFirst 20 onset times:")
for t in onset_times[:20]:
    print(f"  {t:.2f}s")
```

From the analysis, segment the track into sections. Look for these patterns in the waveform/energy data:
| Pattern in waveform | What it means | Musical term |
|---|---|---|
| Near-zero amplitude → gradual rise | Energy building | Intro / Build |
| Sudden jump from low to high | Energy release | Drop |
| Sustained high amplitude | Full arrangement | Chorus / Main section |
| Brief dip in sustained energy | Tension/release cycle | Breakdown |
| High → gradually decreasing | Energy winding down | Outro / Fade |
| Texture change (spectrogram shift) | Instrument swap | Transition |
| Isolated spike in low energy | Percussive accent | Hit |
| Flat near-silence | Space/rest | Void / Silence |
For each section boundary, note the exact timestamp to the nearest 0.1s. These
become your `smoothstep` parameters.
Create a plain-text document — this is the single source of truth for all visual timing. Write it before any shader code.
```
CHOREOGRAPHY MAP — track: [filename] ([duration]s, [BPM] BPM)
═══════════════════════════════════════════════════════════════

SECTIONS:
  [start] - [end]   [NAME] — [description of what happens visually]
  [start] - [end]   [NAME] — [description]
  ...

KEY HITS (one-shot events):
  [time] — [what happens in the music] → [visual response]
  [time] — [what happens] → [visual response]
  ...

ENERGY ARC:
  [prose description of overall energy shape, e.g.
   "slow build → explosive drop → sustained intensity → brief calm →
    second build → climactic transformation → fade"]

MOOD TRANSITIONS:
  [time]: [mood A] → [mood B] (e.g. "warm/organic → cold/digital")
  ...
```
Rules for a good map:
- Every section needs a distinct visual identity (if two sections look the same, merge them)
- Drops must have a visual event — a drop with no visual impact is a wasted moment
- Builds must have visible escalation — if nothing changes over 20s, the audience loses interest
- Quiet sections must actually be quiet visually — contrast makes loud sections louder
- The map should read like a story arc, not a flat list
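For illustration only, a filled-in map for a hypothetical three-minute track (all names and timestamps invented, with boundaries snapped to 124 BPM bar lines) might read:

```
CHOREOGRAPHY MAP — track: example.wav (180.0s, 124 BPM)
═══════════════════════════════════════════════════════════════

SECTIONS:
    0.0 -  30.9   EMERGE    — dark void, single form fades in
   30.9 -  61.9   BUILD     — camera approaches, colors warm, bloom rises
   61.9 - 123.9   MAIN      — full scene, orbiting camera, rich palette
  123.9 - 154.8   BREAKDOWN — desaturate, glitch, camera drifts
  154.8 - 180.0   OUTRO     — pull back, fade to black

KEY HITS (one-shot events):
   61.9 — drop lands → white flash + camera snap
  123.9 — arrangement thins → scanline glitch kicks in

ENERGY ARC:
  slow build → drop → sustained intensity → collapse → fade

MOOD TRANSITIONS:
  123.9: warm/organic → cold/digital
```

Note how every section boundary lands on a bar multiple (1 bar at 124 BPM is about 1.935s, so 30.9s is 16 bars, 61.9s is 32, 123.9s is 64), which is exactly what the bar-snapping step below produces.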
If the track has a steady beat, snapping phase boundaries to bar lines makes choreography feel musical rather than arbitrary.
```js
// Estimate BPM by autocorrelating the RMS energy envelope
function detectBPM(events, minBPM = 60, maxBPM = 180) {
  const energies = events.map(e => e.rms);
  const dt = events[1].time - events[0].time; // analysis hop in seconds
  const minLag = Math.floor(60 / (maxBPM * dt));
  const maxLag = Math.floor(60 / (minBPM * dt));
  let bestLag = minLag, bestCorr = -1;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let corr = 0;
    for (let i = 0; i < energies.length - lag; i++) {
      corr += energies[i] * energies[i + lag];
    }
    corr /= energies.length - lag; // normalize so longer lags aren't penalized
    if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
  }
  return 60 / (bestLag * dt);
}
```

With BPM known, compute the bar grid:
```
BPM: 128 → 1 beat = 0.469s, 1 bar (4/4) = 1.875s
BPM:  80 → 1 beat = 0.750s, 1 bar (4/4) = 3.000s
BPM: 140 → 1 beat = 0.429s, 1 bar (4/4) = 1.714s

Common section lengths:
   4 bars = intro/outro, breakdown
   8 bars = verse, build
  16 bars = chorus, main section
  32 bars = extended section
```
Snap your section boundaries to the nearest bar line. Drops almost always land on beat 1 of a bar. Phase transitions sound best on bar boundaries.
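The bar arithmetic above is easy to script. A minimal JS sketch (`barGrid` and `snapToBar` are hypothetical helper names, not part of the skill's code):

```javascript
// Beat/bar durations from BPM (assumes 4/4 unless told otherwise)
function barGrid(bpm, beatsPerBar = 4) {
  const beat = 60 / bpm;          // seconds per beat
  const bar = beat * beatsPerBar; // seconds per bar
  return { beat, bar };
}

// Snap a raw section-boundary timestamp to the nearest bar line
function snapToBar(t, bpm, beatsPerBar = 4) {
  const { bar } = barGrid(bpm, beatsPerBar);
  return Math.round(t / bar) * bar;
}

console.log(barGrid(128));          // { beat: 0.46875, bar: 1.875 }
console.log(snapToBar(33.1, 128));  // 33.75
```

Run this over every boundary in the choreography map before hardcoding timestamps into shaders.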
For demos where choreography must perfectly track the audio energy — beyond what real-time FFT can provide — bake the energy envelope into a 1D texture:
```js
const energyData = new Uint8Array(events.map(e => Math.min(255, e.rms * 512)));
const energyTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, energyTex);
gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1); // single-channel rows are tightly packed
gl.texImage2D(gl.TEXTURE_2D, 0, gl.R8, energyData.length, 1, 0,
              gl.RED, gl.UNSIGNED_BYTE, energyData);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
```

Sample in the shader:

```glsl
uniform sampler2D energyTex;
uniform float trackDuration;

float bakedEnergy = texture(energyTex, vec2(t / trackDuration, 0.5)).r;
```

This is deterministic and frame-rate independent — no FFT latency, no device variance. Use it for geometry and camera (must be rock-solid). Layer real-time FFT on top for materials (benefits from organic variance).
You can bake multiple channels — pack bass/mid/treble/overall into RGBA for a full 4-band deterministic energy texture.
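A sketch of that packing step, assuming four equal-length envelope arrays already normalized to 0..1 (`packBands` is a hypothetical helper name):

```javascript
// Interleave four band envelopes into RGBA bytes, one texel per
// analysis window: R=bass, G=mid, B=treble, A=overall.
function packBands(bass, mid, treble, overall) {
  const n = bass.length;
  const out = new Uint8Array(n * 4);
  const q = v => Math.max(0, Math.min(255, Math.round(v * 255)));
  for (let i = 0; i < n; i++) {
    out[i * 4 + 0] = q(bass[i]);
    out[i * 4 + 1] = q(mid[i]);
    out[i * 4 + 2] = q(treble[i]);
    out[i * 4 + 3] = q(overall[i]);
  }
  return out;
}

// Upload with gl.RGBA8 / gl.RGBA / gl.UNSIGNED_BYTE instead of the
// single-channel R8 format shown above.
```

In the shader, `texture(energyTex, vec2(t / trackDuration, 0.5))` then yields all four bands as one `vec4`.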
Each section in the choreography map becomes a phase field. Each phase is a
smoothstep that ramps 0→1 over the transition window:
```glsl
// Pattern: ramp in, hold, ramp out
ph.sectionName = smoothstep(startTime, startTime + fadeIn, t)
              * (1.0 - smoothstep(endTime - fadeOut, endTime, t));
```

The `fadeIn` / `fadeOut` durations control transition sharpness:
- `0.1s` — near-instant snap (drops, impacts)
- `0.5-1.0s` — quick but smooth (normal transitions)
- `2.0-5.0s` — gradual blend (mood shifts, slow builds)
- `10.0-30.0s` — glacial evolution (entire build sections)
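Because GLSL's `smoothstep` has a one-line JS equivalent, phase windows can be previewed and unit-tested outside the shader before timestamps are committed. A sketch (`phaseWindow` is a hypothetical helper mirroring the pattern above):

```javascript
// JS port of GLSL smoothstep: clamp, then cubic Hermite
function smoothstep(edge0, edge1, x) {
  const t = Math.min(Math.max((x - edge0) / (edge1 - edge0), 0), 1);
  return t * t * (3 - 2 * t);
}

// Ramp in over fadeIn, hold at 1, ramp out over fadeOut
function phaseWindow(t, start, end, fadeIn, fadeOut) {
  return smoothstep(start, start + fadeIn, t)
       * (1 - smoothstep(end - fadeOut, end, t));
}

// e.g. a breakdown from 60s to 75s, 0.5s ramp in, 2s ramp out
console.log(phaseWindow(59, 60, 75, 0.5, 2)); // 0 (not started)
console.log(phaseWindow(67, 60, 75, 0.5, 2)); // 1 (fully active)
console.log(phaseWindow(76, 60, 75, 0.5, 2)); // 0 (ended)
```

Sweeping `t` over the whole track and printing each phase as a row makes overlap and dead zones obvious at a glance.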
Each key hit becomes an exp() pulse:
float hit = exp(-abs(t - hitTime) * sharpness);
// sharpness 4-6: wide pulse, ~0.5s visible
// sharpness 8-12: tight spike, ~0.2s visible
// sharpness 15-20: near-instant flashEach section type has proven visual approaches:
| Section | Camera | Geometry | Materials | Post-processing |
|---|---|---|---|---|
| Void/Intro | Static or very slow drift | Minimal or absent | Dark, monochrome | Clean, maybe subtle fog |
| Build | Slow approach | Growing, emerging | Warming colors | Increasing bloom |
| Drop | Snap to new angle | Burst/expansion | Bright flash, saturated | Bloom spike, screen shake |
| Main/Chorus | Smooth orbit | Full scene visible | Rich, layered | Moderate bloom |
| Breakdown | Drift, lose focus | Fracture, glitch | Desaturated | UV displacement, scanlines |
| Storm/Peak | Fast movement, close | Maximum complexity | Hot, intense | Heavy bloom, shake |
| Transition | Dolly or whip pan | Morph/transform | Palette swap | Color grading shift |
| Outro/Fade | Slow pull back | Simplifying | Cooling/dimming | Fade to black |
- Play the demo with the track
- Where visuals feel early/late — adjust timestamps ±0.1-0.5s
- Where energy feels flat — add audio multipliers or one-shot pulses
- Where transitions feel abrupt — widen
smoothstepranges - Where transitions feel mushy — narrow
smoothstepranges - Repeat until every musical moment has a visual counterpart
- Every drop has a visual impact (flash, snap, bloom spike)
- Every build has visible escalation (growth, approach, brightening)
- Every breakdown has a visual shift (glitch, desaturation, fracture)
- Camera never stays static for more than ~15s
- Adjacent sections have visual contrast
- Quiet moments are actually quiet (dim, sparse, slow)
- The finale feels like a finale (biggest visual moment, then resolution)
- Mood transitions in the music have corresponding color/material shifts
- One-shot hits land on the beat, not between beats
- The demo still looks good without audio (base values are reasonable)
The sync system is hybrid: hardcoded timestamps from the choreography map provide structure, real-time FFT adds organic responsiveness.
The shader time uniform must come from `audioElement.currentTime`, NOT from a JS
accumulator or `performance.now()`. This locks visuals to the audio playback position
even when frames drop or the browser throttles.

```js
const time = audioEl ? audioEl.currentTime : 0;
gl.uniform1f(timeLoc, time);
```

Extract frequency bands each frame and pass as a vec4 uniform:
```js
analyser.fftSize = 512;                // 256 bins, good latency/resolution balance
analyser.smoothingTimeConstant = 0.55; // responsive but not jittery

const freq = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(freq);

let bass = 0;    for (let i = 0; i < 4; i++)    bass += freq[i];    bass /= 4 * 255;
let mid = 0;     for (let i = 4; i < 16; i++)   mid += freq[i];     mid /= 12 * 255;
let treble = 0;  for (let i = 16; i < 64; i++)  treble += freq[i];  treble /= 48 * 255;
let overall = 0; for (let i = 0; i < 128; i++)  overall += freq[i]; overall /= 128 * 255;

gl.uniform4f(audioLoc, bass, mid, treble, overall); // all normalized 0–1
```

Tuning:
- `fftSize`: 512 is a good default. 256 = snappier but coarser. 1024+ = more frequency detail but more latency.
- `smoothingTimeConstant`: 0.55 is balanced. 0.3 = twitchy (good for percussion-heavy tracks). 0.8 = sluggish (good for ambient/drone).
| Band | Freq range | Visual target | Example |
|---|---|---|---|
| `bass` (audio.x) | Sub/kicks | Camera shake, bloom pulse, ring glow | `col += glow * audio.x * 0.4` |
| `mid` (audio.y) | Melody/pads | Color shifts, surface detail | `col *= 1.0 + audio.y * 0.15` |
| `treble` (audio.z) | Hi-hats/shimmer | Scanline intensity, sparkle | `br *= 0.3 + audio.z * 0.7` |
| `overall` (audio.w) | Full energy | Bloom intensity, global glow | `bloom *= 1.2 + audioE * 0.4` |
Positions, sizes, angles, and shapes must be deterministic from time. Audio
reactivity only affects materials, brightness, glow, and post-processing. This
prevents visual jitter and ensures the demo looks correct even with audio analysis
variance across devices.
Always use audio as a multiplier on a base value, never as the sole driver:

```glsl
// CORRECT — works without audio, enhanced with it
col *= 1.0 + audio.x * 0.2;
bloom *= 1.2 + audioE * 0.4;

// WRONG — black screen when audio is silent
col *= audio.x;
```

Square the bass energy so only strong hits register:

```glsl
float shake = audio.x * audio.x * 0.003;
```

Define a GLSL struct encoding the current act. Each field is a smoothstep blend
(0→1). The struct fields match sections from your choreography map:
```glsl
struct Phase {
  float emerge;     // opening — geometry fades in
  float build;      // tension rising
  float ignite;     // the drop
  float system;     // full scene, cruising
  float storm;      // intensity peak
  float breakdown;  // momentary collapse
  float transform;  // mood/texture shift
  float reveal;     // resolution, finale
  float pulse;      // audio-reactive multiplier (1.0 + audio)
};
```

Adapt the field names and count to your track. A 2-minute ambient piece might have 3 phases. A 5-minute EDM track might have 10+.
```glsl
// Snap transition (drop): 0.2s fade
ph.drop = smoothstep(dropTime - 0.1, dropTime + 0.1, t);

// Smooth transition (mood shift): 2s crossfade
ph.cold = smoothstep(shiftTime - 1.0, shiftTime + 1.0, t);

// Section with start and end (breakdown): ramp in, hold, ramp out
ph.breakdown = smoothstep(bdStart, bdStart + 0.2, t)
            * (1.0 - smoothstep(bdEnd - 0.3, bdEnd, t));

// Gradual build over 25s
ph.build = smoothstep(buildStart, buildEnd, t);

// One-shot key hit
float pulse = exp(-abs(t - hitTime) * sharpness);
```

Geometry grows/shrinks by adding phase-gated terms:
```glsl
radius += ph.build * 0.8;              // grows during build
radius += ph.storm * (0.5 + i * 0.2);  // expands per-instance in storm
radius *= 1.0 - ph.breakdown * 0.3;    // shrinks during breakdown
```

Each term is gated by its phase so effects don't leak into other sections.
Camera position as a sum of phase-gated terms:

```glsl
float camDist = initialDist
  - approach * smoothstep(tA, tB, t)   // move in
  + pullBack * smoothstep(tC, tD, t)   // move out
  - rushIn   * smoothstep(tE, tF, t);  // final approach
```

Each line is one camera move. Easy to read, reorder, and adjust independently. The camera should tell the same story as the music — approaching during builds, snapping on drops, drifting during calm sections, pulling back for the finale.
When objects must not intersect (e.g., rings around a star), compute the inner object's radius dynamically and clamp:
```glsl
float innerR = getInnerRadius() + clearance;
outerR = max(outerR, innerR + index * gap);
```

Layer noise at different frequencies for organic surfaces:
```glsl
float detail = noise(p * 5.0 + t * 0.3) * noise(p * 7.0 - t * 0.5);
col *= 0.75 + detail * 0.5;
```

Animate noise offsets with `t` at different speeds per layer for convincing motion.
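The snippets here assume a `noise()` helper is defined in the shader. One common option is hash-based value noise; a 2D sketch (one of many possible implementations, extend the same pattern to `vec3` for volumetric surfaces):

```glsl
// Pseudo-random value per lattice point
float hash(vec2 p) {
  return fract(sin(dot(p, vec2(127.1, 311.7))) * 43758.5453);
}

// 2D value noise: smooth interpolation between lattice hashes
float noise(vec2 p) {
  vec2 i = floor(p), f = fract(p);
  vec2 u = f * f * (3.0 - 2.0 * f); // smoothstep fade curve
  return mix(mix(hash(i),                hash(i + vec2(1.0, 0.0)), u.x),
             mix(hash(i + vec2(0.0, 1.0)), hash(i + vec2(1.0, 1.0)), u.x),
             u.y);
}
```

The sine-hash is cheap but shows artifacts at large coordinates; swap in an integer hash if surfaces band or repeat.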
Bright extract → separable Gaussian blur → additive composite:
```glsl
vec3 bloom = texture(blurTex, uv).rgb;
col += bloom * (1.2 + audioE * 0.4);
```

UV displacement gated to timestamp windows from the choreography map:
```glsl
uv.x += step(0.99, sin(uv.y * 400.0 + t * 50.0)) * glitchAmt * 0.02;          // scanline tears
uv.x += (floor(sin(uv.y * 8.0 + t * 3.0) * 4.0) / 4.0) * glitchAmt * 0.03;    // block shifts
```

Fade out at the end of the track:

```glsl
float fade = smoothstep(endTime - fadeDuration, endTime, t);
col *= 1.0 - fade;
```

- Audio context requires user gesture — always gate on a click-to-start overlay
- Shader compilation errors are silent — check `gl.getShaderInfoLog()` and log it
- `smoothstep` needs margin — `smoothstep(10.0, 10.5, t)` not `step(10.0, t)` for clean cuts
- Phase overlaps are features — two phases at 50/50 during crossfade = natural transition
- Audio uniform fallback — if `analyser` is null, return `[0, 0, 0, 0]` so shaders still work
- Use `onended`, not timeouts — actual playback may differ from expected track length
- Test without audio — the multiplier pattern ensures visuals degrade gracefully
- Choreography map first, code second — changing timestamps in code without updating the map leads to drift between intent and implementation