Putting back the bass that was never recorded

The rig is a problem if you're not careful. Two TPA3255 amps, somewhere around 1200 watts into the main speakers, an ESS DAC feeding them, and a pair of powered 15-inch Velodyne subs that bring the total to roughly 1500 watts. Enough to feel in your sternum. The job was to prep a music set that would actually hit on this system without wrecking the tracks that didn't need help.

That last clause is the whole discipline. Shoving a smile curve into everything is easy. Leaving a good master alone is harder. So I gave myself the same rule I give myself for anything I touch: measure first, change only what the measurement justifies, revert where it doesn't. Good mastering isn't taste applied confidently. It's measurement applied honestly.

Getting the bits in the first place

Before any DSP there was a sourcing problem. Keeping this short because it's the least interesting part. The downloader I was using was silently capping output to 320 kbps AAC while reporting success. That's the sneakiest kind of bug: it doesn't fail, it just quietly hands you something worse than you asked for. The real lossless path came down as a DASH MPD manifest, so I had to parse the manifest, pull the right representation, and remux the fragmented MP4 stream into FLAC with ffmpeg.

remux.sh

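# The lossless representation arrives as fragmented MP4 inside a DASH manifest.
# -c:a flac is a lossless re-encode, not a stream copy: the audio is decoded
# and written back out as FLAC. If the stream inside the MP4 is already FLAC,
# -c:a copy would remux it with no re-encode at all.
ffmpeg -i "$FRAGMENTED_MP4" -c:a flac -compression_level 8 "out.flac"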
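The manifest step that comes before the ffmpeg call can be sketched in a few lines. This is a minimal illustration, not the script I ran: the function name `pick_audio_representation`, the `want_codec` parameter, and the sample manifest are all made up for the example, and a real MPD will carry more structure (segment templates, base URLs) than this handles.

```python
import xml.etree.ElementTree as ET

# Standard DASH MPD namespace.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def pick_audio_representation(mpd_xml, want_codec="flac"):
    """Return the highest-bandwidth Representation whose codecs string matches."""
    root = ET.fromstring(mpd_xml)
    best = None
    for aset in root.iterfind(".//mpd:AdaptationSet", NS):
        for rep in aset.iterfind("mpd:Representation", NS):
            # codecs may sit on the Representation or on its AdaptationSet.
            codecs = rep.get("codecs") or aset.get("codecs") or ""
            if want_codec not in codecs.lower():
                continue
            bandwidth = int(rep.get("bandwidth", "0"))
            if best is None or bandwidth > best["bandwidth"]:
                best = {"id": rep.get("id"), "codecs": codecs, "bandwidth": bandwidth}
    return best


# Hypothetical manifest: one lossy and one lossless representation.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation id="aac-320" codecs="mp4a.40.2" bandwidth="320000"/>
      <Representation id="flac-hi" codecs="flac" bandwidth="1411000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

print(pick_audio_representation(SAMPLE_MPD))
```

Selecting by codec string rather than by bandwidth alone is the point: it's the check that would have caught the silent 320 kbps cap.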

No credentials in this post. The interesting work is downstream of having real bits. Assume from here that every input is true lossless.

The deficit you have to prove before you fix it

A lot of music from the 1960s and 1970s is genuinely rolled off below 40 Hz. Cut for vinyl, mixed on speakers that couldn't reproduce that octave anyway, so there was no point putting energy there. On a normal system you'd never notice. On a pair of 15-inch subs it means the most capable part of the rig has nothing to play. The drivers just sit there on tracks that should be shaking the room.

The tempting move is to reach for a bass boost and call it done. The right move is to first ask whether the bass is actually missing, because a low-shelf on a track that already has full low end just makes it boomy and clipped.

So I measured the sub-40 Hz energy in every candidate before touching anything. "Rolled off at the source" and "I want more bass" are different claims. Only one of them is a fact.

sub_energy.py

import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt
 
def sub_band_db(path, hi=40.0):
    x, sr = sf.read(path, always_2d=True)
    mono = x.mean(axis=1)
    full_rms = np.sqrt(np.mean(mono**2)) + 1e-12
    # Energy below roughly 40 Hz, the octave the subs want. A steep zero-phase
    # Butterworth low-pass, not a brick wall, so a little leaks in from above.
    sos = butter(8, hi, btype="low", fs=sr, output="sos")
    sub = sosfiltfilt(sos, mono)
    sub_rms = np.sqrt(np.mean(sub**2)) + 1e-12
    # Ratio of sub-band energy to total, in dB. Very negative = no bass there.
    return 20.0 * np.log10(sub_rms / full_rms)

Tracks that came back deep in the negatives were real candidates. Tracks with substantial sub-band energy already were not. One of them I was sure needed help. It measured fine. The meter said leave it alone, so I left it alone. No treatment. That revert isn't a footnote, it's what makes the process a process instead of a preference wearing a lab coat.
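The treat-or-leave decision itself is small enough to write down. A sketch, assuming a cutoff of -30 dB for "deep in the negatives": that threshold, the function name `split_candidates`, and the track names and readings are all placeholders, not numbers from my set.

```python
def split_candidates(sub_db_by_track, threshold_db=-30.0):
    """Split tracks into proven sub-band deficits and tracks to leave alone."""
    treat, leave = [], []
    for track, sub_db in sorted(sub_db_by_track.items()):
        # Below the threshold: the octave is measurably missing.
        (treat if sub_db < threshold_db else leave).append(track)
    return treat, leave


# Hypothetical readings from sub_band_db(), in dB relative to full-band RMS.
measured = {"track_a.flac": -41.5, "track_b.flac": -12.3, "track_c.flac": -33.0}

treat, leave = split_candidates(measured)
print("synthesize:", treat)
print("leave alone:", leave)
```

Wherever the threshold ends up, it gets picked once, before listening, and applied to every track the same way.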

The discipline in one line

A bass boost on a track that lacks bass restores it. The same boost on a track that already has bass just clips and booms. You can't tell which case you're in by listening at high volume. You can tell by measuring the sub-band first.

Synthesizing the missing octave

Once a deficit's proven, the question is how to put energy into an octave that was never recorded. You can't EQ up what isn't there. Nothing below 40 Hz to amplify. You have to generate it from what the track does have, which is the bass an octave higher.

Five steps. Isolate the bass band so you're working only with the low frequencies. Pitch it down one octave with a time-preserving algorithm so the new content lands an octave lower without changing the song's length or tempo. Low-pass the pitched result at about 75 Hz to keep only the new sub content and not a muddy doubled midbass. Then gate it, so the synthesized sub only speaks on actual bass notes and stays silent on noise, hiss, and bleed between notes. Finally, blend it back under the original at a level the measurement supports.

The pitch shift is what matters most. ffmpeg's rubberband filter does it while preserving duration exactly.

synth_sub.sh

# 1. Isolate the bass band, 2. drop it an octave (duration preserved),
# 3. keep only true sub content, 4. gate so it speaks on notes not noise.
ffmpeg -i in.flac -af "
  lowpass=f=180,
  rubberband=pitch=0.5,
  lowpass=f=75,
  agate=threshold=0.02:ratio=4:attack=15:release=120
" sub_layer.flac
 
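# Blend the synthesized octave UNDER the original at a measured level.
# The two attenuations stack: volume=-7dB plus the 0.7 weight (about -3 dB)
# puts the sub layer roughly 10 dB under the original. amix also normalizes
# by default, so overall level drops here and is set later by the loudness stage.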
ffmpeg -i in.flac -i sub_layer.flac \
  -filter_complex "[1:a]volume=-7dB[s];[0:a][s]amix=inputs=2:weights=1 0.7" \
  out.flac
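The arithmetic behind the two low-pass corners is worth a quick check. A pitch scale of 0.5 is exactly twelve semitones down, so everything in the isolated band lands at half its frequency. The sketch below uses the open-string fundamentals of a four-string bass as sample inputs; the helper names are just for illustration.

```python
import math


def shifted_hz(freq_hz, pitch_scale=0.5):
    """Frequency after a pitch shift by the given scale factor."""
    return freq_hz * pitch_scale


def semitones(pitch_scale):
    """Pitch scale expressed in semitones (negative means down)."""
    return 12.0 * math.log2(pitch_scale)


# Open-string fundamentals of a four-string bass, in Hz.
OPEN_STRINGS = [("E1", 41.20), ("A1", 55.00), ("D2", 73.42), ("G2", 98.00)]

print(f"pitch=0.5 is {semitones(0.5):+.0f} semitones")
for name, hz in OPEN_STRINGS:
    print(f"{name}: {hz:.2f} Hz -> {shifted_hz(hz):.2f} Hz")
print(f"top of isolated band: 180 Hz -> {shifted_hz(180.0):.0f} Hz")
```

The shifted fundamentals all sit under the 75 Hz corner, while the top of the 180 Hz band lands at 90 Hz and gets rolled off by the second low-pass.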

The gate isn't optional. Without it, the pitched-down layer reproduces everything in the bass band including the spaces between notes, which on a 15-inch driver turns into a constant low rumble that smears the whole track. With the gate set to open only on real note onsets, the sub layer behaves like an instrument playing along with the bassline instead of a fog sitting under it.
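To make the gate's behavior concrete, here is a toy one in plain Python. It is not ffmpeg's agate algorithm, just a peak envelope follower driving a gain that ramps open and closed; the parameter names mirror the filter's for readability, and using `release_ms` for both the envelope decay and the closing ramp is a simplification of mine.

```python
import math


def gate(samples, sample_rate, threshold=0.02, attack_ms=15.0, release_ms=120.0):
    """Toy gate: opens when the peak envelope crosses threshold, closes after it falls."""
    decay = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    attack_step = 1.0 / (sample_rate * attack_ms / 1000.0)
    release_step = 1.0 / (sample_rate * release_ms / 1000.0)
    env, gain, out = 0.0, 0.0, []
    for x in samples:
        # Envelope jumps up to new peaks and decays slowly between them,
        # so zero crossings inside a note don't close the gate.
        env = max(abs(x), env * decay)
        if env >= threshold:
            gain = min(1.0, gain + attack_step)
        else:
            gain = max(0.0, gain - release_step)
        out.append(x * gain)
    return out


# Synthetic test signal: low-level noise, one 50 Hz "note", then noise again.
sr = 1000
noise = [0.005 if i % 2 == 0 else -0.005 for i in range(500)]
note = [0.5 * math.sin(2 * math.pi * 50 * i / sr) for i in range(500)]
signal = noise + note + noise + noise
gated = gate(signal, sr)

print("noise before note passes:", any(v != 0.0 for v in gated[:500]))
print("note passes at full level:", gated[900] == signal[900])
print("noise after release passes:", any(v != 0.0 for v in gated[-100:]))
```

The noise floor sits under the threshold and never opens the gate; the note opens it within the attack time and it closes again once the envelope has decayed.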

The chain-order bug, or why DSP is not commutative

It's my favorite kind of bug, because the symptom lies about the cause.

I built the synthesis, measured a clean deficit, blended the new octave in. The boost wasn't in the output. Re-ran it. Still gone. Sub-band measurement on the rendered file looked almost identical to the untreated input, as if the synthesis stage had done nothing. Every individual stage worked fine in isolation. Assembled into the full chain, the bass evaporated.

The cause was order. Bass synthesis was running before an RMS glue compressor sitting later in the mastering chain. The compressor was doing exactly its job: it saw the new low-frequency energy, decided the track had gotten louder, and pulled it back down. I'd been generating sub-bass and feeding it straight into a stage whose entire purpose is to remove energy that stands out. It squashed out precisely what I'd just added.

The chain, before and after

BROKEN:  isolate -> pitch -> gate -> blend -> [RMS glue comp] -> limiter
                                       ^ boost added here, then eaten here ^
 
FIXED:   isolate -> pitch -> gate -> [RMS glue comp] -> blend -> limiter
                                                          ^ boost survives ^

Moving the blend after the glue compressor fixed it completely. The compressor sets the dynamics of the original track, and the synthesized sub gets laid in underneath afterward, where nothing downstream is trying to flatten it. DSP is not commutative once a nonlinear stage like a compressor is in the chain: the same stages in a different order are a different processor, and the meter is the only thing that tells you which order you actually built.
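The order dependence shows up even in a toy model. The sketch below is not my mastering chain: `glue_comp` is a made-up static RMS compressor with no attack, release, or makeup gain, and the tones, threshold, and ratio are arbitrary. It only shows that blending before the compressor and blending after it leave different amounts of the added 30 Hz tone in the output.

```python
import math


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


def glue_comp(x, threshold=0.3, ratio=4.0):
    """Toy RMS compressor: one static gain for the whole block."""
    level = rms(x)
    if level <= threshold:
        return list(x)
    target = threshold * (level / threshold) ** (1.0 / ratio)
    gain = target / level
    return [v * gain for v in x]


def tone(freq_hz, amp, sr=8000, seconds=1.0):
    n = int(sr * seconds)
    return [amp * math.sin(2 * math.pi * freq_hz * i / sr) for i in range(n)]


def tone_amplitude(x, freq_hz, sr=8000):
    """Amplitude of one sine component, by correlation over whole cycles."""
    acc = sum(v * math.sin(2 * math.pi * freq_hz * i / sr) for i, v in enumerate(x))
    return 2.0 * acc / len(x)


original = tone(200.0, 0.5)   # stands in for the track
sub = tone(30.0, 0.3)         # stands in for the synthesized octave

blend_then_comp = glue_comp([a + b for a, b in zip(original, sub)])
comp_then_blend = [a + b for a, b in zip(glue_comp(original), sub)]

sub_broken = tone_amplitude(blend_then_comp, 30.0)
sub_fixed = tone_amplitude(comp_then_blend, 30.0)
print(f"sub amplitude, blend -> comp: {sub_broken:.3f}")
print(f"sub amplitude, comp -> blend: {sub_fixed:.3f}")
```

With these settings the loss is modest. A real glue compressor with program-dependent timing can take far more, which is why the rendered file is what gets measured, not the toy.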

Loudness forensics

The source masters weren't consistent with each other. About an 8 dB spread in integrated loudness, measured per EBU R128, which means on a fixed volume knob some tracks arrive timid and others arrive shouting. Worse, several were clipping at the inter-sample level, with true peaks as high as +2.29 dBTP. Sample peaks under 0 dBFS, true peaks well over. That's the classic signature of a master pushed into a limiter and never checked on a true-peak meter.

I measured both, integrated LUFS and true peak, on every track before deciding anything.

loudness_report.py

import subprocess, json, re
 
def measure(path):
    # ffmpeg's loudnorm in measurement mode reports R128 + true peak.
    cmd = ["ffmpeg", "-i", path, "-af",
           "loudnorm=I=-14:TP=-1.0:LRA=11:print_format=json",
           "-f", "null", "-"]
    out = subprocess.run(cmd, capture_output=True, text=True).stderr
    blob = re.search(r"\{.*\}", out, re.S).group(0)
    m = json.loads(blob)
    return float(m["input_i"]), float(m["input_tp"])  # LUFS, dBTP
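Once every track has a pair of numbers, the spread and the list of inter-sample overs fall out of a few lines. A sketch, assuming the measurements are collected into a dict of `(integrated LUFS, true peak dBTP)` tuples; the function name `summarize` and the sample values are hypothetical, apart from the +2.29 dBTP figure.

```python
def summarize(measurements, tp_ceiling=0.0):
    """Return (loudness spread in dB, sorted tracks whose true peak exceeds the ceiling)."""
    loudness = [lufs for lufs, _ in measurements.values()]
    spread = max(loudness) - min(loudness)
    overs = sorted(t for t, (_, tp) in measurements.items() if tp > tp_ceiling)
    return spread, overs


# Hypothetical results in the shape measure() returns.
measurements = {
    "a.flac": (-16.5, -0.8),
    "b.flac": (-8.5, 2.29),
    "c.flac": (-12.0, -1.5),
}

spread, overs = summarize(measurements)
print(f"integrated loudness spread: {spread:.1f} dB")
print("true peak over 0 dBTP:", overs)
```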

Rather than force one loudness target on everything, I rendered three sets for three rooms.

Set               Target     Purpose
Lossless archive  untouched  full dynamics, the canonical copy
Dinner set        -14 LUFS   conversation-friendly, common streaming target
Dance set         -11 LUFS   hotter, for when the rig earns its watts

The archive keeps the full dynamic range and the synthesized bass, no loudness normalization, because it's the master copy everything else descends from. The dinner set is normalized near -14 LUFS so it sits under conversation without competing with it. The dance set runs hotter at -11 LUFS for when the room's committed. Every set was true-peak limited to keep the inter-sample overs off the DAC, and the final bit-depth reduction used shibata noise-shaped dither so the quantization noise gets pushed up where the ear is least sensitive instead of sitting flat across the band.

Three targets, one measured source, no guessing about which one a given evening calls for.
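The per-set arithmetic is simple enough to check by hand. A sketch under a loud assumption: it treats normalization as one static gain per track, which is a simplification and not necessarily what a dynamic normalizer does. `plan_gain` is a made-up helper, the two example masters are hypothetical, and the -1.0 dBTP ceiling is borrowed from the TP setting in the measurement command.

```python
def plan_gain(input_lufs, input_tp, target_lufs, tp_ceiling=-1.0):
    """Static gain to reach the target, and whether the limiter has work to do."""
    gain_db = target_lufs - input_lufs
    projected_tp = input_tp + gain_db
    return gain_db, projected_tp, projected_tp > tp_ceiling


# Hypothetical measurements: (integrated LUFS, true peak dBTP).
hot_master = (-9.0, 2.29)
quiet_master = (-17.0, -3.0)

for name, (lufs, tp) in [("hot", hot_master), ("quiet", quiet_master)]:
    for set_name, target in [("dinner", -14.0), ("dance", -11.0)]:
        gain, peak, limit = plan_gain(lufs, tp, target)
        print(f"{name:5s} {set_name:6s} gain {gain:+.1f} dB, "
              f"projected true peak {peak:+.2f} dBTP, limiter needed: {limit}")
```

A hot master turned down for the dinner set can clear the ceiling on gain alone, while a quiet master turned up for the dance set cannot, which is why the true-peak limiter stays in every render.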

Measurement, not ears

I put an octave of bass into songs that were recorded before anyone could hear it, and left the songs that didn't need it exactly as they were. The difference between those two outcomes wasn't my ears. It was the sub-band measurement that told me which case I was in, the chain order that decided whether the boost survived, and the loudness meter that disagreed with what I wanted often enough to keep me honest.
