AudioUtils

Audio Compression Explained: File Size vs Dynamic Range

Audio compression means two completely different things. One shrinks file sizes (MP3, AAC, FLAC). The other tames loud and quiet parts in a mix. Here is the honest difference, with codec specs, dB numbers, and which tool you actually need.

If you searched "audio compression explained," you may have ended up in the wrong rabbit hole. The phrase covers two completely different technologies that share nothing except a name. One is a data-encoding problem solved by codec engineers. The other is a mixing effect controlled by knobs on a plugin. Confusing the two wastes hours, so this guide draws the line first and then explains each side honestly.

The Two Meanings of "Audio Compression"

File-size compression (also called data compression or audio data compression) is the process of encoding an audio signal so it occupies less storage or bandwidth. MP3, AAC, Opus, FLAC, and ALAC are all audio data compressors. The output is a smaller file that, when decoded, plays back as audio. This is what you do when a podcast is too big to email, when a music library is too large for a phone, or when a website needs to stream music to thousands of listeners simultaneously.

Dynamic-range compression (DRC) is a mixing and mastering effect that reduces the loudness difference between the loudest and quietest parts of an audio signal. A compressor plugin in a DAW (Audacity, Logic, Pro Tools, Reaper, Ableton) attenuates audio above a threshold according to a ratio. The output is the same length, the same sample rate, the same file size — but the dynamics are squashed so vocals sit consistently in a mix or a podcast does not yo-yo between whispers and shouts.

The two share the word "compression" because both reduce something. File-size compression reduces bytes. Dynamic-range compression reduces decibels. Beyond that, they have nothing in common.

Quick Decision: Which One Do You Need?

| Goal | You need | Right tool |
|---|---|---|
| Shrink an MP3 to email it | File-size compression | /audio-compressor |
| Make a podcast voice sound consistent | Dynamic-range compression | DAW (Audacity, Logic, Reaper) |
| Save disk space on a music library | File-size compression (lossy or lossless) | /wav-to-mp3 or /wav-to-flac |
| Stop loud drums from clipping the master | Dynamic-range compression | DAW |
| Cut a 100 MB WAV down to 8 MB | File-size compression | /compress-wav |
| Even out a singer who keeps drifting back from the mic | Dynamic-range compression | DAW |

If your problem is "this file is too big," everything below the next heading is what you want. If your problem is "loud parts are too loud," skip ahead to the Dynamic-Range Compression section.

Part 1: File-Size Compression

The audio data on a CD is stored as PCM (Pulse-Code Modulation): uncompressed digital samples, 44,100 per second per channel, each 16 bits wide, across two stereo channels. That works out to 1,411,200 bits per second, or 10.6 MB per minute, or 635 MB per hour. A 4-minute song is 42 MB raw. That is fine for a disc in your hand, terrible for streaming a playlist over a 3G connection in 2008.
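
The arithmetic above can be reproduced in a few lines. This is a minimal sketch; the function names are made up for illustration, and it assumes stereo (two channels) and decimal megabytes (1 MB = 1,000,000 bytes).

```python
def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Raw PCM data rate: samples per second x bits per sample x channels."""
    return sample_rate_hz * bit_depth * channels


def pcm_megabytes(seconds: float, sample_rate_hz: int = 44_100,
                  bit_depth: int = 16, channels: int = 2) -> float:
    """Uncompressed size in decimal megabytes (assumption: 1 MB = 1,000,000 bytes)."""
    bits = pcm_bits_per_second(sample_rate_hz, bit_depth, channels) * seconds
    return bits / 8 / 1_000_000


cd_rate = pcm_bits_per_second(44_100, 16, 2)
print(cd_rate)                        # 1411200
print(round(pcm_megabytes(60), 1))    # 10.6  (one minute)
print(round(pcm_megabytes(3600)))     # 635   (one hour)
print(round(pcm_megabytes(240), 1))   # 42.3  (a 4-minute song)
```

Changing `channels` to 1 halves every figure, which is why mono voice recordings are so much smaller before any codec is involved.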

Compression codecs solve this. They fall into two camps.

Lossy Compression — Discards What You Cannot Hear

A lossy codec analyzes the signal, decides which parts of it human ears probably will not notice, and discards that data. The decoder cannot reconstruct the original — the discarded samples are gone forever — but the perceived sound is close enough that most listeners cannot tell the difference at typical bitrates.

The science behind this is psychoacoustics, the study of how the ear and brain process sound. Lossy codecs exploit several phenomena:

  • Frequency masking: a loud tone at 1 kHz hides a quieter tone nearby (say, at 1.1 kHz) for the duration it sounds. The codec drops the masked tone.
  • Temporal masking: a loud transient briefly raises the hearing threshold a few milliseconds before and ~100 ms after it. Quieter sounds in that window are dropped.
  • Absolute threshold of hearing: signals below the audibility floor at any given frequency are removed outright.
  • Stereo redundancy: most low frequencies in stereo recordings are nearly identical between channels. Joint-stereo modes encode the shared content once.

The encoder applies this model to each short frame of audio, then quantizes (rounds) the remaining frequency-domain coefficients more aggressively in regions where small errors will be inaudible. The output bitrate is the budget: fewer bits per second means harsher quantization and, eventually, audible artifacts.
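
To make "harsher quantization" concrete, here is a toy sketch of uniform quantization. The coefficient values and step sizes are invented for illustration; a real codec picks step sizes per frequency band from its psychoacoustic model rather than using one fixed step.

```python
def quantize(coeffs, step):
    """Uniform quantization: snap each coefficient to the nearest multiple of step."""
    return [round(c / step) * step for c in coeffs]


def max_error(original, approx):
    """Largest absolute difference between the original and quantized values."""
    return max(abs(a - b) for a, b in zip(original, approx))


# Made-up frequency-domain coefficients, loudest first.
coeffs = [0.91, -0.42, 0.13, 0.07, -0.03, 0.01]

fine = quantize(coeffs, step=0.01)     # many levels: more bits, tiny error
coarse = quantize(coeffs, step=0.25)   # few levels: fewer bits, larger error

print(coarse)                      # [1.0, -0.5, 0.25, 0.0, 0.0, 0.0]
print(max_error(coeffs, coarse))   # never more than step / 2 = 0.125
```

With the coarse step, the three smallest coefficients collapse to zero and cost almost nothing to store. That is the trade a low bitrate forces: fewer distinct values to encode, larger rounding error.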

#### Major lossy codecs

| Codec | Year | Typical bitrate | Strength |
|---|---|---|---|
| MP3 (MPEG-1 Layer III) | 1993 | 128–320 kbps | Universal compatibility, mature encoders (LAME) |
| AAC (Advanced Audio Coding) | 1997 | 96–256 kbps | Better than MP3 at the same bitrate; default on iPhone, YouTube, Netflix |
| Vorbis | 2000 | 96–256 kbps | Open-source, used by Spotify and video games |
| Opus | 2012 | 32–256 kbps | Best low-bitrate codec; used by WhatsApp, Discord, Zoom |

For a deeper look at MP3 specifically, see what is MP3. For the quality trade-offs at different bitrates, see audio bitrate explained.

#### Transparency thresholds

A bitrate is transparent when listeners cannot reliably tell the compressed file from the original in an ABX test (a blind A/B comparison). Real-world transparency thresholds for music on decent gear:

  • MP3 (LAME): ~192 kbps VBR (preset V2) for most listeners; ~256 kbps for trained ears
  • AAC: ~128 kbps for most listeners; ~192 kbps for trained ears
  • Opus: ~96 kbps for most listeners; ~128 kbps for trained ears
  • Vorbis: ~160 kbps for most listeners

Below transparency you start hearing artifacts: pre-echo on transients (drums, plucks), warbling on cymbals, smeared stereo image, hollow-sounding vocals. Above it, more bitrate is wasted bytes.
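
"Cannot reliably tell" has a statistical meaning in an ABX test: the listener's score must be unlikely to come from guessing. The sketch below computes that probability with a one-sided binomial test. The function name and the 16-trial example are illustrative assumptions, not part of any standard ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by coin-flip guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


print(round(abx_p_value(12, 16), 3))   # 0.038 -> unlikely to be pure guessing
print(round(abx_p_value(9, 16), 3))    # 0.402 -> consistent with guessing
```

A common convention is to treat a p-value below 0.05 as evidence that the listener really hears a difference. By that convention, 12 of 16 correct suggests the bitrate is not transparent for that listener, while 9 of 16 does not.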

Lossless Compression — Bit-Perfect Reduction

A lossless codec compresses audio data with no loss whatsoever. Decode the file and you get the exact same PCM samples that went in — every sample, bit for bit. This works the same way ZIP works on text: identifying redundancy and encoding it more efficiently. A typical music track compresses to 50–60% of WAV size — a 50 MB WAV becomes a 25–30 MB FLAC.

Lossless codecs use linear prediction (the next sample is mostly predictable from previous samples; encode only the difference) plus entropy coding (Rice or Golomb coding for the residuals).
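
Both ideas fit in a short sketch. This is a deliberately simplified, assumed design: a first-order predictor ("the next sample equals the previous one") and a single fixed Rice parameter `k`. Real codecs such as FLAC use higher-order predictors and adapt the parameter as they go. The sample values are made up.

```python
def zigzag(n: int) -> int:
    """Map signed residuals to unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u: int) -> int:
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(values, k: int) -> str:
    """Rice code with k >= 1: quotient in unary, then k remainder bits."""
    out = []
    for v in values:
        u = zigzag(v)
        q, r = u >> k, u & ((1 << k) - 1)
        out.append("1" * q + "0" + format(r, f"0{k}b"))
    return "".join(out)


def rice_decode(bits: str, k: int, count: int):
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":   # count the unary quotient
            q += 1
            pos += 1
        pos += 1                  # skip the terminating 0
        r = int(bits[pos:pos + k], 2)
        pos += k
        values.append(unzigzag((q << k) | r))
    return values


def encode(samples, k: int = 2):
    """Keep the first sample verbatim; Rice-code the sample-to-sample differences."""
    residuals = [b - a for a, b in zip(samples, samples[1:])]
    return samples[0], rice_encode(residuals, k)


def decode(first: int, bits: str, k: int, count: int):
    out = [first]
    for r in rice_decode(bits, k, count - 1):
        out.append(out[-1] + r)
    return out


samples = [1000, 1003, 1007, 1006, 1002, 999, 1001, 1004]
first, bits = encode(samples, k=2)
print(len(bits))                                       # 28 bits for 7 residuals
print(decode(first, bits, 2, len(samples)) == samples) # True: bit-perfect
```

Eight 16-bit samples take 128 bits raw. Here they take 16 bits for the first sample plus 28 for the residuals, and decoding returns exactly the input. Noisy, unpredictable audio produces larger residuals and compresses less.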

#### Major lossless codecs

| Codec | Year | Compressed size | Notes |
|---|---|---|---|
| FLAC | 2001 | ~50–60% of WAV | Open-source, the de facto standard for archival |
| ALAC (Apple Lossless) | 2004 | ~50–60% of WAV | Native on Apple devices, technically similar to FLAC |
| WavPack | 1998 | ~55–65% of WAV | Lossless plus optional hybrid mode |
| Monkey's Audio (APE) | 2000 | ~50% of WAV | Slightly tighter than FLAC, slower decode, Windows-centric |

If you have WAV masters and want to keep every bit while saving half the disk space, FLAC is the standard answer. See what is FLAC and lossless vs lossy for the long-form trade-offs.

Bitrate Concepts

Bitrate is the number of bits the encoder allocates per second of audio. It comes in three flavors:

  • CBR (Constant Bitrate): every second gets the same byte budget. Predictable file sizes, but wasteful — silent passages get the same allocation as dense passages.
  • VBR (Variable Bitrate): the encoder spends more bits on complex passages and fewer on simple ones. Better quality per byte. The downside: file size is unpredictable until encoding finishes, and seeking in old players can misbehave.
  • ABR (Average Bitrate): a VBR variant that targets a long-run average. A compromise.

For most modern listening, VBR is the right default. LAME's V2 preset (MP3 VBR averaging ~190 kbps) is a sensible target for music. See VBR vs CBR MP3 for the deeper trade-off.
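
Bitrate converts directly into file size. The helper below is a rough estimator with a made-up name: it is exact for the audio payload of a CBR file, only an estimate for VBR or ABR (use the average bitrate), and it ignores container and tag overhead.

```python
def encoded_megabytes(bitrate_kbps: float, seconds: float) -> float:
    """Approximate audio payload in decimal MB (1 kbps = 1,000 bits per second)."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


print(round(encoded_megabytes(96, 3600), 1))   # 43.2 -> one hour at 96 kbps
print(round(encoded_megabytes(190, 240), 1))   # 5.7  -> 4-minute song at ~190 kbps
print(round(encoded_megabytes(320, 240), 1))   # 9.6  -> same song at 320 kbps
```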

When File-Size Compression Is Worth It

  • Sending a recording over email, Slack, Discord, or WhatsApp (most enforce 25–100 MB limits)
  • Storing a music library on a phone (a 1,000-song library of 4-minute tracks is about 10 GB at 320 kbps MP3 vs about 42 GB as WAV)
  • Streaming audio over the open internet (Spotify ships at 96–320 kbps Vorbis/AAC for a reason)
  • Hosting a podcast (an hour of voice at 96 kbps mono MP3 is ~43 MB; as CD-quality stereo WAV it would be ~635 MB)

If any of those sound like your problem, drop a file into /audio-compressor or pick the format-specific tool (/compress-mp3, /compress-wav, /compress-m4a, /compress-ogg, /compress-flac).

Part 2: Dynamic-Range Compression {#dynamic-range-compression}

Now the other meaning. Open Audacity, Logic, Reaper, or any DAW, search the effect list, and you will find an "audio compressor" plugin. This compressor does not shrink file size. It changes how loud different parts of the audio are relative to each other.

What a Compressor Does

A dynamic-range compressor monitors the input level. When the input crosses a threshold (a level you set, in dBFS), the compressor reduces the gain by an amount controlled by the ratio. A 4:1 ratio means: for every 4 dB the input goes above threshold, the output only goes 1 dB above threshold. The loud parts get pulled down, but the quiet parts pass through unchanged. After compression, you typically apply makeup gain to bring the whole track up to the level you want.
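
The threshold-and-ratio rule is simple enough to write down directly. This sketch covers only the static, hard-knee curve with optional makeup gain, and it ignores timing entirely. The function name and the example levels are illustrative assumptions.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float,
                        makeup_db: float = 0.0) -> float:
    """Static hard-knee curve: levels above threshold rise at 1/ratio the input rate."""
    if input_db <= threshold_db:
        output_db = input_db                                   # below threshold: unchanged
    else:
        output_db = threshold_db + (input_db - threshold_db) / ratio
    return output_db + makeup_db


# Threshold -20 dBFS, ratio 4:1.
print(compressed_level_db(-12.0, -20.0, 4.0))        # -18.0 (8 dB over becomes 2 dB over)
print(compressed_level_db(-30.0, -20.0, 4.0))        # -30.0 (quiet part untouched)
print(compressed_level_db(-12.0, -20.0, 4.0, 6.0))   # -12.0 (6 dB makeup restores the peak)
```

In the last call the loud peak ends up where it started, but anything below the threshold would now sit 6 dB higher. That is how compression plus makeup gain narrows the gap between quiet and loud.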

The Five Knobs

Most compressors, analog or digital, vintage or modern, are built around the same five parameters, even when the front panel hides some of them:

1. Threshold (dBFS): the level at which compression begins. -20 dBFS means anything quieter than that passes through untouched.
2. Ratio (X:1): how aggressively the compressor pulls down audio above threshold. 2:1 is gentle. 4:1 is moderate. 10:1 is heavy. 20:1+ is limiting.
3. Attack (ms): how fast the compressor reacts after the signal crosses threshold. Fast attack (~1 ms) catches transients. Slow attack (~30–100 ms) lets initial peaks through and only clamps the sustain.
4. Release (ms): how fast the compressor lets go after the signal drops back below threshold. Fast release (~50 ms) preserves dynamics. Slow release (~500 ms) keeps the compression active longer for a smoother sound.
5. Makeup gain (dB): the post-compression gain stage that compensates for the level loss.
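
The five parameters can be wired together in a small feed-forward compressor. This is one possible design under stated assumptions, not how any particular plugin works: it uses per-sample peak detection, a hard knee, and one-pole smoothing in which the attack and release times act as time constants (roughly the time to cover 63% of a gain change). Real plugins often add RMS detection, soft knees, or lookahead.

```python
import math


def compress(samples, sample_rate, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=80.0, makeup_db=0.0):
    """Apply gain reduction to float samples in the range -1.0..1.0."""
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    reduction_db = 0.0                      # smoothed gain reduction, always >= 0
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        over_db = max(level_db - threshold_db, 0.0)
        target_db = over_db * (1.0 - 1.0 / ratio)   # what the static curve asks for
        # Clamp down at the attack speed, let go at the release speed.
        coeff = attack if target_db > reduction_db else release
        reduction_db = coeff * reduction_db + (1.0 - coeff) * target_db
        gain = 10.0 ** ((makeup_db - reduction_db) / 20.0)
        out.append(x * gain)
    return out


# One second of a constant 0.5 (about -6 dBFS) signal at 8 kHz, well above threshold.
loud = compress([0.5] * 8000, sample_rate=8000)
print(round(loud[0], 3), round(loud[-1], 3))   # the gain settles lower as the attack completes

# A quiet signal (-40 dBFS) never crosses the threshold and passes through unchanged.
quiet = compress([0.01] * 100, sample_rate=8000)
print(quiet == [0.01] * 100)                   # True
```

The first loud sample is barely reduced because the attack has not had time to act. By the end, the level has settled at the value the 4:1 ratio predicts.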

Common Dynamic-Range Compression Use Cases

  • Vocals: 3:1 ratio, -18 dBFS threshold, 5 ms attack, 80 ms release. Catches the loud syllables, evens out delivery.
  • Drum bus: 4:1 ratio, slow attack (30 ms) to let snare crack through, then clamp the body.
  • Podcast voice: 4:1, -20 dBFS, 10 ms attack, 100 ms release, then a limiter on the master to prevent peaks above -1 dBTP.
  • Master bus glue: 1.5:1 to 2:1, -10 dBFS threshold, slow attack, slow release. Almost invisible, just nudges everything together.

Why DRC Exists

Recorded sound has huge dynamic range. A whisper next to a snare hit can span 40 dB. Real listening environments — earbuds in a noisy subway, car stereo over engine noise — cannot reproduce that range cleanly. DRC tames the peaks so the average level can sit higher, making everything audible without redlining the speakers on the loud bits.

It is also the reason modern pop sounds the way it does. The "loudness war" of the 2000s pushed mastering compression so hard that quiet detail disappeared entirely. Streaming-era loudness normalization (Spotify's -14 LUFS target, Apple's -16 LUFS) has eased that pressure — over-compressed masters get turned down by the platform anyway.
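
The "turned down anyway" effect is simple arithmetic. The sketch below is a simplified model that assumes the platform applies one static gain equal to the gap between the target and the measured loudness. Real services add their own rules, for example when raising quiet tracks. The -8 LUFS master is a hypothetical example.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Gain (dB) a platform would apply to bring a track to its loudness target."""
    return target_lufs - measured_lufs


print(normalization_gain_db(-8.0))          # -6.0 -> a loud master is turned down 6 dB
print(normalization_gain_db(-8.0, -16.0))   # -8.0 -> and 8 dB on a -16 LUFS target
```

After normalization the heavily compressed master plays back no louder than a dynamic one, but it has still lost its quiet detail.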

DRC Is Not in AudioUtils' Scope

Dynamic-range compression is a creative production task. It needs ears, monitoring, and iteration — drag the threshold, listen, drag the ratio, listen, undo, try again. That workflow lives in a DAW, not in a single-purpose web utility. AudioUtils handles file-size compression and format conversion. For DRC, install Audacity (free, beginner-friendly), Reaper (cheap, deep), or Logic / Ableton / Pro Tools if you do this professionally.

Why People Confuse the Two

Three reasons:

1. The English word "compress" means "make smaller" in both contexts (bytes in one, decibels in the other), and most people only learn one of the two meanings.
2. Some software offers both in the same application without drawing the distinction. Audacity has "Effect → Compressor" (DRC) and "File → Export Audio" with bitrate selection (file size). Both end up described as "compressing" the audio.
3. Marketing copy on every audio tool, including, historically, ours, uses "compression" without disambiguation.

Now that you know the difference, you can read product pages and forum threads more accurately.

Honest Verdict

If your search led here because a file is too large, what you want is an audio file compressor. The fastest path: /audio-compressor for any format, or the format-specific tools at /compress-mp3 and /compress-wav. Pick a target bitrate — 192 kbps for music, 96 kbps for voice — and the file shrinks to a fraction of its original size in seconds, in your browser.

If your search led here because a recording sounds uneven, what you want is a dynamic-range compressor plugin in a DAW. Audacity is free and good enough for podcasts. Logic, Reaper, and Pro Tools are the next step for music production.

For the deeper format-by-format trade-offs, see lossless vs lossy, what is MP3, what is AAC, and audio bitrate explained.