What Is Audio Latency? Definition and Guide
Latency is the delay between when audio enters a system and when you hear it. A few milliseconds go unnoticed; tens of milliseconds break live performance. Understanding where latency comes from, and how to reduce it, is essential for recording, streaming, and real-time audio work.
What Causes Audio Latency
Multiple factors contribute to audio latency. Buffer size in your audio interface and DAW is the largest contributor: larger buffers process more samples at once, reducing CPU load but increasing delay. A buffer of 256 samples at 44.1 kHz adds about 5.8 ms of latency; at 1024 samples, that grows to 23.2 ms, which is noticeable when singing or playing an instrument while monitoring through headphones. Sample rate matters too: 96 kHz processes samples twice as fast as 48 kHz, so the same buffer size produces half the latency in milliseconds. Driver architecture also plays a role: on Windows, ASIO drivers deliver much lower latency than the default MME or shared-mode WASAPI paths.
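The arithmetic is simple enough to check yourself. A minimal sketch in Python, using the figures from this section:

```python
# One-way latency contributed by a single audio buffer.
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """One buffer's worth of delay: samples divided by rate, in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000

print(buffer_latency_ms(256, 44_100))   # ~5.8 ms
print(buffer_latency_ms(1024, 44_100))  # ~23.2 ms
# Doubling the sample rate halves the latency of the same buffer size:
print(buffer_latency_ms(256, 96_000))   # ~2.7 ms
```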
Perceptible Latency Thresholds
Research published through the Audio Engineering Society places the threshold of perceptible latency at around 10-12 ms for most performers monitoring their own playing. Below 10 ms, monitoring feels immediate. At 15-20 ms, singers and instrumentalists notice a slight echo effect. Above 25 ms, the delay becomes genuinely disruptive and makes it hard to stay in time. For music production monitoring, aim for total round-trip latency under 10 ms. For video conferencing and internet audio, latency up to 150-200 ms is generally acceptable. For live performance and remote collaboration, under 50 ms one-way is a practical target.
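Round-trip latency is what you actually hear when monitoring: the input buffer, the output buffer, and the AD/DA converters all add up. A rough sketch, assuming one buffer in each direction and a ballpark 1 ms per converter stage (real figures vary by interface):

```python
# Rough round-trip estimate: one buffer each way plus converter delay.
# The 1 ms per converter stage is an assumed ballpark, not a measured value.
def round_trip_ms(buffer_samples: int, sample_rate_hz: int,
                  converter_ms: float = 1.0) -> float:
    one_way = buffer_samples / sample_rate_hz * 1000
    return 2 * one_way + 2 * converter_ms

print(round_trip_ms(128, 48_000))  # ~7.3 ms: within the sub-10 ms target
print(round_trip_ms(512, 48_000))  # ~23.3 ms: audible as a slight echo
```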
Latency in Streaming and Codecs
Audio codecs introduce their own latency through algorithmic delay: the time required to encode and decode a frame of audio. MP3 has an inherent algorithmic delay of 1,105 samples (about 25 ms at 44.1 kHz), which long caused problems for gapless playback and precise synchronization. AAC has a comparable delay. Opus was specifically designed for low latency: its default configuration has 26.5 ms of algorithmic delay, and its restricted low-delay CELT mode brings that down to as little as 5 ms. This is why Opus dominates real-time communication, and why WebRTC mandates Opus (alongside G.711) as a mandatory-to-implement audio codec.
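To put a sample count into time, divide by the sample rate. A small sketch using the MP3 figure above:

```python
# Codec algorithmic delay, converted from samples to milliseconds.
def algorithmic_delay_ms(delay_samples: int, sample_rate_hz: int) -> float:
    return delay_samples / sample_rate_hz * 1000

print(algorithmic_delay_ms(1105, 44_100))  # MP3: ~25.1 ms
# Opus specifies its delay directly in time, independent of sample rate:
# 5 ms in restricted low-delay mode, 26.5 ms in the default configuration.
```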
Reducing Latency in Your Setup
On Windows, use an audio interface with a dedicated ASIO driver; interfaces from Roland, Focusrite, and MOTU ship with well-optimized ones. On macOS, the Core Audio framework provides low-latency access with buffer sizes as small as 32 samples. On Linux, the JACK Audio Connection Kit with a real-time kernel achieves very low latency. In your DAW, reduce the buffer size when recording live instruments or vocals, and increase it when mixing and processing, where CPU headroom matters more than latency. Close unnecessary background applications; they compete for CPU time and cause audio dropouts.
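If you want to see what your system reports before opening a DAW, the python-sounddevice package (a PortAudio wrapper) exposes each device's default latency figures. A minimal sketch; note these are PortAudio's advertised defaults, not measured round-trip values:

```python
# List every audio device with its host API and reported low-latency defaults.
# Requires: pip install sounddevice
import sounddevice as sd

host_apis = sd.query_hostapis()  # e.g. MME, WASAPI, ASIO, Core Audio, JACK
for device in sd.query_devices():
    api = host_apis[device["hostapi"]]["name"]
    in_ms = device["default_low_input_latency"] * 1000
    out_ms = device["default_low_output_latency"] * 1000
    print(f"{device['name']} [{api}]: input ~{in_ms:.1f} ms, output ~{out_ms:.1f} ms")
```

On Windows, comparing the same interface across its MME, WASAPI, and ASIO entries makes the driver-architecture gap obvious.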
Latency and File Conversion
File format conversion is not a real-time process, so codec latency does not affect converted files the way it affects live monitoring. However, algorithmic delay can shift the audio so it starts slightly later than expected in the output file. FFmpeg and AudioUtils handle this correctly by trimming the initial padding the encoder inserts. If you notice that converted files start a fraction of a second late compared to the source, the converter may not be compensating for encoder delay. This is most noticeable when converting short clips or audio that begins immediately at 0:00 with no silence before the content.
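One way to check whether a converter compensates for encoder delay is to compare durations before and after conversion. A sketch using ffprobe (shipped with FFmpeg); source.wav and converted.mp3 are placeholder filenames:

```python
# Compare source and converted durations to spot uncompensated encoder delay.
# Assumes ffprobe is on PATH; the filenames below are placeholders.
import subprocess

def duration_seconds(path: str) -> float:
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())

drift = duration_seconds("converted.mp3") - duration_seconds("source.wav")
print(f"Length difference: {drift * 1000:+.1f} ms")
# A difference near +25 ms for MP3 suggests the codec delay was not trimmed.
```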