Every podcast, music stream, video call, and Bluetooth playback you hear today passes through an audio codec. Without one, a three-minute song would weigh in at roughly 30 MB, and a single hour of stereo audio would chew through more than 600 MB of bandwidth. So what is an audio codec, and why does it matter to anyone building or shipping a streaming product?
In short, an audio codec is the algorithm (or chip) that compresses raw audio for storage and transmission, then decompresses it for playback. It’s the difference between a 30 MB WAV file and a 3 MB MP3 that sounds nearly identical to most listeners.
This guide walks through the definition, encoding pipeline, lossy vs lossless tradeoffs, the most common codecs in use today (AAC, MP3, Opus, FLAC, ALAC, AC-3, LDAC, SBC, and more), how Bluetooth and web browsers handle them, and how to pick the right codec for music, voice, podcasts, or live broadcasts. By the end, you’ll know which codec to plug into your next streaming pipeline and why.
What Is an Audio Codec?
An audio codec is a piece of software or hardware that encodes raw audio data into a compressed digital stream, then decodes that stream back into a playable signal. The word itself is a portmanteau of *coder* and *decoder*. Some codecs also include the analog-to-digital and digital-to-analog conversion stages that turn microphone voltage into bits and bits back into speaker output.
| Attribute | Description |
|---|---|
| Purpose | Compress audio for storage/transmission, decompress for playback |
| Form | Software algorithm (FFmpeg, libopus) or hardware IC (Realtek ALC, Cirrus Logic CS) |
| Input | Raw PCM samples (typically 44.1 or 48 kHz, 16- or 24-bit) |
| Output | A compressed bitstream wrapped in a container (MP4, WebM, Ogg) |
| Compression ratio | ~2:1 (lossless) up to 20:1 (lossy) without obvious quality loss |
| Common examples | AAC, MP3, Opus, FLAC, ALAC, AC-3, Vorbis |
Two distinct meanings of “codec” coexist in everyday use:
- Software codec — a program implementing an algorithm (such as the LAME MP3 encoder or libopus) that compresses and decompresses digital audio data, with the goal of representing a high-fidelity signal in as few bits as possible while keeping perceived quality intact.
- Hardware codec — a single chip on a motherboard, smartphone, or headset that contains an analog-to-digital converter (ADC), a digital-to-analog converter (DAC), and often the compression logic itself. Realtek’s ALC series and Cirrus Logic’s CS chips are common examples on PCs and Macs.
For more background on codecs in general, see the Audio codec entry on Wikipedia.
Audio Codec vs Audio Format vs Container
These three terms are often used interchangeably, which causes confusion. They’re not the same thing.
| Concept | What it is | Examples |
|---|---|---|
| Audio codec | The algorithm that compresses and decompresses audio data | AAC, MP3, Opus, FLAC, AC-3 |
| Audio coding format | The specification that defines the bitstream syntax produced by the codec | AAC LC, MP3, Opus, FLAC |
| Container (file format) | The wrapper that stores one or more audio (and sometimes video) bitstreams plus metadata | MP4, WebM, Ogg, MKV, WAV, FLAC |
The blurry edge: some names refer to both a codec and a container. MP3 is a codec *and* a file extension, because the only thing typically inside an `.mp3` file is an MP3 audio stream. FLAC is the same — it names both a lossless codec and a native container. AAC, by contrast, is almost always packed inside an MP4 (or M4A) container, and only occasionally shipped as a bare `.aac` (ADTS) stream.
A practical rule: when you read “the file uses AAC,” that’s the codec. When you read “the file is an `.mp4`,” that’s the container. The container can hold multiple codecs — for instance, an MP4 file might carry an H.264 video stream alongside an AAC audio stream. If you’re working with video formats and codecs together, this distinction matters when you transcode or mux streams.
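The codec-vs-container distinction can be captured as a small lookup. This is an illustrative sketch — the table below is a simplified, non-exhaustive map of common pairings, and the `can_mux` helper is a hypothetical name, not a real library API:

```python
# Illustrative (not exhaustive) map of which audio codecs
# each container commonly carries.
CONTAINER_CODECS = {
    "mp4":  {"aac", "ac-3", "e-ac-3", "alac", "mp3"},
    "webm": {"opus", "vorbis"},
    "ogg":  {"opus", "vorbis", "flac"},
    "mkv":  {"aac", "ac-3", "opus", "vorbis", "flac", "mp3"},
    "mp3":  {"mp3"},
    "flac": {"flac"},
}

def can_mux(container: str, codec: str) -> bool:
    """Return True if the container commonly carries the codec."""
    return codec.lower() in CONTAINER_CODECS.get(container.lower(), set())

print(can_mux("webm", "opus"))  # True
print(can_mux("webm", "aac"))   # False: AAC belongs in MP4, not WebM
```

A check like this is essentially what a muxer does before it agrees to package a stream: the codec decides how the bits are compressed, the container decides whether those bits are welcome inside.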
How Does an Audio Codec Work?
An audio codec runs a five-stage pipeline. Encoding compresses; decoding reverses the process to play the audio back.
1. Sampling and quantization. A microphone produces a continuous analog wave. The ADC samples that wave thousands of times per second (44,100 Hz for CD quality, 48,000 Hz for most digital video) and represents each sample as a 16- or 24-bit integer. The result is raw PCM (pulse-code modulation) audio.
2. Transform. The encoder breaks the PCM stream into short overlapping windows (typically 20 ms) and applies a mathematical transform — usually the modified discrete cosine transform (MDCT) — to convert the time-domain signal into the frequency domain. This exposes which frequencies carry the most energy.
3. Psychoacoustic modeling (lossy codecs only). The encoder uses a model of human hearing to identify sounds you can’t perceive: quiet tones masked by louder ones, frequencies above ~20 kHz, very brief sounds buried under longer ones. These bits are discarded.
4. Quantization and entropy coding. Remaining frequency coefficients are quantized (rounded to a smaller set of values) and packed using entropy coding (Huffman or arithmetic) to squeeze out redundancy. The output is a compact bitstream.
5. Decoding. On the playback side, the decoder reverses each step: parse the bitstream, undo the entropy coding and quantization, run the inverse MDCT, and feed the reconstructed PCM samples to a DAC. The DAC drives the speaker or headphone amplifier.
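To make the transform stage (step 2) concrete, here is a toy DCT-II — a simpler cousin of the MDCT — showing how a pure tone collapses into a single frequency coefficient. That sparsity is exactly what quantization and entropy coding exploit. This is an illustrative sketch, not production DSP:

```python
import math

def dct_ii(x):
    """Naive DCT-II: turn a time-domain window into frequency coefficients."""
    n = len(x)
    return [sum(s * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, s in enumerate(x))
            for k in range(n)]

# A 64-sample window holding a single tone aligned with DCT basis k=8.
N = 64
window = [math.cos(math.pi * 8 * (2 * i + 1) / (2 * N)) for i in range(N)]

coeffs = dct_ii(window)
significant = sum(1 for c in coeffs if abs(c) > 1e-6 * N)
print(f"{significant} of {N} coefficients carry the signal")  # 1 of 64
```

Sixty-four time-domain samples become one meaningful frequency coefficient; the other sixty-three quantize to zero and cost almost nothing after entropy coding. Real encoders work on messier signals, but the principle is the same.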
Voice codecs add a sixth technique called linear predictive coding (LPC), which models the human vocal tract and transmits only the parameters needed to reconstruct speech. LPC is what lets G.729 and SILK (the speech mode inside Opus) work at extremely low bitrates — under 16 kbps for intelligible voice.
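The core LPC idea — transmit the prediction error instead of the raw sample — can be sketched with a first-order predictor. Real LPC fits a higher-order vocal-tract filter, but even this toy version shows why residuals need far fewer bits:

```python
import math

def predict_residuals(samples):
    """First-order linear prediction: guess each sample equals the
    previous one and keep only the (small) prediction error."""
    residuals = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        residuals.append(cur - prev)
    return residuals

# A smooth, speech-like 16-bit waveform.
samples = [int(10_000 * math.sin(2 * math.pi * i / 100)) for i in range(200)]

residuals = predict_residuals(samples)
print("peak sample:  ", max(abs(s) for s in samples))    # ~10000
print("peak residual:", max(abs(r) for r in residuals))  # far smaller
```

The residuals peak at a few hundred instead of ten thousand, so they fit in far fewer bits per sample — and the decoder recovers the original exactly by running the predictor in reverse and adding each residual back.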
The whole encode-decode round trip introduces a delay called algorithmic latency: about 100 ms for MP3, 20–405 ms for AAC, and as little as 5 ms for Opus. That difference matters if you’re building real-time voice apps — see our guide on low-latency streaming for why every millisecond counts.
Lossy vs Lossless Audio Codecs
The single biggest decision when picking an audio codec is whether you can tolerate any loss of fidelity. Two camps exist.
Lossy codecs remove information the human ear is unlikely to notice. The compressed file can be 80–95% smaller than the original PCM, but the discarded data cannot be recovered. AAC, MP3, Opus, and Vorbis are all lossy.
Lossless codecs compress audio without throwing anything away. Decoding produces a bit-perfect copy of the input. File sizes shrink by roughly 40–60%, which is significant but nowhere near what lossy codecs achieve. FLAC, ALAC, and WavPack are examples.
| Attribute | Lossy | Lossless |
|---|---|---|
| Compression ratio | 10:1 to 20:1 | 1.5:1 to 2.5:1 |
| File size (3-min song) | 3–8 MB | 15–25 MB |
| Quality vs original | Perceptually similar; data is gone forever | Bit-for-bit identical |
| CPU cost to decode | Low | Slightly higher |
| Best for | Streaming, mobile, voice, broadcast | Archival, mastering, audiophile playback |
| Examples | AAC, MP3, Opus, Vorbis, AC-3 | FLAC, ALAC, WavPack, Monkey’s Audio |
A third group, uncompressed formats like WAV and AIFF, store raw PCM with no compression at all. They’re simple and lossless but enormous — about 10 MB per minute of CD-quality stereo. Studios and broadcast trucks still use them for editing because uncompressed audio is the easiest to splice and mix.
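The size figures quoted above fall straight out of the PCM arithmetic. A quick sketch, assuming CD-quality stereo (44.1 kHz, 16-bit, 2 channels):

```python
def pcm_bytes(seconds, sample_rate=44_100, channels=2, bits=16):
    """Size of raw PCM audio in bytes."""
    return seconds * sample_rate * channels * bits // 8

per_minute = pcm_bytes(60)
song = pcm_bytes(3 * 60)
mp3 = 3 * 60 * 128_000 // 8   # same song as a 128 kbps MP3

print(f"1 min PCM:    {per_minute / 1e6:.1f} MB")  # 10.6 MB
print(f"3-min song:   {song / 1e6:.1f} MB")        # 31.8 MB
print(f"128 kbps MP3: {mp3 / 1e6:.2f} MB")         # 2.88 MB
```

That is the roughly 10 MB per minute and ~30 MB per song from the introduction, and the ~11:1 ratio a 128 kbps MP3 buys you.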
Common Audio Codec Types
Below are the codecs you’ll meet most often, with the bitrates and use cases that matter when building or shipping audio products.
AAC (Advanced Audio Coding)
AAC is the de facto standard for streaming, broadcast, and mobile audio. It supports up to 48 main channels plus 16 low-frequency effects channels, bitrates from 8 kbps to 512 kbps, and is the audio codec used by YouTube, Apple Music, iTunes, and most HLS streams. AAC LC (Low Complexity) is the workhorse profile; HE-AAC and HE-AAC v2 add efficiency for very low bitrates. If you’re building HLS pipelines, AAC is almost always the default — see our guide on HLS streaming for the protocol context.
MP3 (MPEG-1 Audio Layer III)
MP3 is the codec that made digital music portable. Bitrates run from 8 kbps mono up to 320 kbps stereo. Patent restrictions expired in 2017, so MP3 is now royalty-free everywhere. It’s universally supported on every device made in the last 25 years. The downside: at the same bitrate, AAC and Opus sound noticeably better. MP3 is best treated as the lowest-common-denominator format — fine for downloads but rarely the best technical choice for new pipelines.
Opus
Opus is the modern general-purpose lossy codec. It scales from 6 kbps (intelligible voice) to 510 kbps (transparent music) and runs at latencies between 5 ms and 66.5 ms — low enough for real-time voice. It’s the default codec for WebRTC, Discord, Zoom, and YouTube’s WebM streams. Opus is open and royalty-free, defined in RFC 6716. If you’re building WebRTC live streaming, Opus is the only codec you should plan around.
Vorbis
Vorbis (often shipped in `.ogg` containers) is an older open-source lossy codec from the Xiph.Org Foundation. It outperforms MP3 at similar bitrates and was popular in games and open-source pipelines, but Opus has largely replaced it for new work.
FLAC (Free Lossless Audio Codec)
FLAC is the most widely supported lossless codec. It compresses CD-quality stereo to roughly half the size of an equivalent WAV without losing a single bit. Encoding latency is 4.3–92 ms depending on settings, and decoding is fast enough for any modern device. Use FLAC for archives, mastering libraries, and high-resolution music distribution.
ALAC (Apple Lossless Audio Codec)
ALAC is Apple’s lossless codec — functionally similar to FLAC, but native to iTunes, Apple Music’s lossless tier, and the Apple ecosystem. Apple open-sourced it in 2011, so encoders and decoders exist on every platform, but FLAC remains more common outside Apple devices.
AC-3 and E-AC-3 (Dolby Digital)
AC-3 carries the surround-sound audio on DVDs, ATSC television, and many streaming services. It supports up to 5.1 channels at bitrates from 32 kbps to 640 kbps. E-AC-3 (Dolby Digital Plus) extends the family to 7.1 channels and higher bitrates, and is the default surround codec on Netflix and Amazon Prime Video.
WAV and AIFF
Both are uncompressed PCM containers — WAV is the Microsoft standard, AIFF the Apple equivalent. Neither applies any compression, so a 3-minute stereo file is roughly 30 MB. They’re standard for studio work because every editor reads them and there’s no decode overhead.
LDAC
LDAC is Sony’s high-bitrate Bluetooth codec, capable of 990 kbps over Bluetooth — about three times the bitrate of standard SBC. It’s required for “Hi-Res Audio Wireless” certification and is supported on Android 8 and later. Range and battery life take a hit compared to lower-bitrate codecs.
SBC (Subband Codec)
SBC is the mandatory baseline codec for Bluetooth A2DP. Every set of Bluetooth headphones supports it. Quality at 328 kbps is acceptable but obviously inferior to AAC, aptX, or LDAC. Treat it as the safety-net codec — always available, rarely the best choice.
G.711 and G.722
These are telephony codecs. G.711 (μ-law and A-law) runs at a fixed 64 kbps with 0.125 ms latency and is required by every WebRTC implementation. G.722 doubles the audio bandwidth to wideband (50 Hz – 7 kHz) at the same bitrate, with 4 ms latency. Both are everywhere in VoIP and SIP traffic.
For full bitrate, channel, and latency tables, the Web audio codec guide on MDN is the best technical reference.
Hardware vs Software Audio Codecs
The same word, “codec,” covers two very different things.
A software audio codec is just code — an algorithm running on a CPU. FFmpeg, libopus, the LAME MP3 encoder, and the AAC encoder built into Apple’s CoreAudio are all software codecs. They live in your encoding pipeline, your media player, your browser, your phone’s media framework. Software codecs are flexible: you can update them, swap them, run multiple of them, and pick configuration profiles per stream.
A hardware audio codec is a physical chip that contains an ADC, a DAC, and often a small DSP that runs the compression algorithm in silicon. Realtek ALC, Cirrus Logic CS, IDT (now Tempo Semiconductor) parts, and SigmaTel chips have all played this role on PC motherboards and laptops. Smartphones and Bluetooth headsets ship dedicated audio codec ICs that handle Bluetooth SBC, AAC, and aptX in hardware to save battery. The codec chip on a USB audio interface is what turns a microphone signal into a digital stream that your computer can record.
In a typical streaming pipeline, both are present. A hardware codec digitizes the microphone signal, then a software codec compresses it for transmission. On the receiving end, software decodes the bitstream back to PCM, then a hardware codec converts that PCM to analog for the speaker.
Bluetooth Audio Codecs Explained
Bluetooth is its own little codec ecosystem because radio bandwidth is limited and battery life matters. Five codecs do most of the work today.
| Codec | Bitrate | Latency | Best for |
|---|---|---|---|
| SBC | 192–328 kbps | 100–270 ms | Mandatory baseline, every headset |
| AAC | 128–256 kbps | 100–200 ms | iPhone, default Apple ecosystem |
| aptX / aptX HD | 352 kbps / 576 kbps | 60–150 ms | Android, near-CD quality |
| aptX Low Latency | 352 kbps | ~40 ms | Gaming, video sync |
| LDAC | 330 / 660 / 990 kbps | ~150 ms | Sony, Hi-Res Audio Wireless |
| LC3 | 160–345 kbps | ~30 ms | LE Audio, hearing aids |
The newest entry, LC3, is the codec at the heart of Bluetooth LE Audio. It delivers SBC-or-better quality at roughly half the bitrate, which means longer battery life and the ability to broadcast a single audio stream to many receivers (Auracast). Expect LC3 to displace SBC over the next several years.
What gets used at any moment depends on negotiation: when a phone connects to headphones, both ends advertise the codecs they support, and the higher-tier match wins. iPhones default to AAC; most Android phones default to SBC unless the user enables aptX or LDAC in developer options.
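The negotiation logic can be sketched in a few lines. The priority order below is a simplification (real stacks apply per-vendor and per-user preferences), and the names are illustrative:

```python
# Hypothetical preference order, highest quality first.
PRIORITY = ["ldac", "aptx_hd", "aptx", "aac", "sbc"]

def negotiate(source_codecs, sink_codecs):
    """Pick the highest-tier codec both ends advertise.
    SBC is mandatory in A2DP, so a match always exists."""
    common = set(source_codecs) & set(sink_codecs)
    for codec in PRIORITY:
        if codec in common:
            return codec
    return "sbc"

print(negotiate({"aac", "sbc"}, {"ldac", "aac", "sbc"}))       # aac
print(negotiate({"ldac", "aptx", "sbc"}, {"aptx_hd", "sbc"}))  # sbc
```

The second call is the frustrating real-world case: both ends support a premium codec, but not the *same* one, so the connection falls back to SBC.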
Audio Codecs in Live Streaming
Live streaming pipelines have a narrower codec menu than file-based audio. Two factors drive the choice: which codec the streaming protocol allows, and what every receiving device can decode without transcoding.
RTMP historically carried MP3 or AAC audio. AAC has won — modern RTMP encoders and most RTMP servers default to AAC LC at 96–192 kbps stereo. If you’re running an RTMP server, expect every incoming stream to use AAC.
HLS carries AAC for VOD and live, plus AC-3 and E-AC-3 for multi-channel surround. The HLS spec also permits raw audio as `.aac` segments for audio-only streams. AAC is the only audio codec you can rely on for reach across iOS, Android, smart TVs, and set-top boxes.
SRT is transport-agnostic — it doesn’t care which audio codec you carry inside the MPEG-TS payload. AAC and AC-3 are typical. If you’re building broadcast workflows over the SRT protocol, the codec choice follows whatever your downstream tooling expects.
WebRTC mandates two audio codecs: Opus and G.711. Implementations may support more, but only these two are guaranteed everywhere, and Opus is the default for everything except telephony interconnect. Latency under 100 ms end-to-end depends on Opus’s short frames and low algorithmic delay.
MPEG-DASH and CMAF mirror HLS — AAC for stereo, AC-3 / E-AC-3 / AC-4 for surround. CMAF specifically allows the same fragmented-MP4 segments to be packaged for both HLS and DASH playback.
LiveAPI handles the audio side of live streaming for you — it ingests RTMP, SRT, and RTSP streams (with AAC audio), packages them into HLS for global delivery via Akamai, Cloudflare, or Fastly, and stores the audio track unchanged so live streams roll over to VOD without re-encoding. If you need an audio-only stream (a podcast or sports radio broadcast), LiveAPI’s audio-only mode keeps the AAC track and skips video processing entirely. See our guide on building a video streaming app for the full picture, or the video transcoding API page for transcoding details.
So far, we’ve covered the theory and the codec menu. The next question is the practical one: given your specific use case, which codec should you actually pick, and how do you avoid the most common playback errors? The remaining sections answer those questions.
How to Choose an Audio Codec
Five questions narrow the choice down to one or two codecs.
1. What’s the use case? Streaming music, voice/VoIP, podcast download, archival, or broadcast each have a default winner. Music streaming → AAC. VoIP / real-time voice → Opus or G.722. Podcast downloads → AAC at 96 kbps mono or 128 kbps stereo. Archival → FLAC. Broadcast surround → AC-3 / E-AC-3.
2. Where will it be decoded? If your audience plays back on every browser and every phone, AAC is the safest pick because it’s universal. If you control the player (your own app or a desktop client), Opus is the strongest technical choice for both voice and music.
3. What latency do you need? Real-time conversation requires sub-100 ms end-to-end, which forces Opus or G.722. Live broadcast can absorb 2–10 seconds, so AAC works fine. On-demand playback has no latency budget worth worrying about.
4. What bitrate budget do you have? Cellular and satellite links favor low bitrates — Opus at 24 kbps is intelligible for voice; AAC HE-AAC v2 at 32 kbps stereo is acceptable for music. Wi-Fi and wired links rarely need anything below 96 kbps.
5. Are there licensing constraints? Opus, FLAC, and Vorbis are royalty-free. AAC, AC-3, and LDAC require patent licenses for some commercial uses (most modern hardware ships with the licenses paid, but check before redistributing encoders). MP3 is now royalty-free everywhere.
A quick decision matrix:
| Use case | Recommended codec | Bitrate |
|---|---|---|
| Music streaming (general web) | AAC LC | 128–256 kbps |
| Music streaming (own player) | Opus | 96–192 kbps |
| Podcast download | AAC LC | 96 kbps mono |
| Voice / VoIP | Opus | 24–48 kbps |
| Telephony interop | G.711 | 64 kbps |
| Live broadcast (HLS) | AAC LC | 128 kbps |
| Live broadcast (WebRTC) | Opus | 32–64 kbps |
| Surround for OTT | E-AC-3 | 384–768 kbps |
| Lossless archival | FLAC | Variable (~800 kbps) |
| Bluetooth headphones | AAC (Apple) / aptX or LDAC (Android) | Negotiated |
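The matrix above is mechanical enough to encode directly. A minimal sketch — the use-case keys and `pick_codec` helper are illustrative names, and the Bluetooth row is omitted because its bitrate is negotiated rather than chosen:

```python
# Straight translation of the decision matrix (simplified).
RECOMMENDATIONS = {
    "music_web":     ("AAC LC", "128-256 kbps"),
    "music_own_app": ("Opus",   "96-192 kbps"),
    "podcast":       ("AAC LC", "96 kbps mono"),
    "voip":          ("Opus",   "24-48 kbps"),
    "telephony":     ("G.711",  "64 kbps"),
    "live_hls":      ("AAC LC", "128 kbps"),
    "live_webrtc":   ("Opus",   "32-64 kbps"),
    "surround_ott":  ("E-AC-3", "384-768 kbps"),
    "archival":      ("FLAC",   "variable (~800 kbps)"),
}

def pick_codec(use_case: str) -> str:
    codec, bitrate = RECOMMENDATIONS[use_case]
    return f"{codec} at {bitrate}"

print(pick_codec("live_webrtc"))  # Opus at 32-64 kbps
print(pick_codec("archival"))     # FLAC at variable (~800 kbps)
```

If your pipeline only ever hits one or two of these rows, hard-coding the answer is fine — the point is that the choice is a table lookup, not a debate.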
If you’re choosing a codec to pair with a video stream, the same use-case logic applies — see our guides on HEVC vs H.264 and the AV1 codec for the visual side.
Common Audio Codec Issues and Fixes
Two recurring problems show up in support tickets and forums.
“Audio codec not supported” errors. This usually means the playback device can’t decode the codec inside the container. A common case: a video file uses E-AC-3 audio, but the target Android phone only supports AAC. Fix it by transcoding the audio to AAC. Another case: a `.mkv` file with FLAC audio plays in VLC but not in Apple’s QuickTime, because QuickTime doesn’t support that combination — repackage the audio into AAC or use a player with broader codec support.
Audio out of sync with video. Codec latency is the usual cause. If the audio codec adds 100 ms of algorithmic delay (MP3) and the video codec adds none, the audio arrives late. Encoders compensate with negative audio timestamps; if yours doesn’t, switch to a lower-latency audio codec (AAC or Opus) or set an explicit audio offset.
Bitrate spikes and buffering. Variable bitrate (VBR) audio can briefly exceed the average target during loud passages, which causes buffering on slow networks. Switch to constant bitrate (CBR) or constrained VBR for live streams. The same logic applies to adaptive bitrate streaming — keep audio bitrate constant across renditions and let video carry the bandwidth burden.
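Why a VBR spike stalls playback can be shown with a toy buffer model. This sketch assumes one-second segments and ignores real-player buffering strategies, so treat it as an intuition aid rather than a simulator:

```python
def simulate_buffer(segment_bits, link_bps, segment_seconds=1.0):
    """Track receive backlog: each segment must download within its
    own playback duration, or the player falls behind and stalls."""
    backlog = 0.0
    stalled = 0
    for bits in segment_bits:
        arrival = bits / link_bps  # seconds to download this segment
        backlog = max(0.0, backlog + arrival - segment_seconds)
        if backlog > 0:
            stalled += 1
    return stalled

# 128 kbps VBR stream whose loud passage spikes to 260 kbps,
# played over a 160 kbps link, vs the same stream encoded CBR.
vbr = [128_000] * 5 + [260_000] * 3 + [128_000] * 5
cbr = [128_000] * 13

print(simulate_buffer(vbr, 160_000))  # several stalled segments
print(simulate_buffer(cbr, 160_000))  # 0
```

The CBR stream never exceeds the link rate, so the buffer never falls behind; the VBR spike pushes the download time past real time and the deficit lingers long after the loud passage ends.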
Audio Codec FAQ
What is an audio codec in simple terms?
An audio codec is a tool — software or hardware — that shrinks audio data so it can be stored or sent quickly, then expands it again so you can hear it. Think of it as a translator: PCM goes in, a smaller compressed file comes out, and on the other end the codec turns that small file back into something a speaker can play.
Is MP3 a codec or a format?
Both. “MP3” names the codec (MPEG-1 Audio Layer III) that compresses the audio, and `.mp3` is also the file format people use to wrap MP3-encoded data. The dual meaning is harmless because MP3 audio is almost always stored in `.mp3` files, but technically the codec and the container are separate concepts.
What audio codec does YouTube use?
YouTube delivers audio in either AAC LC (inside MP4 containers) or Opus (inside WebM containers), depending on which format your client requests. Opus generally carries the higher-quality tiers, while the AAC streams typically top out around 128 kbps.
What is the best Bluetooth audio codec?
LDAC delivers the highest bitrate (up to 990 kbps), but only Sony products and recent Android phones support it. AAC is the default on iPhone. aptX HD and aptX Adaptive are common on Android. LC3 (in LE Audio) will likely become the new default over the next few years because it sounds as good as SBC at half the bitrate.
What’s the difference between an audio codec and a video codec?
An audio codec compresses sound waves; a video codec compresses moving images. They use different algorithms tuned to the perceptual quirks of each medium — audio codecs exploit psychoacoustic masking, video codecs exploit spatial and temporal redundancy. A media file usually contains one of each (e.g., H.264 video + AAC audio in an MP4).
Are FLAC and ALAC the same?
They are functionally similar — both are lossless and produce roughly the same compression ratio — but they are different file formats with different bitstreams. FLAC is open and widely supported; ALAC is Apple’s equivalent and is the format used by Apple Music’s lossless tier. Most modern players read both.
Why does Opus beat MP3 at the same bitrate?
Opus is about two decades newer (MP3 was finalized in 1993, Opus in 2012) and combines two algorithms — SILK for speech and CELT for music — that the encoder switches between based on the content. It also uses smarter psychoacoustic modeling and shorter frames. The result is intelligible voice down to 6 kbps and transparent music around 96 kbps, where MP3 needs 128–192 kbps to sound the same.
Can I change a file’s audio codec without losing quality?
Only if you go from a lossless source. Re-encoding lossy audio (MP3 → AAC, for instance) compounds the original quality loss because the second codec discards data the first codec already removed. If you have a FLAC or WAV master, you can produce AAC, MP3, or Opus from it freely. If you only have an MP3, accept that any conversion will sound a little worse than the original.
How many audio channels can a codec support?
It depends on the codec. AAC supports up to 48 main channels plus 16 LFE. Opus supports up to 255 channels per stream. AC-3 caps at 5.1; E-AC-3 at 7.1; AC-4 supports object-based audio (Atmos). Most consumer playback caps at 7.1 because that’s what AV receivers and TVs decode.
Closing Thoughts
An audio codec is doing real work every time you stream a song, take a call, or watch a video — it’s just invisible because it works. The choice between AAC, Opus, MP3, FLAC, and the rest is rarely about which sounds best in the abstract; it’s about which one matches your protocol, your latency budget, your audience’s playback devices, and your bitrate ceiling.
For most teams shipping a streaming product today, the answer is short: AAC for compatibility-first delivery (HLS, downloads, broadcast), Opus for real-time voice and modern web playback, FLAC when bit-perfect quality matters, and AC-3 / E-AC-3 when you need surround.
If you’re building a live streaming or video hosting product and don’t want to wire up audio packaging yourself, Get started with LiveAPI — it ingests AAC over RTMP, SRT, and RTSP, packages live streams to HLS with multi-CDN delivery, and rolls them over to VOD automatically, so your audio just works end to end.