
How Does YouTube Compress Videos

Arsen Karapetyan

If you've ever asked yourself "How does YouTube compress videos and why is it so effective?", read on! YouTube streams videos using a technique called adaptive bit rate (ABR) streaming, which adjusts the video bit rate on the fly to keep playback uninterrupted. For instance, if your device has poor network connectivity, YouTube automatically switches to a lower bit rate, trading some picture quality for playback that does not stall.

To make this possible, YouTube encodes each video at several quality tiers, including 144p (150 Kbit/s), 240p (400 Kbit/s), 360p (600 Kbit/s), 720p HD (2 Mbit/s), 1080p HD (6 Mbit/s) and 4K UHD; the bit rates are approximate. The highest tier has the best audio and video quality, but it also requires the most bandwidth from your Internet connection. If you are on a slow connection (dial-up or DSL), YouTube falls back to a lower tier. For each tier, the bit rate can also be raised by using less efficient encoding methods, which spend more bits to reach the same picture quality.
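The tier choice can be sketched in a few lines of Python. This is a minimal illustration, not YouTube's actual logic: the function name `pick_tier` is hypothetical, the bit rates are the approximate figures quoted above, and 4K UHD is left out because no bit rate is given for it.

```python
# Approximate tier bit rates in kbit/s, as quoted in this article.
# 4K UHD is omitted because the article gives no bit rate for it.
TIERS_KBPS = [
    ("144p", 150),
    ("240p", 400),
    ("360p", 600),
    ("720p", 2000),
    ("1080p", 6000),
]


def pick_tier(bandwidth_kbps):
    """Return the highest tier whose bit rate fits the measured bandwidth.

    Falls back to the lowest tier when even that one does not fit.
    """
    best = TIERS_KBPS[0][0]
    for name, rate in TIERS_KBPS:
        if rate <= bandwidth_kbps:
            best = name
    return best


print(pick_tier(100))    # 144p
print(pick_tier(700))    # 360p
print(pick_tier(2500))   # 720p
print(pick_tier(10000))  # 1080p
```

A real player would also look at buffer level and screen size, but the core idea is the same: never ask for more bits per second than the connection can deliver.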

YouTube usually recommends that users set up their account to play at the highest quality for the best picture, but it has disabled this setting by default on many newer devices. This includes Google's ChromeOS laptops (Chromebooks), Chromecast devices, and Android phones and tablets with Chromecast built-in. Most new mobile devices that support HTML5 video playback, such as iPhones and iPads, also ship with this setting turned off by default.

How does YouTube compress videos effectively?

To understand how this works, let's start with the basics of video compression. If you have ever stepped through a video in a media player such as VLC or Windows Media Player, you've probably noticed that it is made of many still frames shown each second, alongside an audio track. Played at full speed, the frames flash past far too quickly to be seen individually. The change from one frame to the next is what this article calls the "visual envelope". It carries all of the motion in the picture, so it is critical that it is delivered at a steady pace and not held up by slow updates [1].

The widely used H.264 video codec divides each frame into a grid of 16x16 pixel blocks (macroblocks) and uses motion-compensated prediction to find an efficient way to compress each block. H.264 encodes blocks in one of two broad modes, "predictive" or "intra":

Predictive mode is used to encode inter-frame content: the encoder searches previously encoded frames for pixels that resemble the current block, then stores only where the match was found plus whatever difference remains. The previously encoded data serves as the starting point for further compression. Audio codecs such as MP3 and AAC share the same general goal of removing redundancy. In predictive mode we don't need to encode every image in full, because much of the next picture can be predicted from what the decoder already has. This saves bits (the number of frames stays the same) and leaves more room for other types of compression.

Intra-frame encoding compresses a frame using only the information inside that frame, much as a still image is compressed. With no other frames to draw on, the encoder has less redundancy to exploit, so an intra frame requires more space than a predicted one. In return it can be decoded entirely on its own, which makes it a clean starting point for playback and for the predicted frames that follow.
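The difference between the two modes can be illustrated with a toy block-matching search. This is a simplified sketch and not H.264 itself: it uses 2x2 blocks on an 8x8 frame instead of 16x16 macroblocks, and the helper names (`sad`, `get_block`, `best_motion_vector`, `make_frame`) are made up for this example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame (a list of pixel rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_motion_vector(prev, cur, top, left, size, search):
    """Find the offset (dy, dx) into prev that best predicts a block of cur.

    Returns (cost, dy, dx), where cost is the leftover difference that
    would still have to be encoded after prediction.
    """
    target = get_block(cur, top, left, size)
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the frame
            cost = sad(target, get_block(prev, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def make_frame(size, top, left):
    """Black frame with a bright 2x2 square at (top, left)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 255
    return frame


prev_frame = make_frame(8, 2, 2)  # square at column 2
cur_frame = make_frame(8, 2, 3)   # same square, moved one pixel right

print(best_motion_vector(prev_frame, cur_frame, top=2, left=3, size=2, search=2))
# (0, 0, -1)
```

A cost of 0 means the block is predicted perfectly from the previous frame, so the encoder only needs to store the small motion vector (0, -1). An intra-coded block has no previous frame to point at and must carry its pixel data itself.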

Most video codecs also try to reduce the number of bits needed to encode a frame by comparing each block to the blocks that surround it and encoding only the difference. Done well, this can also help reduce "blocking" and "ringing" artifacts (in which parts of a frame appear jagged or poorly defined). Because less information has to be stored for each block, there is more room left for other types of compression.
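As a rough illustration of predicting a block from its surroundings, the sketch below predicts every pixel of a block as the average of the pixels directly above and to the left of it, then keeps only the differences. It is a simplified stand-in for the idea, not a codec implementation, and the names `dc_predict` and `residual` are hypothetical.

```python
def dc_predict(frame, top, left, size):
    """Predict a block as the average of the pixels just above and just left of it."""
    neighbours = []
    if top > 0:
        neighbours += frame[top - 1][left:left + size]
    if left > 0:
        neighbours += [frame[y][left - 1] for y in range(top, top + size)]
    if not neighbours:
        return 128  # mid-grey fallback when the block has no neighbours
    return round(sum(neighbours) / len(neighbours))


def residual(frame, top, left, size):
    """Differences between the real pixels and the predicted value."""
    pred = dc_predict(frame, top, left, size)
    return [
        [frame[y][x] - pred for x in range(left, left + size)]
        for y in range(top, top + size)
    ]


# A smooth 4x4 patch of grey values, as found in sky or skin tones.
frame = [
    [100, 100, 101, 101],
    [100, 100, 101, 102],
    [100, 101, 102, 102],
    [101, 101, 102, 103],
]

print(dc_predict(frame, 2, 2, 2))  # 101
print(residual(frame, 2, 2, 2))    # [[1, 1], [1, 2]]
```

The original pixel values are around 102, while the residual values are 1 or 2. Small numbers like these take far fewer bits to store, which is where the saving comes from.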

The original research on YouTube's implementation was published in [2], but its ABR algorithm has changed significantly since then and is now built around H.264 streams encoded at several bit rates. The ABR algorithm observes the user's "playback quality" parameter and changes the video bit rate to give the best viewing experience the connection allows. For example, if you set the playback quality to low, YouTube reduces the number of pixels in each frame. Since there are fewer pixels, YouTube can also reduce the number of bits needed to encode them. Audio (MP3 or AAC) is handled in a similar way, with fewer bits spent at lower quality settings.

The ABR algorithm is proprietary to Google. It works alongside the H.264 codec rather than being part of it, and it is very good at changing the amount of data sent depending on network conditions. It adjusts playback quality by switching between a set of standard bit rates, which changes the image quality of each frame while keeping playback smooth. A switch can happen at any time based on network conditions, so its timing cannot be predicted in advance; this is a side effect of the design rather than a security measure.
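Since the real algorithm is proprietary, the following is only a sketch of how switching on network conditions could work. It assumes a smoothed throughput estimate (an exponentially weighted moving average) and a safety margin; the names `choose` and `simulate` and the parameters `alpha` and `safety` are assumptions made for this example, and the bit rates are the approximate tier figures quoted earlier in the article.

```python
# Approximate bit rates in kbit/s, from the tier list earlier in the article.
RENDITIONS_KBPS = [
    ("144p", 150),
    ("240p", 400),
    ("360p", 600),
    ("720p", 2000),
    ("1080p", 6000),
]


def choose(budget_kbps):
    """Highest rendition that fits the budget, or the lowest one if none fits."""
    chosen = RENDITIONS_KBPS[0][0]
    for name, rate in RENDITIONS_KBPS:
        if rate <= budget_kbps:
            chosen = name
    return chosen


def simulate(throughput_samples_kbps, alpha=0.5, safety=0.8):
    """Pick a rendition after each throughput measurement.

    alpha  - weight of the newest sample in the moving average (assumed value)
    safety - fraction of the estimate the player dares to use (assumed value)
    """
    estimate = None
    decisions = []
    for sample in throughput_samples_kbps:
        if estimate is None:
            estimate = sample
        else:
            estimate = alpha * sample + (1 - alpha) * estimate
        decisions.append(choose(safety * estimate))
    return decisions


# The connection drops from 8 Mbit/s to 1 Mbit/s, then recovers.
print(simulate([8000, 8000, 1000, 1000, 1000, 8000]))
# ['1080p', '1080p', '720p', '720p', '360p', '720p']
```

The smoothing makes the player step down gradually instead of reacting to a single bad measurement, and step back up cautiously once the connection recovers.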

How does ABR work?

When you open YouTube's website or tap its icon on your mobile device, you start transferring hundreds of thousands or even millions of bits per second from servers that may be on the other side of the world. YouTube has several ways to send you different streams of pixels, but in each case the bits are split into packets that travel across the Internet, at up to about 6 Mbit/s for the 1080p HD tier.
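To get a feel for these numbers, the short calculation below estimates how many packets per second a stream needs. The 1,500-byte packet size is an assumption (a common Ethernet packet size, ignoring header overhead), and the function name `packets_per_second` is made up for this example.

```python
import math


def packets_per_second(bit_rate_bps, packet_bytes=1500):
    """Estimate packets per second for a stream (packet size is an assumption)."""
    bytes_per_second = bit_rate_bps / 8
    return math.ceil(bytes_per_second / packet_bytes)


print(packets_per_second(6_000_000))  # 500 packets/s at 6 Mbit/s
print(packets_per_second(150_000))    # 13 packets/s at 150 Kbit/s
```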

As these bits travel across the Internet, they are encoded on Google's servers and decoded on your device. Uploaded videos are converted for HTML5 video playback, the same kind of job that desktop tools such as VLC or HandBrake perform, and stored in a container such as MP4. Within each frame, pixels are compressed much as they are in a JPEG image. Each version of the video is also scaled down to match one of YouTube's target resolutions (for example 320x240), a step this article calls adaptive resizing. This is all done before YouTube puts the video online on its servers. The image data for a frame is called the "base layer". On top of it sits the frame-to-frame change data, which creates the impression of motion much as an animated GIF does when it loops through its frames. Each frame therefore consists of a few components:

The base layer: image data compressed in a JPEG-like way. This makes up most of the video's pixels.

The "playback quality" parameter, which indicates how many pixels should be displayed in each frame (see below).

The change data, which, like the frames of an animated GIF, makes the picture appear to move.
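The resizing step can be sketched as follows. This is a minimal example that assumes the goal is to fit a frame inside the 320x240 target mentioned above while keeping its aspect ratio and never enlarging it; the function name `fit_within` is hypothetical.

```python
def fit_within(width, height, max_width=320, max_height=240):
    """Scale (width, height) down to fit the target box, keeping the aspect ratio.

    Frames that already fit are returned unchanged (no upscaling).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # (320, 180)
print(fit_within(640, 480))    # (320, 240)
print(fit_within(160, 120))    # (160, 120)
```

A 16:9 source ends up at 320x180 rather than 320x240, because stretching it to fill the box would distort the picture.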

The playback quality parameter is a small value that ranges from 0 (low quality) to 10 (high quality). When you play a video, it tells YouTube how many bits it can use to encode each frame. The higher the number of bits, the better the image quality, and the more of your computer's memory and bandwidth playback consumes. When the bit rate is low, the playback quality is low; when it's high, the quality is high. On average, you want this number to be somewhere in between.
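The link between bit rate and bits per frame is simple arithmetic: divide the bit rate by the frame rate. The sketch below assumes 30 frames per second, which is not stated in this article, and uses two of the approximate tier bit rates quoted earlier; `bits_per_frame` is a name made up for the example.

```python
def bits_per_frame(bit_rate_kbps, frames_per_second=30):
    """Average bit budget for one frame (30 fps is an assumed frame rate)."""
    return bit_rate_kbps * 1000 / frames_per_second


print(bits_per_frame(150))   # 5000.0 bits per frame at 144p
print(bits_per_frame(6000))  # 200000.0 bits per frame at 1080p
```

This is an average. In practice intra frames take a much larger share of the budget than predicted frames.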

When you download a YouTube video, the playback quality parameter is not included; it must be computed on the fly. This starts from a few assumptions about the source: that the video was encoded using Google's predefined bit rates and quality modes, and that no further compression has been applied on top of them (the "zero-compression" case). The process then goes through every available representative frame of the video, computes the average bit rate (in bits per second), and encodes each frame to approximate that target rate using adaptive resizing. This is an enormous task; YouTube reportedly uses 250 terabytes of storage for videos with no audio.
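The average bit rate of a clip can be computed from its frame sizes. The sketch below is an illustration with made-up frame sizes, not YouTube's procedure, and `average_bit_rate` is a hypothetical name.

```python
def average_bit_rate(frame_sizes_bytes, frames_per_second):
    """Average bit rate in bits per second for a list of encoded frame sizes."""
    if not frame_sizes_bytes:
        raise ValueError("need at least one frame")
    total_bits = sum(frame_sizes_bytes) * 8
    # duration = number of frames / fps, so rate = bits * fps / frames
    return total_bits * frames_per_second / len(frame_sizes_bytes)


# One large intra frame followed by four small predicted frames (made-up sizes).
sizes = [30000, 5000, 5000, 5000, 5000]
print(average_bit_rate(sizes, frames_per_second=25))  # 2000000.0
```

The result, 2 Mbit/s, matches the 720p HD figure quoted earlier, even though the individual frames differ in size by a factor of six.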

Because of the computational complexity involved, many computers have trouble playing videos with high bit rates. When the player runs out of memory, it usually starts those videos at a lower quality, but in some cases it cannot, and you'll see the dreaded "not enough resources" message.

If you are aiming for high-quality video, you need a combination of fast playback and a high bit rate. I'm not saying that the video won't look good on an old PC; it will (and you can run the same test on your MacBook), but you'll need to have enough resources available for playback.

The quick fix: make sure no other software or processes are using up a lot of memory. Close all other programs and try the test again. If the video now plays, you know that one of the programs you closed was using up the memory.