11/16/12

REC.601/709 and luminance range explained

WOW, the amount of confusion regarding this issue is phenomenal.
In my opinion this topic should have the highest priority for every video shooter; otherwise you'll be shooting yourself in the foot (i.e. ruining your footage - losing dynamic range and getting incorrect colors).

Foreword - RGB & YUV

RGB (8bpc) - 3 planes (Red, Green, Blue):
Each pixel has three components (red, green, blue), each with a value from 0 to 255. The combination of these components produces the final color and luminance of the pixel.

YUV (8bpc) - 3 planes (Y, Cb, Cr):
- Y - Full-resolution plane that represents the luma (brightness) information only.
- U(Cb), V(Cr) - Full-resolution, or lower, planes that represent the chroma (color) information only. The neutral (zero-chroma) point is at 128.


Compressed video will mostly be in YUV because of the ability to subsample the chroma - this saves a lot of bandwidth. Subsampling means encoding smaller chroma (Cb, Cr) planes and stretching them back during the conversion to RGB (when displaying). It relies on the fact that our eyes are much less sensitive to color detail than to brightness detail. Moreover, keeping luma in its own channel helps the encoder avoid overall-brightness (DC) shifts while compressing the footage.
Let's take, for example, a raw RGB or YUV4:4:4 (no subsampling) stream: 3 planes ("planes" as the unit of measure).
YUV4:2:2 : 2 planes (Full res. Y + half res. Cb + half res. Cr)
YUV4:2:0: 1.5 planes (Full res. Y + quarter res. Cb + quarter res. Cr)

YUV4:4:4/RGB = 1.5 * YUV4:2:2 = 2 * YUV4:2:0 -> We save up to half of the bandwidth with almost no visible loss of information (however, image processing algorithms are sensitive to it).
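
To make the bookkeeping concrete, here is a minimal Python sketch (my own illustration, assuming numpy, 8-bit planes and naive 2x2 box filtering - real codecs use better resampling filters) of what 4:2:0 subsampling does to the stored data:

import numpy as np

# 8-bit planes for a 1280x720 frame; random data is enough for size bookkeeping.
h, w = 720, 1280
y  = np.random.randint(0, 256, (h, w), dtype=np.uint8)   # full-res luma
cb = np.random.randint(0, 256, (h, w), dtype=np.uint8)   # full-res chroma
cr = np.random.randint(0, 256, (h, w), dtype=np.uint8)

def subsample_420(plane):
    # Average each 2x2 block -> half resolution in both dimensions.
    p = plane.astype(np.float32)
    return ((p[0::2, 0::2] + p[0::2, 1::2] +
             p[1::2, 0::2] + p[1::2, 1::2]) / 4).astype(np.uint8)

def upsample(plane):
    # Nearest-neighbour stretch back to full size (what happens at display time).
    return plane.repeat(2, axis=0).repeat(2, axis=1)

cb_small, cr_small = subsample_420(cb), subsample_420(cr)
cb_back = upsample(cb_small)                        # reconstructed (blurrier) chroma

full_444 = y.size + cb.size + cr.size               # 3 "planes"
stored   = y.size + cb_small.size + cr_small.size   # 1.5 "planes"
print(stored / full_444)                            # -> 0.5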


Compressing luma range

Let's take the Y channel of the YUV and say that instead of using the whole 0-255 range we compress it to 16-235, leaving everything below 16 and above 235 empty. When converting the frame back to RGB, we stretch it back to the original full range. It shouldn't really concern you why this is done, but you should know that it is done - moreover, it's a standard that comes from the analog world of imaging. You can read about the origin of this here.
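
Numerically, the standard 8-bit mapping scales the 256 full-range levels into 219 levels starting at 16. A minimal sketch of both directions (my own illustration, assuming numpy and the common scaling factors):

import numpy as np

def full_to_limited(y):
    # 0..255 -> 16..235 (219 levels): black 0 -> 16, white 255 -> 235.
    return np.round(16 + y.astype(np.float32) * 219 / 255).astype(np.uint8)

def limited_to_full(y):
    # 16..235 -> 0..255, the stretch a display pipeline applies.
    return np.clip(np.round((y.astype(np.float32) - 16) * 255 / 219),
                   0, 255).astype(np.uint8)

y_full = np.arange(256, dtype=np.uint8)
y_lim  = full_to_limited(y_full)
print(y_lim.min(), y_lim.max())    # 16 235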

If you interpret full range as full range, or limited as limited, you're on the right track.
In the case of limited range interpreted as full range:
You're not going to lose details when displaying, but you will get an incorrect (shifted) picture. However, when re-encoding (transcoding) the footage, the encoder will probably compress the entire range once again. The same dynamic latitude on fewer grey levels may cause loss of information and banding (posterization).
It can be understood from this curve:


In the case of full range interpreted as limited range:
The amount of detail in the shadows that is lost can be clearly seen in the image above. When displayed, the image looks over-contrasted. It can be fixed by compressing the luma range or forcing the monitor to show the full range. However, when transcoding the footage, the encoder (which assumes limited range) might drop everything that is not in the range (crushing it). The lost details can't be recovered later.
The curve below demonstrates what happens in this case:
BTW, in curves, any situation where more than one "in" value maps to the same "out" value = loss of information.
Of course I must mention the highlights region as well:
In conclusion, when we interpret full luma range as limited range, we lose all the "stops" of dynamic range that exist in the grey levels below 16 and above 235.
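
A toy sketch of this worst case (my own illustration, reusing the range helper from the sketch above): feed full-range luma to a decoder that assumes limited range, and watch the shadows and highlights collapse:

import numpy as np

def limited_to_full(y):
    # The decoder assumes valid data lives in 16..235 and stretches it to 0..255.
    return np.clip(np.round((y.astype(np.float32) - 16) * 255 / 219),
                   0, 255).astype(np.uint8)

y_full  = np.arange(256, dtype=np.uint8)   # luma that is actually full range
misread = limited_to_full(y_full)          # what a limited-range decoder does to it
print(np.unique(misread[y_full <= 16]).size)   # 1 -> deep shadows collapse to pure black
print(np.unique(misread[y_full >= 235]).size)  # 1 -> highlights collapse to pure white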


REC.601 vs. REC.709

RGB <=> YUV conversion has a formula. You can use different multipliers (a color matrix) and get (nearly) the same image back, as long as you use the same multipliers for converting to YUV and converting back to RGB.
REC.601 and REC.709 (and the future REC.2020) are examples of such multipliers.
By default, a video interpreter will convert to RGB using REC.601 coefficients for standard-definition video and REC.709 coefficients for high-definition video.
If we take a video that was encoded using the REC.601 matrix and decode it using the REC.709 matrix, the result will have wrong, shifted colors - especially noticeable on the red tones, i.e. it might screw up the skin tones. For instance:


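Here is a small numeric sketch of the mismatch (my own illustration, using the textbook luma coefficients - REC.601: Kr=0.299, Kb=0.114; REC.709: Kr=0.2126, Kb=0.0722 - and ignoring range scaling for clarity):

# A minimal sketch of a REC.601 -> REC.709 mismatch, on a 0..1 scale.
def rgb_to_ycbcr(rgb, kr, kb):
    kg = 1 - kr - kb
    r, g, b = rgb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2 * (1 - kb))   # blue-difference chroma
    cr = (r - y) / (2 * (1 - kr))   # red-difference chroma
    return y, cb, cr

def ycbcr_to_rgb(ycbcr, kr, kb):
    kg = 1 - kr - kb
    y, cb, cr = ycbcr
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b

skin = (0.9, 0.6, 0.5)                              # a reddish skin-like tone
enc = rgb_to_ycbcr(skin, kr=0.299, kb=0.114)        # encoded with REC.601
dec = ycbcr_to_rgb(enc, kr=0.2126, kb=0.0722)       # decoded with REC.709 by mistake
print([round(c, 3) for c in dec])                   # ~[0.927, 0.623, 0.492] - red pushed up

The round trip is only lossless when both sides agree on the matrix; with the mismatch above, the red channel drifts up and blue drifts down, which is exactly the skin-tone shift described.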
What does our camera do (emphasis on HDSLR)?

Unfortunately, many manufacturers do not follow the standards, and sometimes the embedded metadata does not even match the actual parameters.

The most accurate way to know what your camera/DSLR does in terms of luma range and color matrix is to take a photo of a scene, then take a video of the same scene (same image settings). Overlay the video over the photo and tweak its luma range and color matrix until the video is identical to the photo. Now you can really be sure that the image is the same as the camera manufacturer meant it to be.

A very popular example of the above is the Canon HDSLR camera range up to the 5DMK3:
Full luma range and REC.601.
This particular case is actually good for us because:
1. As long as we force the footage to be decoded as REC.601, we shouldn't have color problems.
2. The full range gives us more grey levels to push more dynamic range into without ruining the image. This is what the flat picture styles are intended to do.

How to export the correct image from After Effects?

Once we get the required look of the video in the After Effects preview window, we want it to look identical in any player that will be used: YouTube, Vimeo, a local media player, etc. Sadly, in most cases it differs.
So what should be done? (The following explanation is for the case when color management is off.)
In After Effects, add your final composition to a new composition (or just add an adjustment layer above everything). Drop in the "Color Profile Converter" effect.

Choose (for HD content): HDTV 16-235.


Please note that the image will look shifted in the preview, but it will be reconstructed properly by the media player's decoder.

If you turn on color management, do not set a compressed luma range as the project's working space. Compress only as the final step before exporting.

I hope this post helped,
Mark.
