An enhancement layer to improve compression efficiency and reduce complexity
LCEVC (Low Complexity Enhancement Video Coding) works by encoding a lower-resolution version of a source image with any existing codec (the base codec), then encoding the difference between the reconstructed lower-resolution image and the source with a different compression method (the enhancement).
The remaining details that make up the difference with the source are compressed efficiently and rapidly by LCEVC, which uses tools designed specifically for residual data. The LCEVC enhancement compresses residual information on at least two layers: one at the resolution of the base, to correct artefacts caused by the base encoding process, and one at the source resolution, to add the details needed to reconstruct the output frames. Between the two reconstructions the picture is upscaled using either a normative upsampler or a custom one specified by the encoder in the bitstream. LCEVC also performs some non-linear operations called residual prediction, which further improve the reconstruction before the residuals are added. Together, these steps produce a low-complexity, content-adaptive (i.e. encoder-driven) upscaling.
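The two-layer structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the normative LCEVC process: the base codec is simulated by coarse quantization, the upsampler is plain nearest-neighbour repetition, residual prediction is omitted, residuals are left uncompressed, and every function name (downsample, lossy_base_codec, encode, decode and so on) is hypothetical.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (integer average)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img):
    """Double resolution by pixel repetition (stand-in for a real upsampler)."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def lossy_base_codec(img, step=8):
    """Stand-in for any base codec: coarse quantization introduces artefacts."""
    return [[(p // step) * step for p in row] for row in img]


def subtract(a, b):
    return [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode(source):
    low = downsample(source)
    base = lossy_base_codec(low)
    layer1 = subtract(low, base)            # base resolution: corrects base artefacts
    predicted = upsample(add(base, layer1))
    layer2 = subtract(source, predicted)    # source resolution: adds fine detail
    return base, layer1, layer2


def decode(base, layer1, layer2):
    corrected = add(base, layer1)
    return add(upsample(corrected), layer2)


source = [
    [10, 12, 200, 202],
    [11, 13, 201, 203],
    [50, 52, 90, 94],
    [51, 53, 91, 95],
]
base, layer1, layer2 = encode(source)
reconstructed = decode(base, layer1, layer2)
```

Because the residuals in this sketch are not quantized, the decoder recovers the source exactly; a real encoder would transform, quantize and entropy-code both layers to trade fidelity against bitrate.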
LCEVC is an enhancement codec, meaning that it does not just upsample well: it will also encode the residual information necessary for true fidelity to the source and compress it (transforming, quantizing and coding it). LCEVC can also produce mathematically lossless reconstructions, meaning all of the information can be encoded and transmitted and the image perfectly reconstructed. Creator’s intent, small text, logos, ads and unpredictable high-resolution details are preserved with LCEVC.
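The transform and quantization steps mentioned above can be illustrated on a single 2x2 block of residuals. The sketch below assumes a small Hadamard-style transform and a uniform quantizer with truncation toward zero; the exact kernels, quantizer and entropy coder defined by the standard are not reproduced here, and the function names are hypothetical. With a step size of 1 nothing is discarded, which is the mathematically lossless case.

```python
def forward_2x2(block):
    """Hadamard-style transform of a 2x2 residual block.

    block is (a, b, c, d) = (top-left, top-right, bottom-left, bottom-right).
    Returns (average, horizontal, vertical, diagonal) components, unnormalised.
    """
    a, b, c, d = block
    return (a + b + c + d, a - b + c - d, a + b - c - d, a - b - c + d)


def inverse_2x2(coeffs):
    """Inverse of forward_2x2; exact whenever the coefficients are unmodified."""
    A, H, V, D = coeffs
    return ((A + H + V + D) // 4,
            (A - H + V - D) // 4,
            (A + H - V - D) // 4,
            (A - H - V + D) // 4)


def quantize(coeffs, step):
    """Uniform quantizer, truncating toward zero so small values become 0."""
    return tuple(int(c / step) for c in coeffs)


def dequantize(levels, step):
    return tuple(q * step for q in levels)


residuals = (3, -1, 0, 2)
coeffs = forward_2x2(residuals)

# Step size 1: every coefficient survives, so reconstruction is exact.
lossless = inverse_2x2(dequantize(quantize(coeffs, 1), 1))

# Step size 4: small coefficients collapse to zero and detail is lost.
lossy = inverse_2x2(dequantize(quantize(coeffs, 4), 4))
```

In this example the lossless path returns the original residuals, while the coarser step zeroes two of the four coefficients and leaves an error of one level per sample. Zeroed coefficients are what an entropy coder would then represent cheaply.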
The examples below (magnified to better show the impact on fidelity of small details) illustrate the difference between the source input, the decoded base at half resolution, the same decoded base upsampled with state-of-the-art Lanczos upsampling, and the LCEVC-enhanced full-resolution reconstruction. Notice, for example in the number “8” or the letter “R”, that the LCEVC-enhanced reconstruction includes high-frequency details that could not be inferred from just decoding and upsampling the lower-resolution encode, however smart and complex the upscaling method.
Figure 1. Source
Figure 2. Half-resolution decoded base
Figure 3. Base upsampled to full resolution with FFmpeg Lanczos
Figure 4. Full resolution LCEVC-enhanced output with typical large-scale distribution bitrates
Thanks to residual sub-layers, LCEVC uniquely combines the world of smart upsampling/super-resolution with the world of traditional coding: for the areas where smart upscaling is enough for high-fidelity reconstruction, LCEVC does not need to transmit residual data; for the areas where smart upscaling fails, LCEVC allows the encoder to efficiently send the corrections that reconstruct fidelity to the source.
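The idea of sending corrections only where upscaling falls short can be sketched as a per-block decision. The code below is an illustrative assumption, not the mechanism defined by the standard: the threshold, the 2x2 block size and the function names select_residuals and apply_residuals are all hypothetical, and a real encoder would reach a similar sparsity through its own quantization and rate decisions.

```python
def select_residuals(source, predicted, threshold=2, block=2):
    """Return {(y, x): residual block} only for blocks the prediction gets wrong."""
    sent = {}
    for y in range(0, len(source), block):
        for x in range(0, len(source[0]), block):
            res = [[source[y + dy][x + dx] - predicted[y + dy][x + dx]
                    for dx in range(block)]
                   for dy in range(block)]
            # Skip blocks where the upscaled prediction is already close enough.
            if max(abs(r) for row in res for r in row) > threshold:
                sent[(y, x)] = res
    return sent


def apply_residuals(predicted, sent):
    """Decoder side: add the transmitted corrections to the prediction."""
    out = [row[:] for row in predicted]
    for (y, x), res in sent.items():
        for dy, row in enumerate(res):
            for dx, r in enumerate(row):
                out[y + dy][x + dx] += r
    return out


# A flat upscaled prediction, and a source with fine detail in one corner only.
predicted = [[100] * 4 for _ in range(4)]
source = [
    [100, 101, 100, 100],
    [100, 100, 99, 100],
    [100, 100, 180, 20],
    [100, 100, 20, 180],
]
sent = select_residuals(source, predicted)
restored = apply_residuals(predicted, sent)
```

Only one of the four blocks carries residual data here: the three blocks where the prediction is within the threshold cost nothing, while the detailed block is restored exactly.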
The LCEVC tools have been designed to efficiently compress the sparse residual information left by the picture decomposition and recomposition process. In particular, LCEVC’s low-complexity requirement meant that the tool definition process took into account the graphics-processing hardware acceleration already available in existing chipsets. As a result, the tools are highly amenable to optimised implementations (e.g. using SIMD CPU, GPU and heterogeneous parallel processing).
LCEVC adds a degree of freedom for implementations: they can exploit the bitrate allocation and tool calibration of two separate coding schemes (the base and the enhancement) to achieve greater efficiency than either scheme delivers on its own. Overall rate control accuracy can also be increased, with consequent benefits for real-time, low-latency use cases.
The LCEVC specification also enables a stream to signal adaptive dithering post-processing (which reduces banding and aliasing impairments) and provides a platform for future enhancement extensions. It allows the upsampling kernels to be customised and user data to be included within the bitstream at a transform block level. This lets evolving upsampling/super-resolution techniques and image manipulation be adopted within the standard, while still offering an efficient method of encoding residual data and up to mathematically lossless image reconstruction.
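To make the notion of a customisable upsampling kernel concrete, the sketch below doubles the resolution of an image with a separable 2-tap integer kernel whose weights are passed in as a parameter. It is only an illustration under stated assumptions: the tap values, the 2-tap length, the edge clamping and the names upsample_row and upsample_image are hypothetical, and the way the standard actually signals kernel coefficients is not modelled.

```python
def upsample_row(row, kernel=(3, 1)):
    """Double the length of `row` with a 2-tap kernel (near weight, far weight).

    Each input sample produces two outputs: one blended with its left
    neighbour and one with its right neighbour. Edges are clamped.
    """
    near, far = kernel
    norm = near + far
    last = len(row) - 1
    out = []
    for i, centre in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, last)]
        # Integer arithmetic with rounding, as is common in fixed-point filters.
        out.append((near * centre + far * left + norm // 2) // norm)
        out.append((near * centre + far * right + norm // 2) // norm)
    return out


def upsample_image(img, kernel=(3, 1)):
    """Apply the 1-D kernel horizontally, then vertically (separable filter)."""
    rows = [upsample_row(r, kernel) for r in img]
    cols = [upsample_row(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


linear = upsample_row([0, 8], kernel=(3, 1))    # smooth ramp between samples
nearest = upsample_row([0, 8], kernel=(1, 0))   # plain sample repetition
image = upsample_image([[0, 8]], kernel=(3, 1))
```

Swapping the kernel tuple changes the character of the upscaled picture without touching the rest of the decoder, which is the kind of flexibility a signalled kernel is meant to provide.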