Spatial Scalability with AV1: A Comparison between Scalable AV1 and MPEG-5 LCEVC in Video Quality and Complexity


Rising demand for video streaming calls for scalable delivery methods that improve the user experience while reducing bitrate and storage for applications that require multiple versions of the same video. In broadcasting, spatial scalability eliminates the need for simulcasting by serving both HD and UHD users with a single stream. The same concept extends to supporting varied devices and networks in multi-party conferencing, and finds applications in augmented adaptive streaming, scalable video messaging, low-latency pixel streaming, and cloud gaming. Moreover, multi-layer encoding can enhance the user experience by adding features such as HDR on top of an SDR base, or bit-depth scalability.

This paper compares two approaches to AV1 spatial scalability, Scalable Video Coding AV1 (AV1-SVC) and SVT-AV1 enhanced with MPEG-5 Part 2 LCEVC (Low Complexity Enhancement Video Coding) [1], and explores the pros and cons of each.

Scalable video encoding is known to incur higher encoding complexity and lower coding efficiency than Single Layer (SL) encoding. The study compares SL UHD AV1 encodings produced with two base codecs (AMD xAV1 and the SVT-AV1 software encoder), along with their respective SL HD renditions (upscaled to UHD), against two scalable HD+UHD AV1 encodings: SVC AV1 and LCEVC-enhanced AV1.

Key areas of assessment are:

• Objective quality metrics (VMAF, VMAF_NEG, PSNR),

• Rate-distortion (RD) curves,

• Visual quality assessment, e.g., upscaled base vs. enhanced,

• Encoding complexity.
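
As a point of reference for the objective metrics listed above, PSNR is the simplest of the three and can be sketched in a few lines. The snippet below is a minimal illustration, not the measurement pipeline used in the paper: the function name `psnr`, the flat-list representation of samples, and the 8-bit peak value of 255 are all assumptions made for the example.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Return PSNR in dB between two equally sized sequences of samples.

    Hypothetical helper: samples are plain numbers (e.g. luma values),
    and max_value is the assumed peak sample value (255 for 8-bit video).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10 * math.log10(max_value ** 2 / mse)
```

In a comparison like the one described here, such a function would be applied to the UHD source against each decoded output (for example, the upscaled HD base and the enhanced UHD layer), and the per-bitrate results plotted as RD curves.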

