“MPEG Immersive Video Coding Standard”, Proceedings of the IEEE, 2021

by Jill M. Boyce, Renaud Doré, Adrian Dziembowski, Julien Fleureau, Joel Jung, Bart Kroon, Basel Salahieh, Vinod Kumar Malamal Vadakital, Lu Yu

Abstract: This article introduces the ISO/IEC MPEG Immersive Video (MIV) standard, MPEG-I Part 12, which is undergoing standardization. The draft MIV standard provides support for viewing immersive volumetric content captured by multiple cameras with six degrees of freedom (6DoF) within a viewing space that is determined by the camera arrangement in the capture rig. The bitstream format and decoding processes of the draft specification along with aspects of the Test Model for Immersive Video (TMIV) reference software encoder, decoder, and renderer are described. The use cases, test conditions, quality assessment methods, and experimental results are provided. In the TMIV, multiple texture and geometry views are coded as atlases of patches using a legacy 2-D video codec, while optimizing for bitrate, pixel rate, and quality. The design of the bitstream format and decoder is based on the visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC) standard, MPEG-I Part 5.

DOI: 10.1109/JPROC.2021.3062590

“Video Based Coding of Volumetric Data”, IEEE International Conference on Image Processing (ICIP), October 2020

by Danillo B. Graziosi & Bart Kroon

Abstract: New standards are emerging for the coding of volumetric 3D data such as immersive video and point clouds. Some of these volumetric encoders similarly utilize video codecs as the core of their compression approach, but apply different techniques to convert volumetric 3D data into 2D content for subsequent 2D video compression. Currently in MPEG there are two activities that follow this paradigm: ISO/IEC 23090-5 Video-based Point Cloud Compression (V-PCC) and ISO/IEC 23090-12 MPEG Immersive Video (MIV). In this article we propose for both standards to define 2D projection as common transmission format. We then describe a procedure based on camera projections that is applicable to both standards to convert 3D information into 2D images for efficient 2D compression. Results show that our approach successfully encodes both point clouds and immersive video content with the same performance as the current test models that MPEG experts developed separately for the respective standards. We conclude the article by discussing further integration steps and future directions.

DOI: 10.1109/ICIP40778.2020.9190689

“An Immersive Video Experience with Real-Time View Synthesis Leveraging the Upcoming MIV Distribution Standard”, International Conference on Multimedia & Expo Workshops, July 2020

by Julien Fleureau, Bertrand Chupeau, Franck Thudor, Gérard Briand, Thierry Tapie, Renaud Doré

Abstract: The upcoming MPEG Immersive Video (MIV) standard will enable storage and distribution of immersive video content over existing and future networks, for playback with 6 full or partial degrees of freedom of view position and orientation. The demo showcases a VOD server streaming MIV encoded immersive video contents up to a decoding client where an HMD is connected. The user can perceive the parallax as he moves his head (translation, rotation) when seated. The core contribution of the demo is a real-time GPU implementation of the virtual view synthesis at decoder side. An enhanced quality of rendered views is attained through a weighting strategy of source views contributions which leverages standardized metadata conveying information on the pruning decisions at encoder side.

DOI: 10.1109/ICMEW46912.2020.9105948

“Understanding MPEG-I Coding Standardization in Immersive VR/AR Applications”, SMPTE Motion Imaging Journal, November 2019

by Gauthier Lafruit, Daniele Bonatto, Christian Tulvan, Marius Preda, Lu Yu

Vol. 128, No. 10, pp. 33-39, ISSN: 1545-0279.

Abstract: After decennia of developing leading-edge 2D video compression technologies, the Moving Picture Expert Group (MPEG) is currently working on the new era of coding for immersive applications, referred to as MPEG-I, where “I” refers to the “Immersive” aspects. It ranges from 360° video with head-mounted displays to free navigation in 3D space with head-mounted and 3D light field displays. Two families of coding approaches, covering typical industrial workflows, are currently considered for standardization—MultiView + Depth (MVD) Video Coding and Point Cloud Coding—both supporting high-quality rendering at bitrates of up to a couple of hundreds of megabits per second. This paper provides a technical/historical overview of the acquisition, coding, and rendering technologies considered in the MPEG-I standardization activities.

DOI: 10.5594/JMI.2019.2941362

“Standardization Status of Immersive Video Coding”, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, February 2019

by Matthias Wien, Jill Boyce, Thomas Stockhammer, Wen-Hsiao Peng

Abstract: Based on increasing availability of capture and display devices dedicated to immersive media, coding and transmission of these media has recently become a highest-priority subject of standardization. Different levels of immersiveness are defined with respect to an increasing degree of freedom in terms of movements of the observer within the immersive media scene. The level ranges from three degrees of freedom (3DoF) allowing the user to look around in all directions from a fixed point of view to six degrees of freedom (6DoF), where the user can freely alter the viewpoint within the immersive media scene. The Moving Pictures Experts Group (MPEG) of ISO/IEC is developing a standards suite on “Coded Representation of Immersive Media,” called MPEG-I, to provide technical solutions for building blocks of the media transmission chain, ranging from architecture, systems tools, coding of video and audio signals, to point clouds and timed text. In this paper, an overview on recent and ongoing standardization efforts in this area is presented. While some specifications, such as High Efficiency Video Coding (HEVC) or version 1 of the Omnidirectional Media Format (OMAF), are already available, other activities are under development or in the exploration phase. This paper addresses the status of these efforts with a focus on video signals, indicates the development timelines, summarizes the main technical details and provides pointers to further points of reference.

DOI: 10.1109/JETCAS.2019.2898948