VRVideo: A Flexible Pipeline for VR Video Creation

Anthony Dickson¹
Jeremy Shanks¹
Jonathan Ventura²
Alistair Knott³
Stefanie Zollmann¹

¹University of Otago, Dunedin, New Zealand
²California Polytechnic State University, San Luis Obispo, CA, USA
³Victoria University of Wellington, Wellington, New Zealand


Abstract

Recent advances in NeRF-based methods have enabled high-fidelity novel view synthesis for video with dynamic elements. However, these methods often require expensive hardware, take days to process a second-long video, and do not scale well to longer videos. We present an end-to-end pipeline for creating dynamic 3D video from monocular video that runs on consumer hardware in minutes per second of footage rather than days. Our pipeline handles the estimation of camera parameters and depth maps, the 3D reconstruction of dynamic foreground and static background elements, and the rendering of the 3D video on a desktop computer or VR headset. We use a state-of-the-art vision transformer model to estimate depth maps, which we use to scale the COLMAP poses and to enable RGB-D fusion with the estimated depth data. In our preliminary experiments, we rendered the output in a VR headset and visually compared our method against ground-truth datasets and state-of-the-art NeRF-based methods.
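
To make the scale-alignment step concrete, the sketch below is a minimal illustration under our own assumptions: the function names are hypothetical and the median-ratio heuristic is one common way to align a COLMAP reconstruction (which has arbitrary scale) with estimated depth maps, not necessarily the exact procedure used in the pipeline. Once the poses are rescaled to the depth maps' scale, they can be passed together with the RGB-D frames to a standard fusion backend.

```python
import numpy as np

def estimate_scale(colmap_depths: np.ndarray,
                   estimated_depths: np.ndarray) -> float:
    """Estimate a global scale factor mapping COLMAP's arbitrary-scale
    reconstruction onto the scale of the estimated depth maps.

    colmap_depths:    depths of sparse COLMAP points projected into a frame
    estimated_depths: depths predicted by the depth model, sampled at the
                      same pixel locations as the projected sparse points
    """
    # Median of per-point ratios is robust to outliers in either source.
    ratios = estimated_depths / np.clip(colmap_depths, 1e-6, None)
    return float(np.median(ratios))

def rescale_pose(pose_c2w: np.ndarray, scale: float) -> np.ndarray:
    """Apply the scale factor to the translation of a 4x4 camera-to-world pose,
    leaving the rotation untouched."""
    scaled = pose_c2w.copy()
    scaled[:3, 3] *= scale
    return scaled
```

For example, applying `rescale_pose(pose, estimate_scale(sparse_d, pred_d))` to every frame yields camera trajectories that are consistent with the estimated depth maps, which is what allows the subsequent RGB-D fusion of the static background to use those depths directly.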

WebXR examples

The WebXR viewer is best experienced in Google Chrome on desktop or on a compatible VR headset. The examples still work on desktop even if the page reports "WEBXR NOT AVAILABLE". The scene may take a while to load on machines with less powerful graphics cards.