CVPR 2024

3D Multi-frame Fusion for Video Stabilization

University of Computer Vision, Tech Institute AI Lab
*Equal Contribution

Teaser Video

Demonstrating robust 3D stabilization across dynamic scenes


Abstract

Video stabilization remains a challenging problem, particularly for handheld footage with large motion and dynamic content. Traditional 2D methods often suffer from distortion and cropping, while existing 3D methods struggle with robustness in complex environments. We present a novel 3D Multi-frame Fusion framework that leverages scene geometry and temporal consistency to generate high-quality stabilized video. By integrating a dense depth prior with a multi-frame optimization strategy, our method effectively decouples camera motion from scene dynamics. We introduce a differentiable warping module that synthesizes full-frame stabilized views by fusing information from adjacent frames, significantly reducing the need for cropping. Extensive experiments on public benchmarks demonstrate that our approach outperforms state-of-the-art methods in both stability and visual quality metrics.

Methodology

Input Frames → 3D Motion Estimation → Multi-frame Fusion Module → Stabilized Output

Our pipeline consists of three main stages: (1) Robust 3D Trajectory Estimation using a learned depth prior to handle dynamic objects; (2) Optimal Path Planning that smooths the camera trajectory while respecting field-of-view constraints; and (3) Neural Multi-frame Rendering, which fuses pixels from temporal neighbors to fill in missing regions caused by stabilization warping, ensuring full-frame output without cropping.
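To make stages (2) and (3) more concrete, here is a minimal sketch in plain Python. It is not our released implementation, and every name in it (`smooth_path`, `fuse_frames`, `max_offset`, the use of `None` to mark holes) is a hypothetical choice made for illustration. It assumes the camera path is reduced to a single parameter per frame (for example yaw), replaces the path optimization with a Gaussian filter plus a clamp that stands in for the field-of-view constraint, and replaces the neural renderer with a weighted average over neighbor frames that are assumed to be already warped into the stabilized view.

```python
import math


def smooth_path(path, sigma=2.0, radius=5, max_offset=None):
    """Gaussian-smooth a 1D camera parameter track (e.g. yaw per frame).

    Assumption: a Gaussian filter stands in for the path optimization.
    """
    n = len(path)
    smoothed = []
    for t in range(n):
        num = 0.0
        den = 0.0
        for k in range(-radius, radius + 1):
            s = min(max(t + k, 0), n - 1)  # replicate the endpoints
            w = math.exp(-(k * k) / (2.0 * sigma * sigma))
            num += w * path[s]
            den += w
        value = num / den
        if max_offset is not None:
            # Crude proxy for a field-of-view constraint: the smoothed
            # path may not drift further than max_offset from the input.
            lo, hi = path[t] - max_offset, path[t] + max_offset
            value = min(max(value, lo), hi)
        smoothed.append(value)
    return smoothed


def fuse_frames(warped, weights):
    """Fill holes by averaging valid pixels across temporal neighbors.

    Assumption: `warped` holds grayscale frames (lists of rows) already
    warped into the stabilized view, with None marking pixels that the
    warp left empty. A weighted average stands in for neural rendering.
    """
    height, width = len(warped[0]), len(warped[0][0])
    fused = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            num = 0.0
            den = 0.0
            for frame, w in zip(warped, weights):
                value = frame[y][x]
                if value is not None:
                    num += w * value
                    den += w
            if den > 0.0:
                fused[y][x] = num / den  # stays None if no frame covers it
    return fused


if __name__ == "__main__":
    shaky_yaw = [0.0, 10.0, 0.0, 10.0, 0.0, 10.0]
    print(smooth_path(shaky_yaw, max_offset=3.0))

    center = [[1.0, None]]    # the warp left a hole in the second pixel
    neighbor = [[3.0, 5.0]]   # a temporal neighbor covers that hole
    print(fuse_frames([center, neighbor], weights=[1.0, 1.0]))
```

In this toy setting, a pixel that is missing in the center frame takes its value from whichever neighbors cover it, which is the intuition behind producing a full-frame result without cropping. A real system would operate on full 3D camera poses and learned features rather than a scalar track and raw intensities.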

Visual Comparison

Comparison of the shaky input against our stabilized result.

Ours (Stabilized)
Input (Shaky)

Qualitative Results

Walking Scene (Low Light)
Running Scene (Dynamic Objects)
Panning Shot (Parallax)

Citation

@inproceedings{johnson20243dmultiframe,
  title={3D Multi-frame Fusion for Video Stabilization},
  author={Johnson, Alex and Chen, Sarah and Roberts, Michael and Davis, Emily and Zhang, David},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}