Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image

Jiajing Lin, Zhenzhong Wang, Shu Jiang, Yongjie Hou, Min Jiang*
School of Informatics, Xiamen University

Abstract

The task of 4D content generation involves creating dynamic 3D models that evolve over time in response to specific input conditions, such as images. Existing methods rely heavily on pre-trained video diffusion models to guide 4D content dynamics, but these approaches often fail to capture essential physical principles, as video diffusion models lack a robust understanding of real-world physics. Moreover, these models struggle to provide fine-grained control over dynamics and exhibit high computational costs. In this work, we propose Phys4DGen, a novel, high-efficiency framework that generates physics-compliant 4D content from a single image with enhanced control capabilities. Our approach uniquely integrates physical simulation into the 4D generation pipeline, ensuring adherence to fundamental physical laws. Inspired by the human ability to infer physical properties visually, we introduce a Physical Perception Module (PPM) that discerns the material properties and structural components of the 3D object from the input image, facilitating accurate downstream simulation. Phys4DGen significantly accelerates the 4D generation process by eliminating iterative optimization steps in the dynamics modeling phase. It allows users to intuitively control the movement speed and direction of generated 4D content by adjusting external forces, achieving finely tunable, physically plausible animations. Extensive evaluations show that Phys4DGen outperforms existing methods in both inference speed and physical realism, producing high-quality, controllable 4D content.

Pipeline

The framework comprises three stages: 3D Gaussians Generation, Physical Perception, and 4D Dynamics Generation. In the 3D Gaussians Generation stage, a static 3D Gaussian representation is generated from the input image under the guidance of a diffusion model. In the Physical Perception stage, the 3D Gaussians are segmented into parts, and each part is assigned a material type and the corresponding material properties. In the 4D Dynamics Generation stage, we treat each 3D Gaussian kernel as a particle within a continuum and employ the Material Point Method (MPM) to add dynamics to the static 3D Gaussians. Users can guide the MPM simulator toward their desired outcomes by adjusting the external forces, as illustrated in the sketch below.
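To illustrate how the three stages connect, the following Python sketch mirrors the pipeline with hypothetical function names (generate_3d_gaussians, perceive_physics, simulate_dynamics). It is not the authors' implementation: the diffusion-guided 3D generation and the Physical Perception Module are replaced by placeholder stand-ins, and the MPM solver is replaced by a toy particle integrator so the example stays self-contained and runnable.

# Minimal sketch of the three-stage pipeline (hypothetical names; not the
# authors' code). The MPM solver is replaced by a toy explicit particle
# integrator so the example stays self-contained and runnable.
import numpy as np

def generate_3d_gaussians(num_points: int = 1024) -> dict:
    """Stage 1 (stand-in): return static 3D Gaussian kernels.

    In Phys4DGen this comes from image-conditioned 3D generation guided by a
    diffusion model; here we just sample random centers for illustration.
    """
    rng = np.random.default_rng(0)
    return {
        "centers": rng.uniform(-0.5, 0.5, size=(num_points, 3)),
        "velocities": np.zeros((num_points, 3)),
    }

def perceive_physics(gaussians: dict) -> np.ndarray:
    """Stage 2 (stand-in for the Physical Perception Module).

    In the paper, the kernels are segmented into parts and assigned material
    types and properties; here we assign a single uniform per-particle mass.
    """
    num_points = gaussians["centers"].shape[0]
    return np.full(num_points, 1.0)  # per-particle mass

def simulate_dynamics(gaussians, mass, external_force, dt=1e-2, steps=60):
    """Stage 3 (stand-in for the MPM simulator).

    Treats each Gaussian kernel as a particle and integrates it forward under
    gravity plus a user-supplied external force (e.g. wind), yielding one set
    of displaced centers per frame.
    """
    gravity = np.array([0.0, -9.8, 0.0])
    x = gaussians["centers"].copy()
    v = gaussians["velocities"].copy()
    frames = []
    for _ in range(steps):
        accel = gravity + external_force / mass[:, None]
        v = v + dt * accel
        x = x + dt * v
        frames.append(x.copy())
    return frames

if __name__ == "__main__":
    gaussians = generate_3d_gaussians()
    mass = perceive_physics(gaussians)
    # Users steer the motion by adjusting the external force (here, wind in +x).
    wind = np.array([2.0, 0.0, 0.0])
    frames = simulate_dynamics(gaussians, mass, wind)
    print(f"Generated {len(frames)} frames of 4D dynamics.")

In this sketch, changing the external_force vector plays the role of the user-controlled external forces described above, which govern the speed and direction of the generated motion.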

Visual Results

We showcase various visual results generated by Phys4DGen.


A balloon tied to a wooden block sways in the wind.

An orange rose sways in the wind.

A snowman is melting.

A yellow toy duck being pressed.

BibTeX


@article{lin2024phys4dgen,
  title={Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image},
  author={Lin, Jiajing and Wang, Zhenzhong and Jiang, Shu and Hou, Yongjie and Jiang, Min},
  journal={arXiv preprint arXiv:2411.16800},
  year={2024}
}