4D content generation aims to create dynamically evolving 3D content conditioned on inputs such as images or 3D representations. Current approaches typically animate 3D representations with physical priors, but they suffer from two significant limitations: they require users without physics expertise to manually specify material properties, and they struggle to handle multi-material composite objects. To address these challenges, we propose Phys4DGen, a novel 4D generation framework that integrates multi-material composition perception with physical simulation. The framework achieves automated, physically plausible 4D generation through three modules: first, the 3D Material Grouping module partitions heterogeneous material regions on the surfaces of 3D representations via semantic segmentation; second, the Internal Physical Structure Discovery module constructs the mechanical structure of object interiors; finally, we distill physical priors from multimodal large language models to enable rapid, automatic material property identification for both the surfaces and interiors of objects. Experiments on both synthetic and real-world datasets demonstrate that Phys4DGen generates high-fidelity, physically realistic 4D content in open-world scenarios, significantly outperforming state-of-the-art methods.
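To make the three-stage pipeline above concrete, here is a minimal, illustrative Python sketch of how material grouping, internal structure discovery, and material property identification could fit together. All names (`MaterialGroup`, `group_surface_materials`, the property lookup table, etc.) are our assumptions for illustration, not the released Phys4DGen API; the trivial height-based split and fixed lookup stand in for the semantic segmentation and MLLM distillation used in the paper.

```python
# Hypothetical sketch of the three-stage pipeline; names and logic are
# illustrative assumptions, not the authors' actual implementation.
from dataclasses import dataclass, field

@dataclass
class MaterialGroup:
    name: str                                       # e.g. "petal", "stem"
    point_ids: list[int]                            # particles assigned to this group
    properties: dict = field(default_factory=dict)  # e.g. {"young_modulus": ...}

def group_surface_materials(points):
    """Stage 1 (3D Material Grouping): partition surface points into
    heterogeneous material regions. A trivial split by height stands in
    for the semantic segmentation used in the paper."""
    upper = [i for i, p in enumerate(points) if p[2] >= 0.5]
    lower = [i for i, p in enumerate(points) if p[2] < 0.5]
    return [MaterialGroup("upper", upper), MaterialGroup("lower", lower)]

def discover_internal_structure(points, groups):
    """Stage 2 (Internal Physical Structure Discovery): fill the object
    interior with particles and attach each to a surface group, giving
    the simulator a volumetric mechanical structure."""
    interior = [(0.5, 0.5, z / 10.0) for z in range(10)]  # toy interior fill
    for i, p in enumerate(interior):
        target = groups[0] if p[2] >= 0.5 else groups[1]
        target.point_ids.append(len(points) + i)
    return points + interior

def identify_material_properties(groups):
    """Stage 3: assign physical properties per group. The paper distills
    these from a multimodal LLM; a fixed lookup table stands in here."""
    table = {"upper": {"material": "elastic", "young_modulus": 2e5},
             "lower": {"material": "rigid",   "young_modulus": 1e8}}
    for g in groups:
        g.properties = table.get(g.name, {"material": "elastic"})
    return groups

if __name__ == "__main__":
    surface = [(0.5, 0.5, 0.2), (0.5, 0.5, 0.8)]  # toy surface point cloud
    groups = group_surface_materials(surface)
    all_points = discover_internal_structure(surface, groups)
    groups = identify_material_properties(groups)
    for g in groups:
        print(g.name, len(g.point_ids), g.properties)
    # The grouped, property-annotated particles would then drive a
    # physical simulation to produce the final 4D animation.
```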
We visualize the perception results of Phys4DGen.
We provide rendered videos of 4D content generated from single images.
We provide rendered videos of 4D content generated from real-world static scenes in PhysDreamer, including alocasia, carnations, telephone, and hat.
@article{lin2024phys4dgen,
  title={Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image},
  author={Lin, Jiajing and Wang, Zhenzhong and Jiang, Shu and Hou, Yongjie and Jiang, Min},
  journal={arXiv preprint arXiv:2411.16800},
  year={2024}
}