- [2024.04] I am honored to receive the Stanford Graduate Fellowship Award.
- [2024.03] I will be joining Stanford University as a CS PhD student.
* indicates equal contributions, † indicates equal advising.
|
The Scene Language: Representing Scenes with Programs, Words, and Embeddings
Yunzhi Zhang,
Zizhang Li,
Matt Zhou,
Shangzhe Wu,
Jiajun Wu
CVPR, 2025,
Highlight
Project page /
arXiv /
Code
The Scene Language is a visual scene representation that concisely and precisely describes the structure, semantics, and identity of visual scenes.
It represents a scene with three key components: a program that specifies the hierarchical and relational structure of entities in the scene,
words in natural language that summarize the semantic class of each entity, and embeddings that capture the visual identity of each entity.
|
|
Learning the 3D Fauna of the Web
Zizhang Li*,
Dor Litvak*,
Ruining Li,
Yunzhi Zhang,
Tomas Jakab,
Christian Rupprecht,
Shangzhe Wu†,
Andrea Vedaldi†,
Jiajun Wu†
CVPR, 2024
Project page /
arXiv /
Code /
Video /
Demo
3D-Fauna learns a pan-category deformable 3D model of more than 100 different animal species using only 2D Internet images as training data, without any prior shape models or keypoint annotations. At test time, the model can turn a single image of a quadruped instance into an articulated, textured 3D mesh in a feed-forward manner, ready for animation and rendering.
|
|
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
Kyle Sargent,
Zizhang Li,
Tanmay Shah,
Charles Herrmann,
Hong-Xing Yu,
Yunzhi Zhang,
Eric Ryan Chan,
Dmitry Lagun,
Li Fei-Fei,
Deqing Sun,
Jiajun Wu
CVPR, 2024
Project page /
arXiv /
Code
We train a 3D-aware diffusion model, ZeroNVS, on a mixture of scene data sources that capture object-centric, indoor, and outdoor scenes.
This enables zero-shot SDS distillation of 360-degree NeRF scenes from a single image.
Our model sets a new state-of-the-art result in LPIPS on the DTU dataset in the zero-shot setting.
We also use the MipNeRF-360 dataset as a benchmark for single-image NVS.
|
|
Learning Part Segmentation through Unsupervised Domain Adaptation from Synthetic Vehicles
Qing Liu,
Adam Kortylewski,
Zhishuai Zhang,
Zizhang Li,
Mengqi Guo,
Qihao Liu,
Xiaoding Yuan,
Jiteng Mu,
Weichao Qiu,
Alan Yuille
CVPR, 2022,
Oral
arXiv /
Code
We construct a synthetic multi-part dataset with different categories of objects,
evaluate different part segmentation UDA methods with this benchmark, and also provide an improved baseline.
|
- Reviewer for 3DV, AAAI, BMVC, CAI, CVPR, ECCV, ICCV, ICLR, ICML, NeurIPS.
- OpenReview Chair of 3DV 2025.
|