Peng Wang

Senior Staff Research Scientist (Tech Lead) at Bytedance, USA

I obtained my PhD from the University of California, Los Angeles, under the supervision of Prof. Alan Yuille, and received my M.S. and B.S. from Peking University. My research interests lie at the intersection of Computer Vision and Machine Learning, such as learning 3D representations, learning neural architectures, and mining informative instances from large image/video datasets. I upload some research thoughts about how to learn a visual system in an unsupervised manner. During my studies, I collaborated with many well-recognized researchers, including Dr. Kevin Murphy and Dr. Sergio Guadarrama at Google Research; Dr. Xiaohui Shen, Dr. Zhe Lin, Dr. Scott Cohen, and Dr. Brian Price at Adobe Research; Dr. Wei Xu and Dr. Ruigang Yang at Baidu; and Dr. Jingdong Wang at MSRA.

At Bytedance, we have delivered technology to many products, such as Doubao, Jianying / CapCut, and Douyin / TikTok.

pengwangpku2012 [at] gmail [dot] com
Google Scholar
Github
Linkedin

News

[Mar, 2025] SeedEdit v1.6 launched on Doubao and Jimeng; better than Gemini Flash at editing high-quality images.
[Jan, 2025] Two papers to ICLR 2025, two to CVPR 2025.
[May, 2024] Three papers to CVPR 2024, one to ICLR 2024.
[Oct, 2023] One paper to 3DV 2024. ImageDream code released.
[Oct, 2023] One paper to NeurIPS 2023. MVDream diffusion and reconstruction code released.
[Feb, 2023] One paper to ICLR 2023 and one to CVPR 2023.
[Jun, 2022] One paper to CVPR 2022 and one to ECCV 2022.

[Feb, 2022] Three product features delivered to Douyin and Jianying.
[Feb, 2021] One paper to CVPR 2021, one paper to ACCV 2020.
[Feb, 2020] One paper to ICRA 2020, one paper to CVPR 2020.
[Jul, 2019] Three papers to TPAMI 2019.
[Jul, 2019] Two papers to AAAI 2020.
[Jul, 2019] Three papers to TPAMI 2019 and one to BMVC 2019.
[Feb, 2019] Two papers are accepted to CVPR 2019.
[Jun, 2018] We are holding an Autonomous Driving Workshop at ECCV 2018/CVPR 2019.
[May, 2018] One ECCV 2018 (No.1 at KITTI stereo) & BMVC paper accepted.
[Feb, 2018] Five papers to CVPR 2018.
[Jul, 2017] One paper to AAAI 2018.
[Jul, 2017] One paper to CVPR 2017; the PASCAL human part and keypoints dataset is released.

Products

Selected products, where I lead the effort in algorithm design, development and implementation (linked to some demo videos)

SeedEdit [Increased >10M Active Users across all our Apps]

3D photo zoom effect [3M DAU increase; helped CapCut rank #1 among free iOS apps]

Cyberpunk photo effect [Top effect in multiple countries, e.g., JP/KR]

AR City Effect [5M users on Douyin (TikTok in China)] Public Tech Blog

Virtual Object AR Attachment [First 3D virtual effect in TikTok]

SkyAR Effect

Papers

Representative papers are highlighted; ⁺ indicates team leader, * indicates equal contribution.

2025

Dual Diffusion for Unified Image Generation and Understanding
Zijie Li*, Henry Li*, Yichun Shi, Amir Barati Farimani, Yuval Kluger, Linjie Yang, Peng Wang⁺
CVPR, 2025. project

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model
Xiaoding Yuan, Shitao Tang, Kejie Li, Alan Yuille, Peng Wang⁺
CVPR, 2025.

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Mude Hui*, Siwei Yang*, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang⁺, Yuyin Zhou, Cihang Xie
ICLR, 2025. project

Autoregressive Pretraining with Mamba in Vision
Sucheng Ren, Xianhang Li, Haoqin Tu, Feng Wang, Fangxun Shu, Lei Zhang, Jieru Mei, Linjie Yang, Peng Wang, Heng Wang, Alan Yuille, Cihang Xie
ICLR, 2025.

2024

ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
Peng Wang, Yichun Shi
Arxiv, 2024. project

Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion
Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang
3DV, 2024. project
MVDream: Multi-View Diffusion for 3D Generation
Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, Xiao Yang
ICLR, 2024. project
Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences
Seungwook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang
CVPR, 2024. project

2023

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa
NeurIPS, 2023 (Spotlight). code
VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis
Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille
ICLR, 2023. code
Selective Feature Adapter for Dense Vision Transformers
Xueqing Deng, Qi Fan, Xiaojie Jin, Linjie Yang, Peng Wang⁺
Arxiv, 2023.
Multimodal Video Adapter for Parameter Efficient Video Text Retrieval
Bowen Zhang, Xiaojie Jin, Weibo Gong, Kai Xu, Zhao Zhang, Peng Wang, Xiaohui Shen, Jiashi Feng
CVPR, 2024.

2022

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization
Dawei Sun*, Xueqing Deng*, Shawn Newsam, Peng Wang⁺
ECCV, 2022.
NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night
Xueqing Deng, Peng Wang, Xiaochen Lian, Shawn Newsam
CVPR, 2022. code

2021

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo
CVPR, 2021. (Oral) code

2020

3D Part Guided Image Editing for Fine-grained Object Understanding
Zongdai Liu, Feixiang Lu, Peng Wang, Hui Miao, Liangjun Zhang, Ruigang Yang, Bin Zhou
CVPR, 2020.
Omnidirectional Depth Extension Networks
Xinjing Cheng, Peng Wang⁺, Yanqi Zhou, Chenye Guan, Ruigang Yang
ICRA, 2020.
CSPN++: Learning Context and Resource Aware Convolutional Spatial Propagation Networks for Depth Completion
Xinjing Cheng, Peng Wang⁺, Chenye Guan, Ruigang Yang
AAAI, 2020.
AutoRemover: Automatic Object Removal for Autonomous Driving Videos
Rong Zhang, Wei Li⁺, Peng Wang⁺, Chenye Guan, Jin Fang, Yuhang Song, Jinhui Yu, Baoquan Chen, Weiwei Xu⁺, Ruigang Yang
AAAI, 2020.
Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses
Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang
ACCV, 2020.

2019

Learning Depth with Convolutional Spatial Propagation Network
Xinjing Cheng, Peng Wang⁺, Ruigang Yang
TPAMI, 2019.
Every Pixel Counts++: Joint Learning of Geometry and Motion with 3D Holistic Understanding
Chenxu Luo*, Zhenheng Yang*, Peng Wang*⁺, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille
TPAMI, 2019.
The ApolloScape Open Dataset for Autonomous Driving and its Application
Xinyu Huang*, Peng Wang*, Xinjing Cheng, Dingfu Zhou, Qichuan Geng, Ruigang Yang
TPAMI, 2019.
EPNAS: Efficient Progressive Neural Architecture Search
Yanqi Zhou, Peng Wang, Sercan Arik, Haonan Yu, Syed Zawad, Feng Yan, Greg Diamos
BMVC, 2019.
UnOS: Unified Unsupervised Optical-flow and Stereo-depth Estimation by Watching Videos
Yang Wang, Peng Wang, Zhenheng Yang, Chenxu Luo, Yi Yang, Wei Xu
CVPR, 2019. code
ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yunchao Dai, Hao Su, Hongdong Li, Ruigang Yang
CVPR, 2019. data

2018

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding
Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ramakant Nevatia
ECCV (Workshop of VNAD), 2018 code (to be updated)
Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Xinjing Cheng*, Peng Wang*, Ruigang Yang
ECCV, 2018. code
SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting
Yuhang Song, Chao Yang, Yeji Shen, Peng Wang, Qin Huang, C.-C. Jay Kuo
BMVC, 2018
The ApolloScape Dataset for Autonomous Driving
Xinyu Huang, Xinjing Cheng, Qichuan Geng, Binbin Cao, Dingfu Zhou, Peng Wang, Yuanqing Lin, Ruigang Yang
CVPR (Workshop of Autonomous Driving), 2018 code, project page
DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map
Peng Wang, Ruigang Yang, Binbin Cao, Wei Xu, Yuanqing Lin
CVPR, 2018 code
LEGO: Learning Edge with Geometry all at Once by Watching Videos
Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ramakant Nevatia
CVPR, 2018 (spotlight oral) code
View Extrapolation of Human Body from a Single Image
Hao Zhu, Hao Su, Peng Wang, Ruigang Yang
CVPR, 2018
Occlusion Aware Unsupervised Learning of Optical Flow
Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu
CVPR, 2018
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam
CVPR, 2018
Unsupervised Geometry Estimation with Edge-aware Depth-Normal Consistency
Zhenheng Yang, Peng Wang, Wei Xu, Liang Zhao, Ramakant Nevatia
AAAI, 2018 (Oral)

2017

Joint Multi-Person Pose Estimation and Semantic Part Segmentation in a Single Image
Fangting Xia, Peng Wang, Alan Yuille
CVPR, 2017 data

2016

SURGE: Surface Regularized Geometric Estimation from a Single Image
Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan Yuille
NIPS, 2016 supplementary, video
Zoom Better to See Clearer: Human and Object Part Segmentation with Auto Zoom Net
Fangting Xia, Peng Wang, Liang-Chieh Chen, Alan Yuille
ECCV, 2016
DOC: Deep OCclusion Recovering From A Single Image
Peng Wang, Alan Yuille
ECCV, 2016 code, data(pascal in detail)
Pose-Guided Human Parsing with Deep Learned Features
Fangting Xia, Peng Wang*, Jun Zhu*, Alan Yuille
AAAI, 2016 (Oral)

2015

Joint Object and Part Segmentation using Deep Learned Potentials
Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille
ICCV, 2015 supplementary, horse_cow_data, Pascal_animal_trainval_list, Part Challenge
Towards Unified Depth and Semantic Prediction from a Single Image
Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille
CVPR, 2015 application video, depth results on NYU v2 test
Learning a Photo Cropping Cascade
Peng Wang, Zhe Lin, Radomir Mech
WACV, 2015 supplementary, dataset (crop label), dataset images, test images
Error Factor Analysis for Wild-scene Image Labelling
Peng Wang, Alan Yuille
WACV, 2015 code with needed materials

Before 2015

Supervised Kernel Descriptor for Visual Recognition
Peng Wang, Jingdong Wang, Gang Zeng, Weiwei Xu, Hongbin Zha, Shipeng Li
CVPR, 2013
Salient Object Detection for Searched Web Images via Global Saliency
Peng Wang, Jingdong Wang, Gang Zeng, Jie Feng, Hongbin Zha, Shipeng Li
CVPR, 2012 web saliency data
Contextual Dominant Color Name Extraction for Web Image Search
Peng Wang, Dongqing Zhang, Gang Zeng, Jingdong Wang
ICME (Workshop of Structure-sensitive Superpixels via Geodesic Distance), 2012
Color Filter for Image Search
Peng Wang, Dongqing Zhang, Jingdong Wang, Zhong Wu, Xian-Sheng Hua, Shipeng Li
ACM Multimedia (MM), Demo Abstract, 2012
Structure-sensitive Superpixels via Geodesic Distance
Peng Wang, Gang Zeng, Rui Gan, Jingdong Wang, Hongbin Zha
IJCV, 2013
Structure-sensitive Superpixels via Geodesic Distance
Gang Zeng, Peng Wang, Jingdong Wang, Rui Gan, Hongbin Zha
ICCV, 2011

Working Experience

Staff Research Scientist, Bytedance, 2019 -
Senior/Staff Research Scientist, Baidu, 2017 - 2019
Intern, Google Research, Summer 2016
Intern, Adobe Research, Summer 2014, Spring 2016
Research Assistant, UCLA, 2013 - 2017
Intern, Microsoft Research Asia, Summer - Winter 2012

Teaching & Research Activities

Chair organizer of workshop on ApolloScape: 3D Understanding for Autonomous Driving, ECCV 2018
Staff member of workshop on Autonomous Driving, CVPR 2018
Staff member of workshop on PASCAL in Detail, CVPR 2017
Teaching assistant of Stochastic Processes at PKU
Teaching assistant of STAT 261C at UCLA