CVPR 2025

The conference

Nashville: better than expected
2 days of workshops/tutorials/seminars
3 days of posters/orals/keynotes
Many people
Trends: Foundation/Multimodal/Video models, diffusion models, Mamba
Downsides: political context, posters on Sunday

Workshops

Equivariant Vision: From Theory to Practice (video; recommended talk: Vincent Sitzmann at 3:12:00)
Shape analysis
Computer vision in sports (video; recommended talk: Roland Memisevic at 10:51)
Also: Meta Keynote (video: “cycle” of training a “Frontier” LLM)

Paper pattern

Pick an “unsolved” “2D images -> something” task
Generate/collect/build a dataset of ground truth (can be synthetic, doesn’t have to be too big) -> actually quite hard
Use “foundation” features (e.g. DINO) as input to your model

Even if your dataset is small or far from real data, your solution will generalize well
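
The pattern above can be sketched in a few lines: a frozen feature extractor, a small labelled dataset, and a lightweight head that is the only trained part. This is a minimal illustration under stated assumptions, not the method of any paper cited here. `frozen_features` is a hypothetical stand-in (a fixed random projection) for a real pretrained backbone such as DINO, the head is a plain ridge regression, and the synthetic targets are constructed to be predictable from the features, so the sketch shows the mechanics only and says nothing about real-world generalization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen foundation backbone. In practice this
# would be a pretrained model (e.g. DINO); here it is a fixed random projection.
IMAGE_DIM, FEATURE_DIM, TARGET_DIM = 64, 32, 3
W_FROZEN = rng.normal(size=(IMAGE_DIM, FEATURE_DIM)) / np.sqrt(IMAGE_DIM)


def frozen_features(images):
    """Map flattened images (n, IMAGE_DIM) to features (n, FEATURE_DIM). Never trained."""
    return np.tanh(images @ W_FROZEN)


def fit_head(features, targets, ridge=1e-3):
    """Fit a linear head by closed-form ridge regression (the only trained part)."""
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)


def predict(images, head):
    """Apply the frozen backbone, then the trained head."""
    return frozen_features(images) @ head


# Small synthetic "ground truth" dataset (assumption: targets are a noisy
# linear function of the frozen features).
true_head = rng.normal(size=(FEATURE_DIM, TARGET_DIM))
train_images = rng.normal(size=(200, IMAGE_DIM))
train_targets = frozen_features(train_images) @ true_head
train_targets += 0.01 * rng.normal(size=train_targets.shape)

head = fit_head(frozen_features(train_images), train_targets)

# Evaluate on held-out images that were never seen during fitting.
test_images = rng.normal(size=(50, IMAGE_DIM))
test_targets = frozen_features(test_images) @ true_head
test_error = float(np.mean((predict(test_images, head) - test_targets) ** 2))
print(f"held-out MSE: {test_error:.6f}")
```

Only `head` is fitted, which is why a small dataset can be enough: the backbone supplies the representation and the task-specific part has few parameters.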

Examples

Facial analysis (Büchner et al. 2025)
3D hair reconstruction (Rosu et al. 2025)
2D -> 3D point cloud (best paper) (Wang et al. 2025)
2D -> 3D object primitives (Zhao et al. 2025)
Removing reflections from images (Kee et al. 2025)

And probably many more!

Büchner, Tim, Christoph Anders, Orlando Guntinas-Lichius, and Joachim Denzler. 2025. “Electromyography-Informed Facial Expression Reconstruction for Physiological-Based Synthesis and Analysis.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 215–27.

Kee, Eric, Adam Pikielny, Kevin Blackburn-Matzen, and Marc Levoy. 2025. “Removing Reflections from Raw Photos.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 161–71.

Rosu, Radu Alexandru, Keyu Wu, Yao Feng, Youyi Zheng, and Michael J Black. 2025. “DiffLocks: Generating 3D Hair from a Single Image Using Diffusion Models.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 10847–57.

Wang, Jianyuan, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. 2025. “VGGT: Visual Geometry Grounded Transformer.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 5294–5306.

Zhao, Wang, Yan-Pei Cao, Jiale Xu, Yuejiang Dong, and Ying Shan. 2025. “DI-PCG: Diffusion-Based Efficient Inverse Procedural Content Generation for High-Quality 3D Asset Creation.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 11061–72.