CVPR 2025

The conference

  • Nashville: better than expected
  • 2 days of workshops/tutorials/seminars
  • 3 days of posters/orals/keynotes
  • Many people
  • Trends: Foundation/Multimodal/Video models, diffusion models, Mamba
  • Downsides: political context, posters on Sunday

Workshops

  • Equivariant Vision: From Theory to Practice (video; recommended talk: Vincent Sitzmann at 3:12:00)
  • Shape analysis
  • Computer vision in sports (video; recommended talk: Roland Memisevic at 10:51)
  • Also: Meta Keynote (video: “cycle” of training a “Frontier” LLM)

Paper pattern

  • Pick an “unsolved” “2D images -> something” task
  • Generate, collect, or build a ground-truth dataset (can be synthetic, doesn’t have to be large) -> actually quite hard
  • Use “foundation” features (e.g. DINO) as the input to your model

Even if your dataset is small or far from real data, your solution will generalize well
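
The pattern above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any of the papers' actual pipelines: `frozen_features` is a hypothetical stand-in for a pretrained backbone such as DINO (here a fixed random projection with a ReLU, so nothing needs downloading), the dataset is a small synthetic "image -> scalar" toy, and only a linear head is trained, using ridge regression.

```python
import numpy as np

rng = np.random.default_rng(0)
IMAGE_DIM, FEATURE_DIM = 64, 256

# ASSUMPTION: stand-in for a frozen foundation backbone such as DINO.
# A real pipeline would load pretrained weights; this is a fixed random
# projection + ReLU so the sketch runs without any download.
W_FROZEN = rng.normal(size=(IMAGE_DIM, FEATURE_DIM)) / np.sqrt(IMAGE_DIM)


def frozen_features(images):
    """Map flattened images (n, IMAGE_DIM) to features (n, FEATURE_DIM + 1)."""
    feats = np.maximum(images @ W_FROZEN, 0.0)
    ones = np.ones((images.shape[0], 1))  # constant column acts as a bias term
    return np.hstack([feats, ones])


def fit_head(features, targets, ridge=1e-2):
    """Train only a linear head on the frozen features (closed-form ridge)."""
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)


def predict(images, head):
    """Run the frozen backbone, then apply the trained head."""
    return frozen_features(images) @ head


def make_toy_dataset(n):
    """Hypothetical small synthetic dataset for an 'image -> scalar' task."""
    images = rng.normal(size=(n, IMAGE_DIM))
    targets = np.abs(images[:, :8]).sum(axis=1)
    return images, targets


train_x, train_y = make_toy_dataset(400)
test_x, test_y = make_toy_dataset(100)

# The backbone weights are never updated; only `head` is fitted.
head = fit_head(frozen_features(train_x), train_y)

train_mse = float(np.mean((predict(train_x, head) - train_y) ** 2))
test_mse = float(np.mean((predict(test_x, head) - test_y) ** 2))
print(f"train MSE: {train_mse:.3f}, held-out MSE: {test_mse:.3f}")
```

With a real backbone, only `frozen_features` would change; the dataset and head stay small, which is the point of the pattern. The toy backbone says nothing about how well real foundation features generalize.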

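Examples

  • Büchner et al. (2025): electromyography-informed facial expression reconstruction
  • Kee et al. (2025): removing reflections from raw photos
  • Rosu et al. (2025): 3D hair from a single image
  • Wang et al. (2025): VGGT
  • Zhao et al. (2025): DI-PCG

And probably many more!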

 
Büchner, Tim, Christoph Anders, Orlando Guntinas-Lichius, and Joachim Denzler. 2025. “Electromyography-Informed Facial Expression Reconstruction for Physiological-Based Synthesis and Analysis.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 215–27.
Kee, Eric, Adam Pikielny, Kevin Blackburn-Matzen, and Marc Levoy. 2025. “Removing Reflections from Raw Photos.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 161–71.
Rosu, Radu Alexandru, Keyu Wu, Yao Feng, Youyi Zheng, and Michael J Black. 2025. “DiffLocks: Generating 3D Hair from a Single Image Using Diffusion Models.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 10847–57.
Wang, Jianyuan, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. 2025. “VGGT: Visual Geometry Grounded Transformer.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 5294–5306.
Zhao, Wang, Yan-Pei Cao, Jiale Xu, Yuejiang Dong, and Ying Shan. 2025. “DI-PCG: Diffusion-Based Efficient Inverse Procedural Content Generation for High-Quality 3D Asset Creation.” In Proceedings of the Computer Vision and Pattern Recognition Conference, 11061–72.