Media Summary: We introduce SeqAfford, a Multi-Modal Language Model (MLLM) capable of serialized affordance inference implied in human ... In this paper, we propose VideoScene that distills the video diffusion model to generate ... In this work, we introduce PartGen, a novel method for compositional/part-level ...

CVPR 2025 Highlight CrossOver 3D - Detailed Analysis & Overview

[CVPR 2025, Highlight] CrossOver: 3D Scene Cross-Modal Alignment
[CVPR 2025] SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model
[CVPR 2025 Highlight] VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
[CVPR 2025] PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models
[CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
CVPR 2025 Highlights: AI, Computer Vision, and What’s Next
CVPR 2025 - vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation
CVPR 2025 Highlight: Parallelized Autoregressive Visual Generation.
CRA-GQA | CVPR 2025 Highlight
(CVPR 2025) Coherent 3D Portrait Video Reconstruction via Triplane Fusion
NVIDIA Research Breakthroughs at CVPR 2025
[CVPR 2025] TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing
[CVPR 2025, Highlight] CrossOver: 3D Scene Cross-Modal Alignment

Abstract: Multi-modal

[CVPR 2025] SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

We introduce SeqAfford, a Multi-Modal Language Model (MLLM) capable of serialized affordance inference implied in human ...

[CVPR 2025 Highlight] VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

In this paper, we propose VideoScene that distills the video diffusion model to generate

[CVPR 2025] PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models

In this work, we introduce PartGen, a novel method for compositional/part-level

[CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Project Page: https://edexheim.github.io/mast3r-slam Paper: https://arxiv.org/abs/2412.12392 Code: ...

CVPR 2025 Highlights: AI, Computer Vision, and What’s Next

Experience

CVPR 2025 - vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

CVPR 2025 Highlight: Parallelized Autoregressive Visual Generation.

Project page: https://yuqingwang1029.github.io/PAR-project/, Code: https://github.com/YuqingWang1029/PAR.

CRA-GQA | CVPR 2025 Highlight

Cross-modal Causal Relation Alignment for Video Question Grounding.

(CVPR 2025) Coherent 3D Portrait Video Reconstruction via Triplane Fusion

Recent breakthroughs in single-image

NVIDIA Research Breakthroughs at CVPR 2025

See what's new from NVIDIA Research at

[CVPR 2025] TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing

[CVPR 2025] SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Project Page: https://cdfan0627.github.io/spectromotion Paper: https://arxiv.org/abs/2410.17249 Code: ...