I'm interested in robotics, computer vision, physics simulation, and machine learning.
My research explores the intersection of learned world models and physics-based simulation to model the dynamics and appearance of the real world,
with the goal of reducing the sim-to-real gap and scaling synthetic data for robotic manipulation.
If you'd like to discuss research opportunities, collaborations, Ph.D. applications, or anything related, feel free to reach out via email:
kaifeng dot z at columbia dot edu.
We propose a framework for robot policy evaluation in simulation environments,
using Gaussian Splatting for rendering and soft-body digital twins for dynamics.
We present an interactive digital twin construction (real-to-sim) framework that learns
the full dynamics of elastoplastic articulated objects from videos.
We optimize a spring-mass physics model of deformable objects and
integrate the model with 3D Gaussian Splatting for real-time re-simulation with rendering.
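The optimized spring-mass model itself is learned from video, but its core update rule is standard. As a rough illustration only (hypothetical names, hand-set stiffness and damping rather than optimized parameters), one symplectic-Euler step over a particle system might look like:

```python
def spring_force(x_i, x_j, rest_len, k):
    """Hooke's-law force on particle i from spring (i, j); positions are 3-vectors."""
    d = [b - a for a, b in zip(x_i, x_j)]
    dist = sum(c * c for c in d) ** 0.5
    mag = k * (dist - rest_len)  # positive when stretched: pulls i toward j
    return [mag * c / dist for c in d]

def step(positions, velocities, springs, masses, dt, k=50.0, damping=0.98):
    """One symplectic-Euler step; springs are (i, j, rest_length) triples."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j, rest in springs:
        f = spring_force(positions[i], positions[j], rest, k)
        for a in range(3):
            forces[i][a] += f[a]
            forces[j][a] -= f[a]  # equal and opposite reaction on j
    new_pos, new_vel = [], []
    for p, v, f, m in zip(positions, velocities, forces, masses):
        nv = [damping * (vc + dt * fc / m) for vc, fc in zip(v, f)]
        new_vel.append(nv)
        new_pos.append([pc + dt * vc for pc, vc in zip(p, nv)])
    return new_pos, new_vel
```

In the re-simulation setting, the spring topology, rest lengths, and stiffnesses would be the quantities fit to observed video rather than fixed constants.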
We propose a neural particle-grid model for training dynamics models on real-world sparse-view RGB-D videos, enabling
high-quality future prediction and rendering.
We learn neural dynamics models of objects from real perception data
and combine the learned models with 3D Gaussian Splatting for action-conditioned predictive rendering.
We learn a material-conditioned neural dynamics model using a graph neural network to
enable predictive modeling of diverse real-world objects and achieve efficient manipulation via model-based planning.
We propose a fully self-supervised method for category-level 6D object pose estimation
by learning dense 2D-3D geometric correspondences. Our method can train on image collections
without any 3D annotations.
We show that fusing fine-grained features learned with low-level contrastive objectives and semantic features
from image-level objectives improves self-supervised learning (SSL) pretraining.