Projects

Systems for open 3D world models

The four project groups below correspond to the lab's shared research directions. Each group lists its papers along with the release status of their code.

Selected Projects

Foundation Representation and Manipulation

Compact and controllable 3D asset representations for reconstruction, editing, decomposition, and state prediction.

3D Foundation

P2Voxel: Pyramid Pivot Voxelization for 3D Mesh Tokenization

A compact 3D mesh tokenization project for efficient VAE reconstruction and foundation representation learning.

Paper: manuscript, 2026 | Code: to be released

3D Structure

Hi-TOPS: Hierarchical Topology-aware Scoring Prior for 3D Part Decomposition

A topology-aware prior for decomposing 3D meshes into meaningful parts for structure-aware manipulation.

Paper: manuscript, 2026 | Code: to be released

Spatial Understanding and Synthesis

Agentic pipelines and engine-grounded tools for coherent 3D scenes, simulation data, and spatial evaluation.

ECCV | Team Milestone

StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard

EngineeringAI Lab's first team-led ECCV paper, which builds editable multi-shot 3D story worlds with spatio-temporal consistency.

Paper: arXiv | Code: to be released

Scene Attention

Look-Before-Move: Narrative-Grounded World Visual Attention

A world visual attention framework for viewpoint selection and camera trajectories in dynamic 3D story worlds.

Paper: arXiv | Code: to be released

Interaction Dynamics and Motion

Expressive avatars, human-scene interaction, and multi-person motion generation for dynamic digital worlds.

Talking Avatar

XTalker: Turn, Smile, and Speak in Controllable Talking Portrait Animation

A controllable portrait animation framework for identity-preserving, expressive, and synchronized talking faces.

Paper: arXiv | Code: to be released

3D Avatar

3DXTalker: Expressive 3D Talking Avatars

A 3D talking-avatar system unifying identity, lip sync, emotion, and spatial dynamics.

Paper: arXiv | Code: to be released

Motion

Social Structure Matters in 3D Human-Human Interaction Generation

A multi-person motion generation project focused on social coherence, interpersonal relations, and plausible dynamic interaction.

Paper: manuscript, 2026 | Code: to be released

Decision Intelligence and Reasoning

LLM-based reasoning, language-guided planning, reinforcement learning, and decision systems for agents in 3D worlds.

Decision Learning

Scalable In-Context Q-Learning

A scalable in-context decision learning framework for adapting policies from task context.

Paper: ICLR 2026 | Code: to be released

Decision Agent

Text-to-Decision Agent

Offline meta-reinforcement learning from natural language supervision for decision-making agents.

Paper: NeurIPS 2025 | Code: to be released

Reasoning

Escaping Confidence Trap

Evolutionary decoding for mathematical reasoning in diffusion LLMs.

Paper: manuscript, 2026 | Code: to be released