Workshop Agenda
Format: Single-track | In-person
Warm Up Section
08:00–08:30 | Registration
08:30–08:40 | Opening Remarks
Host: Workshop Chair(s)
Brief overview of the workshop theme and schedule.
Invited Speaker Talks
08:40–09:00 | Keynote Speech
Speaker: Jingyi Yu (ShanghaiTech University)
Title: TBD
Abstract: TBD
09:00–09:20 | Invited Talk 1
Speaker: Ziwei Liu (Nanyang Technological University)
Title: From Multimodal Generative Models to Dynamic World Modeling
Abstract: In this talk, Prof. Ziwei Liu will present recent advances in multimodal generative modeling, focusing on the integration of discrete diffusion timesteps to learn unified visual–language representations that enable high-fidelity generation across images, text, and video. Building upon these foundations, he will introduce dynamic world modeling frameworks capable of capturing the temporal evolution of complex environments, exemplified by the DynamicCity 4D LiDAR generation system for large‑scale, high‑quality scene synthesis. He will then discuss how these technologies pave the way toward embodied intelligence, uniting human perception, virtual avatars, and humanoid robotics. Finally, he will outline future directions and open challenges in bridging multimodal generative models with dynamic, real‑time world understanding.
09:20–09:40 | Invited Talk 2
Speaker: Qi Ye (Zhejiang University)
Title: Understanding Hand-Object Interaction – From Human Hand Reconstruction and Generation to Dexterous Manipulation of Robotic Hands
Abstract: In this talk, Prof. Qi Ye will present recent efforts to build perception systems that deeply understand hand–object interactions, spanning 3D human hand reconstruction, generative interaction modeling, and dexterous robotic manipulation. She will demonstrate how learning 3D generative models over human–object interactions enables accurate reconstruction from in‑the‑wild video clips, and how 2D generative frameworks can effectively guide real‑world robot actions. The talk will then cover methods for grasp generation and motion planning using human manipulation data, as well as multi‑modal pretraining techniques to transfer human‑like dexterous skills to robotic hands. Finally, Prof. Ye will outline open challenges and future directions in bridging human manipulation experience with embodied robotic dexterity.
09:40–10:00 | Invited Talk 3
Speaker: Li Yi (Tsinghua University)
Title: TBD
Abstract: TBD
10:00–10:30 | Coffee Break (Buffet / Group Table Discussions)
Open Discussion and Closing
10:30–11:00 | Panel Discussion
Title: Tri-bridging: A Vision for Threading Human, Avatar, and Humanoid
Moderators: Congsheng Xu, Kaixuan Wang
Panelists: Lan Xu (ShanghaiTech University), Wentao Zhu (EIT), Qixuan Zhang (Rodin CTO), Yan Zhang (MeshCapade), Daniel Holden (Epic Games)
11:00–11:30 | Poster Session & Demos
Featuring selected submissions, ongoing projects, and institutional showcases.
11:40–12:00 | Closing Remarks & Best Poster Award
Recognition of the outstanding paper and the best paper.