Infinite Gaze Generation for Videos with Autoregressive Diffusion

Overall framework

Abstract

This work studies long-horizon raw gaze prediction for videos. It models continuous gaze trajectories with high-resolution timestamps using an autoregressive diffusion model conditioned on a saliency-aware visual latent space, improving long-range spatiotemporal accuracy and realism over short-window gaze prediction methods.

Publication
In European Conference on Computer Vision (ECCV)
Tong WU 吴桐
Assistant Professor, Fudan University

My research interests include 3D vision, long-tailed recognition, and robustness.
