Infinite Gaze Generation for Videos with Autoregressive Diffusion

March 2026

Overall framework

Abstract

This work studies long-horizon raw gaze prediction for videos. It models continuous gaze trajectories with high-resolution timestamps using an autoregressive diffusion model conditioned on a saliency-aware visual latent space, improving long-range spatiotemporal accuracy and realism over short-window gaze prediction methods.

Type

Conference paper

Publication

In European Conference on Computer Vision (ECCV)
