Towards Vision-Language-Garment Models for Web Knowledge Garment Understanding and Generation

[Figure: Overall framework]

Abstract

This work studies vision-language-garment modeling for garment understanding and generation. It explores how web-scale multimodal reasoning transfers to garment synthesis from text and images, highlighting the potential of foundation models for specialized fashion design tasks.

Publication
arXiv preprint
Tong WU 吴桐
Assistant Professor @ Fudan

My research interests include 3D vision, long-tailed recognition, and robustness.
