Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries
[WACV 2025]
We propose PQR, a novel LLM-based framework for video question answering that introduces T-Former, a question-guided temporal querying Transformer designed to efficiently extract and integrate video-specific features tailored to a given question.
Roberto Amoroso, Gengyuan Zhang, Rajat Koner, Lorenzo Baraldi, Rita Cucchiara, Volker Tresp