Roberto Amoroso
CLIP
FreeDA: Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
[ CVPR 2024 ]
We present FreeDA, a novel training-free diffusion-augmented method for open-vocabulary segmentation, which leverages diffusion models to visually localize generated concepts and local-global similarities to match superpixel-based class-agnostic regions with semantic classes.
Luca Barsellotti, Roberto Amoroso, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
FOSSIL: Free Open-Vocabulary Semantic Segmentation through Synthetic References Retrieval
[ WACV 2024 ]
We present FOSSIL, a novel Unsupervised Open-Vocabulary Semantic Segmentation model that enables a self-supervised visual backbone to perform open-vocabulary segmentation directly on the visual modality by retrieving a support set of generated synthetic references.
Luca Barsellotti, Roberto Amoroso, Lorenzo Baraldi, Rita Cucchiara
MaPeT: Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training
We propose a novel self-supervised pre-training technique for Vision Transformers called MaPeT and a novel image tokenizer called k-CLIP, which directly employs discretized CLIP features.
Lorenzo Baraldi, Roberto Amoroso, Marcella Cornia, Lorenzo Baraldi, Andrea Pilzer, Rita Cucchiara