Ciao! I am Roberto Amoroso, a Research Engineer at NVIDIA in Munich, Germany 🇩🇪, working on Multimodal Video Understanding for Autonomous Vehicles. I enjoy designing and implementing novel Deep Learning and Computer Vision techniques.
I completed my PhD through the ELLIS program and the International Doctorate in ICT at the AImageLab research group of the University of Modena and Reggio Emilia (UNIMORE) 🇮🇹, under the supervision of Prof. Rita Cucchiara and Prof. Lorenzo Baraldi.
During my PhD, I also completed a PhD internship at LMU (Ludwig-Maximilians-Universität) in Munich, Germany 🇩🇪, focusing on Multimodal LLMs for Video Question Answering and Open-vocabulary Segmentation, under the co-supervision of Prof. Volker Tresp.
I was also a Research Scholar at the Networking Research Group in Saint Louis, USA 🇺🇸, working on Super-resolution techniques applied to Internet traffic matrices.
My primary areas of research are Multimodal Video Understanding and Open-vocabulary Segmentation. I have also conducted research on the pre-training and optimization of Transformer-based architectures for image classification, self-supervised learning, deepfake detection of synthetic images, and the development of image watermarking systems.
Feel free to reach out to me if you have any questions or are curious about my work! :)
ELLIS PhD in AI and Computer Vision, 2024
UNIMORE, Italy 🇮🇹 | LMU, Germany 🇩🇪 | NVIDIA, Germany 🇩🇪
MS in Artificial Intelligence, 2020
UNIMORE, Italy 🇮🇹 | AGH, Poland 🇵🇱 | Saint Louis University, USA 🇺🇸
BS in Computer Engineering, 2018
UNIMORE, Italy 🇮🇹
HumanE-AI-NET project, funded by the EU Framework Programme for Research and Innovation Horizon 2020.