My work focuses on bridging vision and language for multimedia understanding. I research deep learning approaches that address the semantic gap between visual and textual modalities by leveraging media context. I am currently working on deep learning models for multimedia and multimodal dialog systems.
Expertise and Interests: Multimedia understanding. Multimodal machine learning. Neural representation learning. Computer vision and natural language processing. My core research interests lie in multimedia understanding, namely at the intersection of computer vision and NLP, deep neural networks, artificial intelligence, and data mining.
Current positions (M.Sc. and Ph.D.): I am currently looking for motivated students. If you are interested in working in the fields of deep learning for multimedia, vision and language understanding (dialog systems, visual question answering, image captioning, etc.) or neural representation learning, drop me an email!