Sindhu B. Hegde
PhD Student, University of Oxford

Hi! I am a third-year PhD student in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Prof. Andrew Zisserman. My research is in Computer Vision, particularly co-speech gesture understanding, video understanding, and self-supervised learning.
Prior to joining Oxford, I worked as a Lead Data Scientist at Verisk Analytics. Before that, I pursued a Master’s by Research (MS) at the Centre for Visual Information Technology (CVIT), IIIT Hyderabad, supervised by Prof. C V Jawahar (IIIT-H) and Prof. Vinay Namboodiri (University of Bath, UK). My Master’s research focused on exploiting the redundancies in vision and speech modalities for cross-modal generation.
Research interests: Computer Vision, Machine Learning, Deep Learning, Video Understanding, Multi-modal Learning: Vision + Speech/Language
News [Archive]
Apr 2025 | Our paper on Understanding Co-speech Gestures in-the-wild is up on arXiv. Links: Project page, Dataset |
---|---|
Jan 2025 | Our paper on Scaling Multilingual Visual Speech Recognition was accepted to ICASSP 2025 (ORAL). Links: Project page, Dataset |
Sep 2023 | Our paper on GestSync: Determining who is speaking without a talking head was accepted to BMVC 2023 (ORAL). Links: Project page, Demo |
Jul 2023 | Participated in the International Computer Vision Summer School (ICVSS) in Sicily, Italy. Had an incredible experience learning from some of the most distinguished computer vision experts! |
Oct 2022 | Joined the Visual Geometry Group (VGG) at the University of Oxford as a PhD student with Prof. Andrew Zisserman |