Sindhu B. Hegde
PhD Student, University of Oxford
Hi! I am a fourth-year PhD student in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Prof. Andrew Zisserman. My research is in Computer Vision, particularly in understanding non-verbal communication (including co-speech gestures and lip-reading), video understanding, and self-supervised learning. I also work as an AI Scientist at Rode Microphones, focusing on multimodal LLM-based research.
Prior to joining Oxford, I worked as a Lead Data Scientist at Verisk Analytics. Before that, I pursued a Master’s by Research (MS) at the Centre for Visual Information Technology (CVIT), IIIT Hyderabad, supervised by Prof. C V Jawahar (IIIT-H) and Prof. Vinay Namboodiri (University of Bath, UK). My Master’s research focused on exploiting the redundancies in vision and speech modalities for cross-modal generation. Earlier, I completed my undergraduate studies at KLE Technological University, advised by Prof. Shankar Gangisetty and Prof. Uma Mudenagudi.
Research interests: Computer Vision, Machine Learning, Deep Learning, Video Understanding, Multi-modal Learning: Vision + Speech/Language
News [Archive]
| Nov 2025 | I am honoured to have been awarded the 2025 Google PhD Fellowship in Machine Perception. |
|---|---|
| Jul 2025 | JEGAL has been accepted to ICCV 2025 (ORAL). See you in Hawaii 🏝️⛱️ 🌊 |
| Apr 2025 | Our paper on Understanding Co-speech Gestures in-the-wild is up on arXiv. Links: Project page, Dataset |
| Jan 2025 | Our paper on Scaling Multilingual Visual Speech Recognition has been accepted to ICASSP 2025 (ORAL). Links: Project page, Dataset |
| Sep 2023 | Our paper on GestSync: Determining who is speaking without a talking head has been accepted to BMVC 2023 (ORAL). Links: Project page, Demo |
| Jul 2023 | Participated in the International Computer Vision Summer School (ICVSS) in Sicily, Italy. It was an incredible experience to learn from some of the most distinguished computer vision experts! |
Talks
| Oct 2025 | Invited talk on “Understanding Co-speech Gestures in Videos” at the Berkeley AI Research Lab (BAIR), University of California, Berkeley. Hosted by Prof. Alyosha Efros. |
|---|---|