Hi! I am a first-year PhD student in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Prof. Andrew Zisserman. My research is in Computer Vision, particularly multimodal learning, video understanding, and self-supervised learning.
Prior to joining Oxford, I worked as a Lead Data Scientist at Verisk Analytics. Before that, I pursued a Master’s by Research (MS) at the Centre for Visual Information Technology (CVIT), IIIT Hyderabad, supervised by Prof. C V Jawahar (IIIT-H) and Prof. Vinay Namboodiri (University of Bath, UK). My Master’s research focused on exploiting the redundancies in vision and speech modalities for cross-modal generation.
Research interests: Computer Vision, Machine Learning, Deep Learning, Video Understanding, Multi-modal Learning: Vision + Speech/Language
|Nov 2022||Submitted a paper to CVPR 2023! Stay tuned for more details…|
|Oct 2022||Joined the Visual Geometry Group (VGG) at the University of Oxford as a PhD student with Prof. Andrew Zisserman!|
2 papers accepted to ACM-MM 2022!
1) Talking-Face Video Upsampling 2) Lip-to-Speech Synthesis
Successfully defended MS thesis
Thesis: Exploiting Cross-Modal Redundancy for Audio-Visual Generation
|Apr 2022||Promoted to Lead Data Scientist at Verisk Analytics|
|Feb 2022||Participated in Research Week with Google. Got a chance to interact with amazing researchers from all over the world!|
Recent papers [Full list]
ACM-MM: Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors. In Proceedings of the 30th ACM International Conference on Multimedia (MM’22), 2022
ACM-MM: Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild. In Proceedings of the 30th ACM International Conference on Multimedia (MM’22), 2022