Projects

  • Document Visual Question Answering [2019 - Present]
    • This project was conceived as a joint effort between CVIT, IIIT Hyderabad and CVC, UAB Barcelona. Its primary focus is to motivate the documents community to look beyond traditional document analysis tasks and to strive to build systems with true “Document Understanding” capabilities. More details are available at docvqa.org
  • Scene Text Understanding [2015 - 2019]
    • CVIT has been working in this space for the last few years and made significant contributions to scene text recognition prior to the deep learning wave. The IIIT5k scene text dataset is one of the most widely used datasets in this field. I have joined this project recently, and I am looking into unconstrained scene text recognition in a seq2seq framework. More details on our work in this area can be found here
  • Indian Languages OCR [2013 - Present]
    • IIIT Hyderabad has been involved in the development of OCR for Indian languages since the conception of the DLI project by the Government of India. I was fortunate to join this group and contribute to a crucial technology in the Indian language computing space. Despite the myriad challenges in the Indian language space compared to its Latin counterparts, we achieved state-of-the-art recognition accuracies in 12+ Indian languages. We follow a segmentation-free approach that directly transcribes text lines into sequences of Unicode characters. At this point we are trying to make our system available to the public. We also look forward to possible collaborations on digitizing vast amounts of Indian language documents and on developing assistive technologies for the visually challenged. More details here
  • Audio Books for the Visually Challenged [2013 Summer]
    • This project was an offshoot of the OCR project. We worked in collaboration with the Speech lab at IIIT to make audio books in the DAISY standard for the visually challenged. An OCR+TTS workflow was set up, starting from the scanning of the document. The resulting audio books were wrapped into web and desktop apps and deployed at various schools for the blind in Kerala. It was essentially a pilot project to assess the performance of both the OCR and the speech synthesizer and to collect responses from the visually challenged community.
  • Router Security using raw sockets @ Cisco, Bangalore - Undergrad project [2008]
    • The goal was to develop an access control framework for router security. Access Control Lists (ACLs) filter network traffic by controlling whether routed packets are forwarded at the router. The decision whether to drop or forward a packet is based on the filter criteria set in the ACL.
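
The ACL filtering described in the last project can be illustrated with a minimal Python sketch. It is not the code from the undergrad project. The names `AclRule` and `should_forward` are hypothetical, rules match on source address only, and the first-match ordering with a default drop when no rule matches is an assumption modeled on how router ACLs are commonly evaluated.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AclRule:
    """One hypothetical ACL entry: an action plus a source-network filter."""
    action: str  # "permit" or "deny"
    source: str  # source network in CIDR notation, e.g. "10.0.0.0/8"


def should_forward(acl, src_ip):
    """Return True if a packet from src_ip is forwarded, False if dropped.

    Rules are checked in order and the first matching rule decides.
    Assumption: a packet that matches no rule is dropped.
    """
    addr = ip_address(src_ip)
    for rule in acl:
        if addr in ip_network(rule.source):
            return rule.action == "permit"
    return False


# Example list: drop traffic from 10.0.0.0/8, forward everything else.
acl = [
    AclRule("deny", "10.0.0.0/8"),
    AclRule("permit", "0.0.0.0/0"),
]

if __name__ == "__main__":
    for ip in ("10.1.2.3", "192.168.1.5"):
        verdict = "forward" if should_forward(acl, ip) else "drop"
        print(ip, verdict)
```

Because the first match wins, rule order matters: placing the catch-all permit rule first would forward every packet and the deny rule would never be reached.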