Resources
Datasets
The following publicly available datasets were introduced in our work:
- [2021] - HW-SQuAD and BenthamQA - QA over document image collections - Download Associated paper
- [2021] - Infographic VQA - VQA on infographics - check the Downloads tab here
- [2020] - DocVQA - Dataset for VQA on document images - check the Downloads tab here
- [2020] - RoadText1K - A dataset for text detection, recognition and tracking in driving videos - Project page
- [2018] - LectureVideoDB - A dataset for text detection and recognition in lecture videos - Project page
- [2018] - IIIT Handwritten words recognition dataset for Devanagari and Telugu - Project page
- [2018] - IIIT Urdu OCR dataset - Printed text recognition dataset for Urdu containing cropped text lines - Project page
- [2017] - IIIT Arabic dataset - Arabic scene text detection and recognition dataset - Project page
- [2017] - Synthetic scene text word images for Hindi, Telugu and Malayalam - Download Associated paper - Download Project page Associated paper
- [2017] - IIIT-ILST - Scene text dataset (both detection and recognition) for Hindi, Telugu and Malayalam - Download Project page Associated paper
- [2016] - Hindi 100 pages Dataset for printed text recognition - Download: Mirror 1 Mirror 2 Associated paper
Code
- Indian scripts scene text rendering and font collection
- BERT baseline for DocVQA
- MMBERT for Medical VQA
Other resources - Demos, presentations
- [2022] InfographicVQA presentation at WACV 2022 - video
- [2021] DocVQA workshop at ICDAR 2021 - All talks and challenge presentations - link
- [2021] Asking questions on HW Document collections - presentation at ICDAR 2021 - video
- [2021] DocVQA presentation at WACV 2021 - video
- [2020] Text and Documents in Deep Learning Era Workshop - DocVQA challenge presentation - video
- [2016] OCR project at CVIT, summary - video
- [2016] Capture and read - OCR + TTS Android app demo - video1 video2