VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded TextInternational Conference on Learning Representations (ICLR), 2024 |
Dense Video Object Captioning from Disjoint SupervisionInternational Conference on Learning Representations (ICLR), 2023 |
WizardCoder: Empowering Code Large Language Models with Evol-InstructInternational Conference on Learning Representations (ICLR), 2023 |
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App ScreenshotsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2022 Yu-Chung Hsiao Fedir Zubach Maria Wang Jindong Chen Victor Carbune Jason Lin Maria Wang Yun Zhu Jindong Chen |