Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs

29 May 2023
Mingyang Zhou
Yi Ren Fung
Long Chen
Christopher Thomas
Heng Ji
Shih-Fu Chang

Papers citing "Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs"

Socratic Chart: Cooperating Multiple Agents for Robust SVG Chart Understanding
Yuyang Ji
Haohan Wang
LRM
37
0
0
14 Apr 2025
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Wan-Cyuan Fan
Yen-Chun Chen
Mengchen Liu
Lu Yuan
Leonid Sigal
36
5
0
19 Jul 2024
ChatBCG: Can AI Read Your Slide Deck?
Nikita Singh
Rob Balian
Lukas Martinelli
33
0
0
16 Jul 2024
Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning
Xiongye Xiao
Gengshuo Liu
Gaurav Gupta
De-An Cao
Shixuan Li
Yaxing Li
Tianqing Fang
Mingxi Cheng
Paul Bogdan
30
9
0
15 Apr 2024
SIMPLOT: Enhancing Chart Question Answering by Distilling Essentials
Wonjoong Kim
S. Park
Yeonjun In
Seokwon Han
Chanyoung Park
LRM
ReLM
32
3
0
22 Feb 2024
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning
Fanqing Meng
Wenqi Shao
Quanfeng Lu
Peng Gao
Kaipeng Zhang
Yu Qiao
Ping Luo
27
45
0
04 Jan 2024
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP
VLM
158
263
0
07 Oct 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
135
29
0
12 Sep 2022
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
253
525
0
04 Feb 2021