ResearchTrend.AI
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

3 November 2023
Yiduo Guo
Zekai Zhang
Yaobo Liang
Dongyan Zhao
Nan Duan
ELM

Papers citing "PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion"

17 / 17 papers shown
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
34
0
0
06 May 2025
Topology-Aware Conformal Prediction for Stream Networks
Jifan Zhang
Fangxin Wang
Philip S. Yu
Kaize Ding
Shixiang Zhu
AI4TS
39
0
0
06 Mar 2025
Textual-to-Visual Iterative Self-Verification for Slide Generation
Yunqing Xu
Xinbei Ma
Jiyang Qiu
Hai Zhao
60
0
0
24 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
77
14
0
17 Feb 2025
Large Language Models as User-Agents for Evaluating Task-Oriented-Dialogue Systems
Taaha Kazi
Ruiliang Lyu
Sizhe Zhou
Dilek Hakkani-Tür
Gökhan Tür
ELM
LLMAG
21
1
0
15 Nov 2024
Foundations and Recent Trends in Multimodal Mobile Agents: A Survey
Biao Wu
Yanda Li
Meng Fang
Zirui Song
Zhiwei Zhang
Yunchao Wei
L. Chen
LM&Ro
LLMAG
OffRL
AI4TS
39
4
0
04 Nov 2024
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
Shihan Deng
Weikai Xu
Hongda Sun
Wei Liu
Tao Tan
...
Ang Li
Jian Luan
Bin Wang
Rui Yan
Shuo Shang
LLMAG
39
6
0
01 Jul 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Yikang Shen
30
12
0
21 Jun 2024
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion
Zekai Zhang
Yiduo Guo
Yaobo Liang
Dongyan Zhao
Nan Duan
38
1
0
06 Mar 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
113
83
0
12 Feb 2024
VRPTEST: Evaluating Visual Referring Prompting in Large Multimodal Models
Zongjie Li
Chaozheng Wang
Chaowei Liu
Pingchuan Ma
Daoyuan Wu
Shuai Wang
Cuiyun Gao
VLM
19
6
0
07 Dec 2023
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM
ALM
178
780
0
02 May 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
230
2,989
0
22 Mar 2023
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
58
157
0
21 Oct 2022
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
170
324
0
06 Oct 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,217
0
21 Mar 2022