ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.18549
  4. Cited By
MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors

MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors

24 May 2025
Baraa Hikal
Mohamed Basem
Islam Oshallah
Ali Hamdi
    LRM
ArXiv (abs)PDFHTML

Papers citing "MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors"

11 / 11 papers shown
Title
Stepwise Verification and Remediation of Student Reasoning Errors with
  Large Language Model Tutors
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
Nico Daheim
Jakub Macina
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
78
12
0
12 Jul 2024
Bridging the Novice-Expert Gap via Models of Decision-Making: A Case
  Study on Remediating Math Mistakes
Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes
Rose E. Wang
Qingyang Zhang
Carly Robinson
Susanna Loeb
Dorottya Demszky
125
38
0
16 Oct 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
531
12,128
0
18 Jul 2023
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties
  Grounded in Math Reasoning Problems
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
123
68
0
23 May 2023
QLoRA: Efficient Finetuning of Quantized LLMs
QLoRA: Efficient Finetuning of Quantized LLMs
Tim Dettmers
Artidoro Pagnoni
Ari Holtzman
Luke Zettlemoyer
ALM
163
2,641
0
23 May 2023
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and
  GPT-3 in Educational Dialogues
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Anaïs Tack
Chris Piech
ELM
101
94
0
16 May 2022
Training Verifiers to Solve Math Word Problems
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLMOffRLLRM
425
4,608
0
27 Oct 2021
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRLAI4TSAI4CEALMAIMat
711
10,634
0
17 Jun 2021
Measuring Conversational Uptake: A Case Study on Student-Teacher
  Interactions
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions
Dorottya Demszky
Jing Liu
Zid Mancenido
Julie Cohen
H. Hill
Dan Jurafsky
Tatsunori Hashimoto
127
67
0
07 Jun 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLMFaML
233
2,414
0
05 Mar 2021
Simple and Scalable Predictive Uncertainty Estimation using Deep
  Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCVBDL
943
5,858
0
05 Dec 2016
1