The Effect of Sampling Temperature on Problem Solving in Large Language Models

7 February 2024

Matthew Renze

Erhan Guven

Papers citing "The Effect of Sampling Temperature on Problem Solving in Large Language Models"

Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments

21 Nov 2025

The Shifting Landscape of Vaccine Discourse: Insights From a Decade of Pre- to Post-COVID-19 Vaccine Posts on Social Media

PLoS ONE, 2025

20 Nov 2025

What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

Alexis Audran-Reiss

Jordi Armengol-Estapé

...

259

19 Nov 2025

HEDGE: Hallucination Estimation via Dense Geometric Entropy for VQA with Vision-Language Models

251

16 Nov 2025

PublicAgent: Multi-Agent Design Principles From an LLM-Based Open Data Analysis Framework

227

04 Nov 2025

Optimal Attention Temperature Enhances In-Context Learning under Distribution Shift

Samet Demir

Zafer Dogan

157

03 Nov 2025

G2: Guided Generation for Enhanced Output Diversity in LLMs

166

01 Nov 2025

Stable LLM Ensemble: Interaction between Example Representativeness and Diversity

Junichiro Niimi

217

15 Oct 2025

Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning

208

06 Oct 2025

OptAgent: Optimizing Query Rewriting for E-commerce via Multi-Agent Simulation

232

04 Oct 2025

On the Role of Temperature Sampling in Test-Time Scaling

175

02 Oct 2025

When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs

Shree Harsha Bokkahalli Satish

G. Henter

Éva Székely

355

01 Oct 2025

Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models

Morgan McCarty

Jorge Morales

LRM

154

27 Sep 2025

Automated Extraction of Material Properties using LLM-based AI Agents

Subham Ghosh

Abhishek Tewari

143

23 Sep 2025

Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs

176

20 Sep 2025

Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data

Nattadaporn Lertcheva

217

20 Sep 2025

LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning

324

16 Sep 2025

Rethinking the Evaluation of Alignment Methods: Insights into Diversity, Generalisation, and Safety

224

16 Sep 2025

Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization

256

15 Sep 2025

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

...

326

11 Sep 2025

Acquiescence Bias in Large Language Models

Daniel Braun

AI4CE

244

10 Sep 2025

ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation

204

02 Sep 2025

Error Notebook-Guided, Training-Free Part Retrieval in 3D CAD Assemblies via Vision-Language Models

280

01 Sep 2025

Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation

...

149

16 Aug 2025

Are Large Language Models Dynamic Treatment Planners? An In Silico Study from a Prior Knowledge Injection Angle

Zhiyao Luo

T. Zhu

OffRL LM&MA

171

06 Aug 2025

Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling

...

155

31 Jul 2025

Mind the Language Gap in Digital Humanities: LLM-Aided Translation of SKOS Thesauri

197

22 Jul 2025

From Queries to Criteria: Understanding How Astronomers Evaluate LLMs

217

21 Jul 2025

Self-Correction Bench: Uncovering and Addressing the Self-Correction Blind Spot in Large Language Models

Ken Tsui

KELM LRM

294

03 Jul 2025

Semantic-guided Diverse Decoding for Large Language Model

292

30 Jun 2025

LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

Chenghao Yang

Ari Holtzman

329

22 Jun 2025

Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models

Haonan Yin

Shai Vardi

Vidyanand Choudhary

284

17 Jun 2025

Don't throw the baby out with the bathwater: How and why deep learning for ARC

Jack Cole

Mohamed Osman

LRM

409

17 Jun 2025

Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making

375

13 Jun 2025

Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

319

30 May 2025

VModA: An Effective Framework for Adaptive NSFW Image Moderation

268

29 May 2025

CHART-6: Human-Centered Evaluation of Data Visualization Understanding in Vision-Language Models

246

22 May 2025

SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning

739

16 May 2025

DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs

Lake Yin

Fan Huang

350

15 May 2025

Atomic Consistency Preference Optimization for Long-Form Question Answering

Jingfeng Chen

Raghuveer Thirukovalluru

332

14 May 2025

Can Large Language Models Predict Parallel Code Performance?

IEEE International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2025

298

06 May 2025

LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval

Muhammad Rafsan Kabir

Rafeed Mohammad Sultan

285

19 Apr 2025

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

458

14 Apr 2025

Has the Creativity of Large-Language Models peaked? An analysis of inter- and intra-LLM variability

329

10 Apr 2025

A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility

Christian Schroeder de Witt

Matthias Bethge

ReLM ALM LRM

691

09 Apr 2025

Emotion Recognition Using Convolutional Neural Networks

428

03 Apr 2025

Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap

Tong Nie

Jian Sun

Wei Ma

721

27 Mar 2025

StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

514

26 Mar 2025

Agents in the Sandbox: End-to-End Crash Bug Reproduction for Minecraft

Eray Yapağcı

Yavuz Alp Sencer Öztürk

Eray Tüzün

235

25 Mar 2025

LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

510

21 Mar 2025