LLM-SQL-Solver: Can LLMs Determine SQL Equivalence?

16 December 2023

Amr El Abbadi

Papers citing "LLM-SQL-Solver: Can LLMs Determine SQL Equivalence?"

Access Paths for Efficient Ordering with Large Language Models

Dimitris Tsirogiannis

239

30 Aug 2025

Taming SQL Complexity: LLM-Based Equivalence Evaluation for Text-to-SQL

191

11 Jun 2025

QUITE: A Query Rewrite System Beyond Rules with LLM Agents

400

09 Jun 2025

Text-to-SQL Domain Adaptation via Human-LLM Collaborative Data Annotation

International Conference on Intelligent User Interfaces (IUI), 2025

753

21 Feb 2025

EquiBench: Benchmarking Large Language Models' Reasoning about Program Semantics via Equivalence Checking

...

Thiago S. F. X. Teixeira

Diyi Yang

Ke Wang

LRM

437

18 Feb 2025

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

...

1.3K

405

25 Nov 2024

FLEX: Expert-level False-Less EXecution Metric for Reliable Text-to-SQL Benchmark

430

24 Sep 2024

Hybrid Querying Over Relational Databases and Large Language Models

T. Pham

Cody T. Reynolds

A. El Abbadi

301

01 Aug 2024

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

...

Jie Tang

Hongning Wang

Minlie Huang

CoGe

517

121

04 Jul 2024

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

International Conference on Learning Representations (ICLR), 2024

Zilong Wang

Hao Zhang

Chun-Liang Li

Julian Martin Eisenschlos

Vincent Perot

...

Lesly Miculicich

Yasuhisa Fujii

Jingbo Shang

Chen-Yu Lee

Tomas Pfister

ReLM LMTD LRM

302

229

09 Jan 2024

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Computer Vision and Pattern Recognition (CVPR), 2023

...

950

1,898

27 Nov 2023

CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Weixiang Yan

Haitian Liu

Yunkun Wang

Yunzhe Li

Qian Chen

...

473

14 Nov 2023

Language Models can be Logical Solvers

332

10 Nov 2023

Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves

Quanquan Gu

567

146

07 Nov 2023

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

Heng Wang

274

131

02 Nov 2023

CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Weixiang Yan

Yuchen Tian

Yunzhe Li

Qian Chen

Wen Wang

449

08 Oct 2023

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Proceedings of the VLDB Endowment (PVLDB), 2023

Jingren Zhou

656

554

29 Aug 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models

Louis Martin

...

Sharan Narang

Sergey Edunov

12.1K

16,310

18 Jul 2023

C3: Zero-shot Text-to-SQL with ChatGPT

424

230

14 Jul 2023

ToolQA: A Dataset for LLM Question Answering with External Tools

Neural Information Processing Systems (NeurIPS), 2023

392

356

23 Jun 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

Neural Information Processing Systems (NeurIPS), 2023

...

3.4K

7,658

09 Jun 2023

Large Language Models are not Fair Evaluators

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Peiyi Wang

Lei Li

Zefan Cai

Qi Liu

Zhifang Sui

806

880

29 May 2023

How Language Model Hallucinations Can Snowball

International Conference on Machine Learning (ICML), 2023

Ofir Press

414

394

22 May 2023

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

Neural Information Processing Systems (NeurIPS), 2023

...

Kevin C. C. Chang

Fei Huang

Reynold Cheng

Yongbin Li

LMTD

589

819

04 May 2023

Can Large Language Models Be an Alternative to Human Evaluations?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Cheng-Han Chiang

Hung-yi Lee

ALM LM&MA

647

930

03 May 2023

From Words to Code: Harnessing Data for Program Synthesis from Natural Language

...

351

02 May 2023

DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction

Neural Information Processing Systems (NeurIPS), 2023

Mohammadreza Pourreza

Davood Rafiei

ReLM LRM

418

616

21 Apr 2023

A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability

Aiwei Liu

Xuming Hu

Lijie Wen

Philip S. Yu

LMTD AI4MH

311

193

12 Mar 2023

Is ChatGPT better than Human Annotators? Potential and Limitations of ChatGPT in Explaining Implicit Hate Speech

The Web Conference (WWW), 2023

314

322

11 Feb 2023

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

International Conference on Learning Representations (ICLR), 2022

Abulhair Saparov

He He

ELM LRM ReLM

1.0K

450

03 Oct 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models

International Conference on Learning Representations (ICLR), 2022

3.6K

6,211

21 Mar 2022

Evaluating Large Language Models Trained on Code

...

2.6K

8,889

07 Jul 2021

KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

328

139

22 Jun 2021

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

Ruiqi Zhong

Tao Yu

Dan Klein

218

171

06 Oct 2020

Grounded Adaptation for Zero-shot Executable Semantic Parsing

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

Victor Zhong

M. Lewis

Sida I. Wang

Luke Zettlemoyer

361

111

16 Sep 2020

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.3K

55,939

28 May 2020

AmbigQA: Answering Ambiguous Open-domain Questions

Sewon Min

Julian Michael

Hannaneh Hajishirzi

Luke Zettlemoyer

495

430

22 Apr 2020

RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Bailin Wang

Richard Shin

Xiaodong Liu

Oleksandr Polozov

Matthew Richardson

636

767

10 Nov 2019

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018

...

896

1,713

24 Sep 2018

Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

1.2K

1,471

31 Aug 2017

Attention Is All You Need

Neural Information Processing Systems (NeurIPS), 2017

8.3K

171,167

12 Jun 2017