Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes

Proceedings of the VLDB Endowment (PVLDB), 2023

19 April 2023

Papers citing "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"

50 / 87 papers shown

BookRAG: A Hierarchical Structure-aware Index-based Approach for Retrieval-Augmented Generation on Complex Documents

Shu Wang

Yingli Zhou

Yixiang Fang

175

03 Dec 2025

SRE-Llama -- Fine-Tuned Meta's Llama LLM, Federated Learning, Blockchain and NFT Enabled Site Reliability Engineering(SRE) Platform for Communication and Networking Software Services

International Conference on Blockchain Computing and Applications (BCCA), 2025

11 Nov 2025

Structured RAG for Answering Aggregative Questions

255

11 Nov 2025

Cortex AISQL: A Production SQL Engine for Unstructured Data

...

Dimitris Tsirogiannis

137

10 Nov 2025

Attention and Compression is all you need for Controllably Efficient Language Models

467

07 Nov 2025

Relational Deep Dive: Error-Aware Queries Over Unstructured Data

102

04 Nov 2025

AGRAG: Advanced Graph-based Retrieval-Augmented Generation for LLMs

105

02 Nov 2025

FlashEVA: Accelerating LLM inference via Efficient Attention

Juan Gabriel Kostelec

Qinghai Guo

164

01 Nov 2025

Standardization of Psychiatric Diagnoses -- Role of Fine-tuned LLM Consortium and OpenAI-gpt-oss Reasoning LLM Enabled Decision Support System

183

29 Oct 2025

TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

137

28 Oct 2025

Agentsway -- Software Development Methodology for AI Agents-based Teams

...

326

26 Oct 2025

Model Context Contracts - MCP-Enabled Framework to Integrate LLMs With Blockchain Smart Contracts

...

21 Oct 2025

Implementing Semantic Join Operators Efficiently

Immanuel Trummer

105

09 Oct 2025

LLM/Agent-as-Data-Analyst: A Survey

...

236

28 Sep 2025

ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

109

16 Sep 2025

A Survey on Retrieval And Structuring Augmented Generation with Large Language Models

208

12 Sep 2025

Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees

Sepanta Zeighami

Shreya Shankar

Aditya G. Parameswaran

126

02 Sep 2025

A Survey on Open Dataset Search in the LLM Era: Retrospectives and Perspectives

31 Aug 2025

ST-Raptor: LLM-Powered Semi-Structured Table Question Answering

228

25 Aug 2025

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

246

21 Aug 2025

LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues

...

233

18 Jul 2025

Instruction Tuning with and without Context: Behavioral Shifts and Downstream Impact

246

18 Jun 2025

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

...

Blaise Agüera y Arcas

João Sacramento

311

05 Jun 2025

Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers

Kazuki Irie

Morris Yau

Samuel J. Gershman

221

31 May 2025

Towards Scalable Schema Mapping using Large Language Models

148

30 May 2025

ATLAS: Learning to Optimally Memorize the Context at Test Time

523

29 May 2025

SQUiD: Synthesizing Relational Databases from Unstructured Text

251

25 May 2025

How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation

216

24 May 2025

Efficient LLM Serving on Hybrid Real-time and Best-effort Requests

297

13 Apr 2025

Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases

IEEE International Conference on Data Engineering (ICDE), 2025

Teng Lin

301

08 Apr 2025

LLM-Aided Customizable Profiling of Code Data Based On Programming Language Concepts

Pankaj Thorat

Adnan Qidwai

Adrija Dhar

Aishwariya Chakraborty

Anand Eswaran

Hima Patel

Praveen Jayachandran

230

19 Mar 2025

Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models

261

21 Feb 2025

MoM: Linear Sequence Modeling with Mixture-of-Memories

555

19 Feb 2025

Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification

491

17 Feb 2025

CodeMonkeys: Scaling Test-Time Compute for Software Engineering

312

24 Jan 2025

Mind the Data Gap: Bridging LLMs to Enterprise Data Integration

207

31 Dec 2024

The Design of an LLM-powered Unstructured Analytics System

...

280

31 Dec 2024

Smoothie: Label Free Language Model Routing

Neural Information Processing Systems (NeurIPS), 2024

260

06 Dec 2024

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues

International Conference on Learning Representations (ICLR), 2024

Katharina Eggensperger

Massimiliano Pontil

747

19 Nov 2024

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

Proceedings of the VLDB Endowment (PVLDB), 2024

Shreya Shankar

Tristan Chambers

Eugene Wu

Aditya G. Parameswaran

LLMAG

358

16 Oct 2024

Reward-Robust RLHF in LLMs

Yuzi Yan

Xingzhou Lou

Jialian Li

Yiping Zhang

Jian Xie

Chao Yu

Yu Wang

Dong Yan

Yuan Shen

368

18 Sep 2024

Large Language Models are Pattern Matchers: Editing Semi-Structured and Structured Documents with ChatGPT

Irene Weber

KELM AI4MH

206

12 Sep 2024

Longhorn: State Space Models are Amortized Online Learners

422

19 Jul 2024

A Declarative System for Optimizing AI Workloads

Michael Cafarella

237

23 May 2024

Chameleon: Foundation Models for Fairness-aware Multi-modal Data Augmentation to Enhance Coverage of Minorities

Mahdi Erfanian

H. V. Jagadish

Abolfazl Asudeh

174

02 Feb 2024

Gated Linear Attention Transformers with Hardware-Efficient Training

Bailin Wang

443

300

11 Dec 2023

Jellyfish: A Large Language Model for Data Preprocessing

516

04 Dec 2023

SEED: Domain-Specific Data Curation With Large Language Models

Michael Cafarella

270

01 Oct 2023

Generative Benchmark Creation for Table Union Search

170

07 Aug 2023

TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage

...

341

07 Aug 2023