Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes

Proceedings of the VLDB Endowment (PVLDB), 2023
19 April 2023
Simran Arora
Brandon Yang
Sabri Eyuboglu
A. Narayan
Andrew Hojel
Immanuel Trummer
Christopher Ré
    SyDa

Papers citing "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"

50 / 86 papers shown
Structured RAG for Answering Aggregative Questions
Omri Koshorek
Niv Granot
Aviv Alloni
Shahar Admati
Roee Hendel
Ido Weiss
Alan Arazi
Shay-Nitzan Cohen
Yonatan Belinkov
RALM
215
0
0
11 Nov 2025
SRE-Llama -- Fine-Tuned Meta's Llama LLM, Federated Learning, Blockchain and NFT Enabled Site Reliability Engineering (SRE) Platform for Communication and Networking Software Services
International Conference on Blockchain Computing and Applications (BCCA), 2025
Eranga Bandara
Safdar H. Bouk
Sachin Shetty
Ravi Mukkamala
A. Rahman
Peter Foytik
Ross Gore
Xueping Liang
Ng Wee Keong
Kasun De Zoysa
48
1
0
11 Nov 2025
Cortex AISQL: A Production SQL Engine for Unstructured Data
Paweł Liskowski
Bowei Chen
Paritosh Aggarwal
Benjamin Han
Boxin Jiang
...
Jay Tayade
Weicheng Zhao
Anupam Datta
Nathan Wiegand
Dimitris Tsirogiannis
70
2
0
10 Nov 2025
Attention and Compression is all you need for Controllably Efficient Language Models
Jatin Prakash
A. Puli
Rajesh Ranganath
MQ, VLM
418
0
0
07 Nov 2025
Relational Deep Dive: Error-Aware Queries Over Unstructured Data
Daren Chao
Kaiwen Chen
Naiqing Guan
Nick Koudas
98
0
0
04 Nov 2025
AGRAG: Advanced Graph-based Retrieval-Augmented Generation for LLMs
Y. Wang
Haoyang Li
Fei Teng
Lei Chen
LRM
84
0
0
02 Nov 2025
FlashEVA: Accelerating LLM inference via Efficient Attention
Juan Gabriel Kostelec
Qinghai Guo
143
0
0
01 Nov 2025
Standardization of Psychiatric Diagnoses -- Role of Fine-tuned LLM Consortium and OpenAI-gpt-oss Reasoning LLM Enabled Decision Support System
Eranga Bandara
Ross Gore
Atmaram Yarlagadda
Anita H. Clayton
Preston Samuel
Christopher Rhea
Sachin Shetty
AI4MH
136
2
0
29 Oct 2025
TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yizhu Jiao
S. Li
Sizhe Zhou
Heng Ji
Jiawei Han
105
8
0
28 Oct 2025
Agentsway -- Software Development Methodology for AI Agents-based Teams
Eranga Bandara
Ross Gore
Xueping Liang
Sachini Rajapakse
Isurunima Kularathne
...
Amin Hass
Ng Wee Keong
Kasun De Zoysa
Aruna Withanage
Nilaan Loganathan
LLMAG, AI4TS, AIFin
286
2
0
26 Oct 2025
Model Context Contracts - MCP-Enabled Framework to Integrate LLMs With Blockchain Smart Contracts
Eranga Bandara
Sachin Shetty
Ravi Mukkamala
Ross Gore
Peter Foytik
...
Xueping Liang
Ng Wee Keong
Kasun De Zoysa
Aruna Withanage
Nilaan Loganathan
56
2
0
21 Oct 2025
Implementing Semantic Join Operators Efficiently
Immanuel Trummer
92
0
0
09 Oct 2025
LLM/Agent-as-Data-Analyst: A Survey
Zirui Tang
Weizheng Wang
Z. Zhou
Yang Jiao
Bangrui Xu
...
Conghui He
Bin Wang
Xiaoyang Wang
Fan Wu
178
6
0
28 Sep 2025
ScaleDoc: Scaling LLM-based Predicates over Large Document Collections
Hengrui Zhang
Yulong Hui
Yihao Liu
Huanchen Zhang
OffRL
83
0
0
16 Sep 2025
A Survey on Retrieval And Structuring Augmented Generation with Large Language Models
Pengcheng Jiang
Siru Ouyang
Yizhu Jiao
Ming Zhong
Runchu Tian
Jiawei Han
RALM, KELM
140
3
0
12 Sep 2025
Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
Sepanta Zeighami
Shreya Shankar
Aditya G. Parameswaran
85
3
0
02 Sep 2025
A Survey on Open Dataset Search in the LLM Era: Retrospectives and Perspectives
Pengyue Li
Sheng Wang
Hua Dai
Zhiyu Zoey Chen
Z. Bao
Brian D. Davison
74
0
0
31 Aug 2025
ST-Raptor: LLM-Powered Semi-Structured Table Question Answering
Zirui Tang
Boyu Niu
Xuanhe Zhou
Boxiu Li
Wei Zhou
Jiannan Wang
Guoliang Li
Xinyi Zhang
Fan Wu
LMTD
159
2
0
25 Aug 2025
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu
Qinghao Hu
Shang Yang
Haocheng Xi
Junyu Chen
Song Han
Han Cai
164
10
0
21 Aug 2025
LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues
Haoyang Li
Zhanchao Xu
Yiming Li
Xuejia Chen
Darian Li
...
Cheng Deng
Jun Wang
Qing Li
Lei Chen
Mingxuan Yuan
164
1
0
18 Jul 2025
Context-Informed Grounding Supervision
Hyunji Lee
Seunghyun Yoon
Yunjae Won
Hanseok Oh
Geewook Kim
Trung H. Bui
Franck Dernoncourt
Elias Stengel-Eskin
Mohit Bansal
Minjoon Seo
LRM
226
2
0
18 Jun 2025
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
J. Oswald
Nino Scherrer
Seijin Kobayashi
Luca Versari
Songlin Yang
...
Guillaume Lajoie
Charlotte Frenkel
Razvan Pascanu
Blaise Agüera y Arcas
João Sacramento
269
12
0
05 Jun 2025
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie
Morris Yau
Samuel J. Gershman
159
3
0
31 May 2025
Towards Scalable Schema Mapping using Large Language Models
Christopher Buss
Mahdis Safari
Arash Termehchy
Stefan Lee
David Maier
112
3
0
30 May 2025
ATLAS: Learning to Optimally Memorize the Context at Test Time
Ali Behrouz
Zeman Li
Praneeth Kacham
Majid Daliri
Yuan Deng
Peilin Zhong
Meisam Razaviyayn
Vahab Mirrokni
360
22
0
29 May 2025
SQUiD: Synthesizing Relational Databases from Unstructured Text
Mushtari Sadia
Zhenning Yang
Yunming Xiao
Ang Chen
Amrita Roy Chowdhury
SyDa
197
1
0
25 May 2025
How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation
Xin Lu
Yanyan Zhao
Si Wei
Shijin Wang
Bing Qin
Ting Liu
156
0
0
24 May 2025
Efficient LLM Serving on Hybrid Real-time and Best-effort Requests
Wan Borui
Zhao Juntao
Jiang Chenyu
Guo Chuanxiong
Wu Chuan
VLM
258
7
0
13 Apr 2025
Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases
IEEE International Conference on Data Engineering (ICDE), 2025
Teng Lin
244
2
0
08 Apr 2025
LLM-Aided Customizable Profiling of Code Data Based On Programming Language Concepts
Pankaj Thorat
Adnan Qidwai
Adrija Dhar
Aishwariya Chakraborty
Anand Eswaran
Hima Patel
Praveen Jayachandran
178
1
0
19 Mar 2025
Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
A. Narayan
D. Biderman
Sabri Eyuboglu
Avner May
Scott W. Linderman
James Zou
Christopher Ré
229
9
0
21 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
450
14
0
19 Feb 2025
Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification
Yubo Wang
Haoyang Li
Fei Teng
Lei Chen
406
3
0
17 Feb 2025
CodeMonkeys: Scaling Test-Time Compute for Software Engineering
Ryan Ehrlich
Bradley Brown
Jordan Juravsky
Ronald Clark
Christopher Ré
Azalia Mirhoseini
271
25
0
24 Jan 2025
Mind the Data Gap: Bridging LLMs to Enterprise Data Integration
Moe Kayali
Fabian Wenz
Nesime Tatbul
Çağatay Demiralp
167
5
0
31 Dec 2024
The Design of an LLM-powered Unstructured Analytics System
Eric Anderson
Jonathan Fritz
Austin Lee
Bohou Li
Mark Lindblad
...
Mehul A. Shah
Benjamin Sowell
Dan Tecuci
Vinayak Thapliyal
Matt Welsh
256
27
0
31 Dec 2024
Smoothie: Label Free Language Model Routing
Neural Information Processing Systems (NeurIPS), 2024
Neel Guha
Mayee F. Chen
Trevor Chow
Ishan S. Khare
Christopher Ré
231
19
0
06 Dec 2024
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
International Conference on Learning Representations (ICLR), 2024
Riccardo Grazzi
Julien N. Siems
Jörg Franke
Arber Zela
Katharina Eggensperger
Massimiliano Pontil
660
42
0
19 Nov 2024
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Proceedings of the VLDB Endowment (PVLDB), 2024
Shreya Shankar
Tristan Chambers
Eugene Wu
Aditya G. Parameswaran
LLMAG
261
28
0
16 Oct 2024
Reward-Robust RLHF in LLMs
Yuzi Yan
Xingzhou Lou
Jialian Li
Yiping Zhang
Jian Xie
Chao Yu
Yu Wang
Dong Yan
Yuan Shen
308
17
0
18 Sep 2024
Large Language Models are Pattern Matchers: Editing Semi-Structured and Structured Documents with ChatGPT
Irene Weber
KELM, AI4MH
181
1
0
12 Sep 2024
Longhorn: State Space Models are Amortized Online Learners
Bo Liu
Rui Wang
Lemeng Wu
Yihao Feng
Peter Stone
Qian Liu
343
28
0
19 Jul 2024
A Declarative System for Optimizing AI Workloads
Chunwei Liu
Matthew Russo
Michael Cafarella
Lei Cao
Peter Baille Chen
Zui Chen
Michael Franklin
Tim Kraska
Samuel Madden
Gerardo Vitagliano
185
45
0
23 May 2024
Chameleon: Foundation Models for Fairness-aware Multi-modal Data Augmentation to Enhance Coverage of Minorities
Mahdi Erfanian
H. V. Jagadish
Abolfazl Asudeh
149
6
0
02 Feb 2024
Gated Linear Attention Transformers with Hardware-Efficient Training
Aaron Courville
Bailin Wang
Songlin Yang
Yikang Shen
Yoon Kim
337
291
0
11 Dec 2023
Jellyfish: A Large Language Model for Data Preprocessing
Haochen Zhang
Yuyang Dong
Chuan Xiao
Masafumi Oyamada
436
36
0
04 Dec 2023
SEED: Domain-Specific Data Curation With Large Language Models
Zui Chen
Lei Cao
Samuel Madden
Tim Kraska
Zeyuan Shang
Ju Fan
Nan Tang
Zihui Gu
Chunwei Liu
Michael Cafarella
230
12
0
01 Oct 2023
Generative Benchmark Creation for Table Union Search
Koyena Pal
Aamod Khatiwada
Roee Shraga
Renée J. Miller
153
2
0
07 Aug 2023
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage
Jingqing Ruan
Yihong Chen
Bin Zhang
Zhiwei Xu
Tianpeng Bao
...
Shiwei Shi
Hangyu Mao
Ziyue Li
Xingyu Zeng
Rui Zhao
LLMAG, LM&Ro
245
49
0
07 Aug 2023
Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data
Ruoling Peng
Kang Liu
Po Yang
Zhipeng Yuan
Shunbao Li
137
41
0
06 Aug 2023