Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes

Proceedings of the VLDB Endowment (PVLDB), 2023
19 April 2023
Simran Arora
Brandon Yang
Sabri Eyuboglu
A. Narayan
Andrew Hojel
Immanuel Trummer
Christopher Ré
    SyDa

Papers citing "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"

37 / 87 papers shown
Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data
Ruoling Peng
Kang Liu
Po Yang
Zhipeng Yuan
Shunbao Li
153
42
0
06 Aug 2023
CHORUS: Foundation Models for Unified Data Discovery and Exploration
Proceedings of the VLDB Endowment (PVLDB), 2023
Moe Kayali
A. Lykov
Ilias Fountalis
N. Vasiloglou
Dan Olteanu
Dan Suciu
236
41
0
16 Jun 2023
Large Language Models as Tool Makers
International Conference on Learning Representations (ICLR), 2023
Tianle Cai
Xuezhi Wang
Tengyu Ma
Xinyun Chen
Denny Zhou
LLMAG
256
257
0
26 May 2023
Enabling and Analyzing How to Efficiently Extract Information from Hybrid Long Documents with LLMs
C. Yue
Xinru Xu
Xiaojun Ma
Lun Du
Hengyu Liu
Zhiming Ding
Yanbing Jiang
Shi Han
Dongmei Zhang
143
4
0
24 May 2023
From Words to Code: Harnessing Data for Program Synthesis from Natural Language
Anirudh Khatry
Joyce Cahoon
Jordan Henkel
Shaleen Deep
Venkatesh Emani
...
Vu Le
Mohammad Raza
Sherry Shi
Mukul Singh
A. Tiwari
218
16
0
02 May 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
International Conference on Machine Learning (ICML), 2023
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
412
553
0
13 Mar 2023
Construction of Knowledge Graphs: State and Challenges
Marvin Hofer
Daniel Obraczka
A. Saeedi
Hanna Köpcke
Erhard Rahm
282
60
0
22 Feb 2023
ChatGPT: Jack of all trades, master of none
Information Fusion (Inf. Fusion), 2023
Jan Kocoń
Igor Cichecki
Oliwier Kaszyca
Mateusz Kochanek
Dominika Szydło
...
Maciej Piasecki
Łukasz Radliński
Konrad Wojtasik
Stanisław Woźniak
Przemysław Kazienko
AI4MH
505
669
0
21 Feb 2023
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Omar Khattab
Keshav Santhanam
Xiang Lisa Li
David Leo Wright Hall
Percy Liang
Christopher Potts
Matei A. Zaharia
RALM, KELM
238
335
0
28 Dec 2022
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
International Conference on Machine Learning (ICML), 2022
Yuhang Lai
Chengxi Li
Yiming Wang
Tianyi Zhang
Ruiqi Zhong
Luke Zettlemoyer
Scott Yih
Daniel Fried
Si-yi Wang
Tao Yu
ELM, ALM
271
437
0
18 Nov 2022
Ask Me Anything: A simple strategy for prompting language models
International Conference on Learning Representations (ICLR), 2022
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM, LRM
607
253
0
05 Oct 2022
Operationalizing Machine Learning: An Interview Study
Shreya Shankar
Rolando Garcia
J. M. Hellerstein
Aditya G. Parameswaran
199
61
0
16 Sep 2022
Large Language Models are Few-Shot Clinical Information Extractors
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Monica Agrawal
S. Hegselmann
Hunter Lang
Yoon Kim
David Sontag
BDL, LM&MA
556
419
0
25 May 2022
A Survey on Neural Open Information Extraction: Current Status and Future Directions
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Shaowen Zhou
Bowen Yu
Aixin Sun
Cheng Long
Jingyang Li
Haiyang Yu
Jianguo Sun
Yongbin Li
211
40
0
24 May 2022
Can Foundation Models Wrangle Your Data?
Proceedings of the VLDB Endowment (PVLDB), 2022
A. Narayan
Ines Chami
Laurel J. Orr
Simran Arora
Christopher Ré
LMTD, AI4CE
420
281
0
20 May 2022
Language Models in the Loop: Incorporating Prompting into Weak Supervision
ACM / IMS Journal of Data Science (JDS), 2022
Ryan Smith
Jason Alan Fries
Braden Hancock
Stephen H. Bach
274
61
0
04 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
International Conference on Learning Representations (ICLR), 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM, BDL, LRM, AI4CE
1.9K
5,363
0
21 Mar 2022
Reasoning over Public and Private Data in Retrieval-Based Systems
Transactions of the Association for Computational Linguistics (TACL), 2022
Simran Arora
Patrick Lewis
Angela Fan
Jacob Kahn
Christopher Ré
160
29
0
14 Mar 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
2.0K
17,148
0
04 Mar 2022
A Survey on Retrieval-Augmented Text Generation
Huayang Li
Yixuan Su
Deng Cai
Yan Wang
Lemao Liu
RALM
343
258
0
02 Feb 2022
DOM-LM: Learning Generalizable Representations for HTML Documents
Xiang Deng
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Huan Sun
AI4TS, AI4CE
215
42
0
25 Jan 2022
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
396
960
0
01 Dec 2021
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts
Tongshuang Wu
Michael Terry
Carrie J. Cai
LLMAG, AI4CE, LRM
333
565
0
04 Oct 2021
Can Deep Neural Networks Predict Data Correlations from Column Names?
Immanuel Trummer
157
8
0
09 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
814
2,517
0
31 Dec 2020
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
International Conference on Learning Representations (ICLR), 2020
Benedikt Boecking
Willie Neiswanger
Eric Xing
A. Dubrawski
NoLa, OffRL
339
75
0
11 Dec 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
587
3,349
0
05 Jun 2020
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.0K
51,682
0
28 May 2020
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
Colin Lockard
Prashant Shiralkar
Xin Luna Dong
Hannaneh Hajishirzi
115
60
0
14 May 2020
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
International Conference on Machine Learning (ICML), 2020
Daniel Y. Fu
Mayee F. Chen
Frederic Sala
Sarah Hooper
Kayvon Fatahalian
Christopher Ré
OffRL
268
125
0
27 Feb 2020
Learning Dependency Structures for Weak Supervision Models
P. Varma
Frederic Sala
A. He
Alexander Ratner
Christopher Ré
NoLa
178
68
0
14 Mar 2019
A Survey on Open Information Extraction
C. Niklaus
Matthias Cetto
André Freitas
Siegfried Handschuh
131
194
0
14 Jun 2018
Neural Open Information Extraction
Lei Cui
Furu Wei
M. Zhou
165
161
0
11 May 2018
Snorkel: Rapid Training Data Creation with Weak Supervision
Alexander Ratner
Stephen H. Bach
Henry R. Ehrenberg
Jason Alan Fries
Sen Wu
Christopher Ré
306
1,083
0
28 Nov 2017
Reading Wikipedia to Answer Open-Domain Questions
Danqi Chen
Adam Fisch
Jason Weston
Antoine Bordes
RALM
376
2,138
0
31 Mar 2017
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
676
8,866
0
16 Jun 2016
Incremental Knowledge Base Construction Using DeepDive
Jaeho Shin
Sen Wu
Feiran Wang
Christopher De Sa
Ce Zhang
Christopher Ré
CLL, HAI
392
292
0
03 Feb 2015