2304.09433
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes
Proceedings of the VLDB Endowment (PVLDB), 2023
19 April 2023
Simran Arora
Brandon Yang
Sabri Eyuboglu
A. Narayan
Andrew Hojel
Immanuel Trummer
Christopher Ré
SyDa
Papers citing
"Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"
37 / 87 papers shown
Title
Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data
Ruoling Peng
Kang Liu
Po Yang
Zhipeng Yuan
Shunbao Li
153
42
0
06 Aug 2023
CHORUS: Foundation Models for Unified Data Discovery and Exploration
Proceedings of the VLDB Endowment (PVLDB), 2023
Moe Kayali
A. Lykov
Ilias Fountalis
N. Vasiloglou
Dan Olteanu
Dan Suciu
236
41
0
16 Jun 2023
Large Language Models as Tool Makers
International Conference on Learning Representations (ICLR), 2023
Tianle Cai
Xuezhi Wang
Tengyu Ma
Xinyun Chen
Denny Zhou
LLMAG
256
257
0
26 May 2023
Enabling and Analyzing How to Efficiently Extract Information from Hybrid Long Documents with LLMs
C. Yue
Xinru Xu
Xiaojun Ma
Lun Du
Hengyu Liu
Zhiming Ding
Yanbing Jiang
Shi Han
Dongmei Zhang
143
4
0
24 May 2023
From Words to Code: Harnessing Data for Program Synthesis from Natural Language
Anirudh Khatry
Joyce Cahoon
Jordan Henkel
Shaleen Deep
Venkatesh Emani
...
Vu Le
Mohammad Raza
Sherry Shi
Mukul Singh
A. Tiwari
218
16
0
02 May 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
International Conference on Machine Learning (ICML), 2023
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
412
553
0
13 Mar 2023
Construction of Knowledge Graphs: State and Challenges
Marvin Hofer
Daniel Obraczka
A. Saeedi
Hanna Köpcke
Erhard Rahm
282
60
0
22 Feb 2023
ChatGPT: Jack of all trades, master of none
Information Fusion (Inf. Fusion), 2023
Jan Kocoń
Igor Cichecki
Oliwier Kaszyca
Mateusz Kochanek
Dominika Szydło
...
Maciej Piasecki
Łukasz Radliński
Konrad Wojtasik
Stanisław Woźniak
Przemysław Kazienko
AI4MH
505
669
0
21 Feb 2023
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Omar Khattab
Keshav Santhanam
Xiang Lisa Li
David Leo Wright Hall
Percy Liang
Christopher Potts
Matei A. Zaharia
RALM
KELM
238
335
0
28 Dec 2022
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
International Conference on Machine Learning (ICML), 2022
Yuhang Lai
Chengxi Li
Yiming Wang
Tianyi Zhang
Ruiqi Zhong
Luke Zettlemoyer
Scott Yih
Daniel Fried
Sida Wang
Tao Yu
ELM
ALM
271
437
0
18 Nov 2022
Ask Me Anything: A simple strategy for prompting language models
International Conference on Learning Representations (ICLR), 2022
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
607
253
0
05 Oct 2022
Operationalizing Machine Learning: An Interview Study
Shreya Shankar
Rolando Garcia
J. M. Hellerstein
Aditya G. Parameswaran
199
61
0
16 Sep 2022
Large Language Models are Few-Shot Clinical Information Extractors
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Monica Agrawal
S. Hegselmann
Hunter Lang
Yoon Kim
David Sontag
BDL
LM&MA
556
419
0
25 May 2022
A Survey on Neural Open Information Extraction: Current Status and Future Directions
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Shaowen Zhou
Bowen Yu
Aixin Sun
Cheng Long
Jingyang Li
Haiyang Yu
Jianguo Sun
Yongbin Li
211
40
0
24 May 2022
Can Foundation Models Wrangle Your Data?
Proceedings of the VLDB Endowment (PVLDB), 2022
A. Narayan
Ines Chami
Laurel J. Orr
Simran Arora
Christopher Ré
LMTD
AI4CE
420
281
0
20 May 2022
Language Models in the Loop: Incorporating Prompting into Weak Supervision
ACM / IMS Journal of Data Science (JDS), 2022
Ryan Smith
Jason Alan Fries
Braden Hancock
Stephen H. Bach
274
61
0
04 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
International Conference on Learning Representations (ICLR), 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
1.9K
5,363
0
21 Mar 2022
Reasoning over Public and Private Data in Retrieval-Based Systems
Transactions of the Association for Computational Linguistics (TACL), 2022
Simran Arora
Patrick Lewis
Angela Fan
Jacob Kahn
Christopher Ré
160
29
0
14 Mar 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
2.0K
17,148
0
04 Mar 2022
A Survey on Retrieval-Augmented Text Generation
Huayang Li
Yixuan Su
Deng Cai
Yan Wang
Lemao Liu
RALM
343
258
0
02 Feb 2022
DOM-LM: Learning Generalizable Representations for HTML Documents
Xiang Deng
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Huan Sun
AI4TS
AI4CE
215
42
0
25 Jan 2022
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
396
960
0
01 Dec 2021
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts
Tongshuang Wu
Michael Terry
Carrie J. Cai
LLMAG
AI4CE
LRM
333
565
0
04 Oct 2021
Can Deep Neural Networks Predict Data Correlations from Column Names?
Immanuel Trummer
157
8
0
09 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
814
2,517
0
31 Dec 2020
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
International Conference on Learning Representations (ICLR), 2020
Benedikt Boecking
Willie Neiswanger
Eric Xing
A. Dubrawski
NoLa
OffRL
339
75
0
11 Dec 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
587
3,349
0
05 Jun 2020
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.0K
51,682
0
28 May 2020
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
Colin Lockard
Prashant Shiralkar
Xin Luna Dong
Hannaneh Hajishirzi
115
60
0
14 May 2020
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
International Conference on Machine Learning (ICML), 2020
Daniel Y. Fu
Mayee F. Chen
Frederic Sala
Sarah Hooper
Kayvon Fatahalian
Christopher Ré
OffRL
268
125
0
27 Feb 2020
Learning Dependency Structures for Weak Supervision Models
P. Varma
Frederic Sala
A. He
Alexander Ratner
Christopher Ré
NoLa
178
68
0
14 Mar 2019
A Survey on Open Information Extraction
C. Niklaus
Matthias Cetto
André Freitas
Siegfried Handschuh
131
194
0
14 Jun 2018
Neural Open Information Extraction
Lei Cui
Furu Wei
M. Zhou
165
161
0
11 May 2018
Snorkel: Rapid Training Data Creation with Weak Supervision
Alexander Ratner
Stephen H. Bach
Henry R. Ehrenberg
Jason Alan Fries
Sen Wu
Christopher Ré
306
1,083
0
28 Nov 2017
Reading Wikipedia to Answer Open-Domain Questions
Danqi Chen
Adam Fisch
Jason Weston
Antoine Bordes
RALM
376
2,138
0
31 Mar 2017
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
676
8,866
0
16 Jun 2016
Incremental Knowledge Base Construction Using DeepDive
Jaeho Shin
Sen Wu
Feiran Wang
Christopher De Sa
Ce Zhang
Christopher Ré
CLL
HAI
392
292
0
03 Feb 2015