Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary Chase Lipton
arXiv:2209.14389, 28 September 2022
Papers citing "Downstream Datasets Make Surprisingly Good Pretraining Corpora" (8 papers shown)
1. Utility Theory of Synthetic Data Generation
   Shi Xu, W. Sun, Guang Cheng (17 May 2023)

2. How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems
   Cara Van Uden, Jeremy Irvin, Mars Huang, N. Dean, J. Carr, A. Ng, C. Langlotz (13 May 2023)

3. Can Wikipedia Help Offline Reinforcement Learning?
   Machel Reid, Yutaro Yamada, S. Gu (28 Jan 2022)

4. Masked Autoencoders Are Scalable Vision Learners
   Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick (11 Nov 2021)

5. NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
   Xingcheng Yao, Yanan Zheng, Xiaocong Yang, Zhilin Yang (07 Nov 2021)

6. Mitigating Language-Dependent Ethnic Bias in BERT
   Jaimeen Ahn, Alice H. Oh (13 Sep 2021)

7. Language Models as Knowledge Bases?
   Fabio Petroni, Tim Rocktaschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel (03 Sep 2019)

8. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
   Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)