ResearchTrend.AI › Papers › 1905.10044 › Cited By
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
24 May 2019
Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, Kristina Toutanova
ArXiv | PDF | HTML

Papers citing "BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions" (50 of 1,036 papers shown)
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao, Yuxing Song, Wenhao Wang, Haocheng Feng, Errui Ding, Yifan Sun, Xinyan Xiao, Jingdong Wang
DiffM | 24 Sep 2024

Small Language Models: Survey, Measurements, and Insights
Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu
ObjD, LRM | 24 Sep 2024

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI
Ambrish Rawat, Stefan Schoepf, Giulio Zizzo, Giandomenico Cornacchia, Muhammad Zaid Hameed, ..., Elizabeth M. Daly, Mark Purcell, P. Sattigeri, Pin-Yu Chen, Kush R. Varshney
AAML | 23 Sep 2024

Target-Aware Language Modeling via Granular Data Sampling
Ernie Chang, Pin-Jie Lin, Yang Li, Changsheng Zhao, Daeil Kim, Rastislav Rabatin, Zechun Liu, Yangyang Shi, Vikas Chandra
SyDa | 23 Sep 2024
Investigating Layer Importance in Large Language Models
Yang Zhang, Yanfei Dong, Kenji Kawaguchi
FAtt | 22 Sep 2024

OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang, V. Papyan
VLM | 20 Sep 2024

AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs
Basel Mousi, Nadir Durrani, Fatema Ahmad, Md. Arid Hasan, Maram Hasanain, Tameem Kabbani, Fahim Dalvi, Shammur A. Chowdhury, Firoj Alam
17 Sep 2024

Propulsion: Steering LLM with Tiny Fine-Tuning
Md. Kowsher, Nusrat Jahan Prottasha, Prakash Bhat
17 Sep 2024
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu, Windsor Nguyen, Yagiz Devre, Evan Dogariu, Anirudha Majumdar, Elad Hazan
AI4TS | 16 Sep 2024

FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition
Zhenhua Xu, Wenpeng Xing, Zhebo Wang, Chang Hu, Chen Jie, Meng Han
13 Sep 2024

Understanding Foundation Models: Are We Back in 1924?
Alan F. Smeaton
AI4CE | 11 Sep 2024

The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini, Pierre Ablin, David Grangier
ODL | 05 Sep 2024

Hyper-Compression: Model Compression via Hyperfunction
Fenglei Fan, Juntong Fan, Dayang Wang, Jingbo Zhang, Zelin Dong, Shijun Zhang, Ge Wang, Tieyong Zeng
01 Sep 2024
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
Yuncheng Yang, Yulei Qin, Tong Wu, Zihan Xu, Gang Li, ..., Yuchen Shi, Ke Li, Xing Sun, Jie Yang, Yun Gu
ALM, OffRL, MoE | 28 Aug 2024

Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
Konstantin Dobler, Gerard de Melo
28 Aug 2024

GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Egor Venediktov, Mariya Krylova, Aleksandr Zuev, Evgeny Burnaev
27 Aug 2024
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler
Yikang Shen, Matthew Stallone, Mayank Mishra, Gaoyuan Zhang, Shawn Tan, Aditya Prasad, Adriana Meza Soria, David D. Cox, Rameswar Panda
23 Aug 2024

First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models
Chi Ma, Mincong Huang, Ying Zhang, Chao Wang, Yujie Wang, Lei Yu, Chuan Liu, Wei Lin
AI4CE, LLMSV | 21 Aug 2024

CoDi: Conversational Distillation for Grounded Question Answering
Patrick Huber, Arash Einolghozati, Rylan Conway, Kanika Narang, Matt Smith, Waqar Nayyar, Adithya Sagar, Ahmed Aly, Akshat Shrivastava
20 Aug 2024

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou, Lili Yu, Arun Babu, Kushal Tirumala, Michihiro Yasunaga, Leonid Shamis, Jacob Kahn, Xuezhe Ma, Luke Zettlemoyer, Omer Levy
DiffM | 20 Aug 2024
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Viraat Aryabumi, Yixuan Su, Raymond Ma, Adrien Morisot, Ivan Zhang, Acyr F. Locatelli, Marzieh Fadaee, A. Ustun, Sara Hooker
SyDa, AI4CE | 20 Aug 2024

HMoE: Heterogeneous Mixture of Experts for Language Modeling
An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, ..., J. N. Han, Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-zhong Xu
MoE | 20 Aug 2024

LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models
Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, G. Chesi, Ngai Wong, Hao Yu
20 Aug 2024

QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning
Yilun Kong, Hangyu Mao, Qi Zhao, Bin Zhang, Jingqing Ruan, Li Shen, Yongzhe Chang, Xueqian Wang, Rui Zhao, Dacheng Tao
OffRL | 20 Aug 2024
ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
Jiaang Li, Quan Wang, Zhongnan Wang, Yongdong Zhang, Zhendong Mao
CLL, KELM | 19 Aug 2024

MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin, Shangqian Gao, James Seale Smith, Abhishek Patel, Shikhar Tuli, Yilin Shen, Hongxia Jin, Yen-Chang Hsu
19 Aug 2024

How Susceptible are LLMs to Influence in Prompts?
Sotiris Anagnostidis, Jannis Bulian
LRM | 17 Aug 2024

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Yulong Chen, Yang Liu, Jianhao Yan, X. Bai, Ming Zhong, Yinghao Yang, Ziyi Yang, Chenguang Zhu, Yue Zhang
ALM, ELM | 16 Aug 2024
Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge
Ravi Raju, Swayambhoo Jain, Bo Li, Jonathan Li, Urmish Thakker
ALM, ELM | 16 Aug 2024

ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models
Chao Zeng, Songwei Liu, Yusheng Xie, Hong Liu, Xiaojian Wang, Miao Wei, Shu Yang, Fangmin Chen, Xing Mei
MQ | 16 Aug 2024

ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws
Ruihang Li, Yixuan Wei, Miaosen Zhang, Nenghai Yu, Han Hu, Houwen Peng
15 Aug 2024

FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao, Menghang Dong, Rongyu Zhang, Wenzhao Zheng, Yunpeng Zhang, Huanrui Yang, Dalong Du, Kurt Keutzer, Shanghang Zhang
15 Aug 2024
15 Aug 2024
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative
  Self-Enhancement Paradigm
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm
Yiming Liang
Ge Zhang
Xingwei Qu
Tianyu Zheng
Jiawei Guo
...
Jiaheng Liu
Chenghua Lin
Lei Ma
Wenhao Huang
Jiajun Zhang
ALM
43
5
0
15 Aug 2024
FuseChat: Knowledge Fusion of Chat Models
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
31
23
0
15 Aug 2024
Large Language Models Prompting With Episodic Memory
Large Language Models Prompting With Episodic Memory
Dai Do
Quan Tran
Svetha Venkatesh
Hung Le
LLMAG
35
1
0
14 Aug 2024
Fast Training Dataset Attribution via In-Context Learning
Fast Training Dataset Attribution via In-Context Learning
Milad Fotouhi
M. T. Bahadori
Oluwaseyi Feyisetan
P. Arabshahi
David Heckerman
31
0
0
14 Aug 2024
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Karel D'Oosterlinck, Winnie Xu, Chris Develder, Thomas Demeester, A. Singh, Christopher Potts, Douwe Kiela, Shikib Mehri
12 Aug 2024

LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference
Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, ..., Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang
12 Aug 2024

Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks
Verna Dankers, Ivan Titov
09 Aug 2024

Zero-shot Factual Consistency Evaluation Across Domains
Raunak Agarwal
HILM | 07 Aug 2024

A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models
Pengxiang Zhao, Hanyu Hu, Ping Li, Yi Zheng, Zhefeng Wang, Xiaoming Yuan
07 Aug 2024
SARA: Singular-Value Based Adaptive Low-Rank Adaption
Jihao Gu, Shuai Chen, Zelin Wang, Yibo Zhang, Ping Gong
06 Aug 2024

A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios
Samuel Ackerman, Ella Rabinovich, E. Farchi, Ateret Anaby-Tavor
04 Aug 2024

Cross-layer Attention Sharing for Large Language Models
Yongyu Mu, Yuzhang Wu, Yuchun Fan, Chenglong Wang, Hengyu Li, Qiaozhi He, Murun Yang, Tong Xiao, Jingbo Zhu
04 Aug 2024

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong, Lujun Li, Dayou Du, Yuhan Chen, Zhenheng Tang, ..., Wei Xue, Wenhan Luo, Qi-fei Liu, Yi-Ting Guo, Xiaowen Chu
MQ | 03 Aug 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team: Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, ..., Noah Fiedel, Armand Joulin, Kathleen Kenealy, Robert Dadashi, Alek Andreev
VLM, MoE, OSLM | 31 Jul 2024

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Xi Victoria Lin, Akshat Shrivastava, Liang Luo, Srinivasan Iyer, Mike Lewis, Gargi Gosh, Luke Zettlemoyer, Armen Aghajanyan
MoE | 31 Jul 2024

PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning
Min Jae Jung, Romain Rouvoy
KELM, MoE, CLL | 31 Jul 2024

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang, Yuezhou Hu, Guohao Jian, Jun Zhu, Jianfei Chen
30 Jul 2024

Can Editing LLMs Inject Harm?
Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, ..., Xifeng Yan, William Wang, Philip H. S. Torr, Dawn Song, Kai Shu
KELM | 29 Jul 2024