arXiv: 1707.07328
Adversarial Examples for Evaluating Reading Comprehension Systems

23 July 2017
Robin Jia
Percy Liang
    AAML
    ELM

Papers citing "Adversarial Examples for Evaluating Reading Comprehension Systems"

50 / 890 papers shown
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation
Yulia Otmakhova
Hung Thinh Truong
Rahmad Mahendra
Zenan Zhai
Rongxin Zhu
Daniel Beck
Jey Han Lau
ELM
70
0
0
24 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
80
0
0
21 Apr 2025
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions
Wang Zhu
Tianqi Chen
Ching Ying Lin
Jade Law
Mazen Jizzini
Jorge J. Nieva
Ruishan Liu
Robin Jia
34
0
0
15 Apr 2025
QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models
Yudong Zhang
Ruobing Xie
Jiansheng Chen
Xingchen Sun
Zhanhui Kang
Yu Wang
AAML
31
0
0
15 Apr 2025
On the Robustness of GUI Grounding Models Against Image Attacks
Haoren Zhao
Tianyi Chen
Zhen Wang
AAML
36
0
0
07 Apr 2025
When is dataset cartography ineffective? Using training dynamics does not improve robustness against Adversarial SQuAD
Paul K. Mandal
AAML
66
0
0
24 Mar 2025
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Ailin Deng
Tri Cao
Zhirui Chen
Bryan Hooi
VLM
98
2
0
04 Mar 2025
Shh, don't say that! Domain Certification in LLMs
Cornelius Emde
Alasdair Paren
Preetham Arvind
Maxime Kayser
Tom Rainforth
Thomas Lukasiewicz
Guohao Li
Philip H. S. Torr
Adel Bibi
53
1
0
26 Feb 2025
MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification
Varun Vashisht
Shri Kiran Srinivasan
Mihir Konduskar
Jaskaran Singh Walia
Vukosi Marivate
47
0
0
25 Feb 2025
Pay Attention to Real World Perturbations! Natural Robustness Evaluation in Machine Reading Comprehension
Yulong Wu
Viktor Schlegel
R. Batista-Navarro
AAML
36
0
0
23 Feb 2025
Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores
Jamshid Mozafari
Abdelrahman Abdallah
Bhawna Piryani
Adam Jatowt
47
0
0
22 Feb 2025
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge
Daniel Tamayo
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
KELM
93
3
0
04 Feb 2025
Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant
Gaole He
Gianluca Demartini
U. Gadiraju
LLMAG
65
8
0
03 Feb 2025
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
Isha Gupta
David Khachaturov
Robert D. Mullins
AAML
AuLLM
65
1
0
02 Feb 2025
Assessing and Enhancing the Robustness of Large Language Models with Task Structure Variations for Logical Reasoning
Qiming Bao
Gael Gendron
A. Peng
Wanjun Zhong
N. Tan
Yang Chen
Michael Witbrock
Jiaheng Liu
LRM
ELM
68
2
0
20 Jan 2025
Differentiable Adversarial Attacks for Marked Temporal Point Processes
Pritish Chakraborty
Vinayak Gupta
R. Raj
Srikanta J. Bedathur
A. De
AAML
186
0
0
17 Jan 2025
On the uncertainty principle of neural networks
Jun-Jie Zhang
Dong-xiao Zhang
Jian-Nan Chen
L. Pang
Deyu Meng
57
2
0
17 Jan 2025
FlippedRAG: Black-Box Opinion Manipulation Adversarial Attacks to Retrieval-Augmented Generation Models
Zhuo Chen
Jiawei Liu
Miaokun Chen
Haotan Liu
Qikai Cheng
Fan Zhang
Wei Lu
Xiaozhong Liu
Xiaofeng Wang
AAML
44
1
0
06 Jan 2025
Adversarial Robustness through Dynamic Ensemble Learning
Hetvi Waghela
Jaydip Sen
Sneha Rakshit
AAML
85
0
0
20 Dec 2024
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
90
3
0
18 Dec 2024
Adversarial Hubness in Multi-Modal Retrieval
Tingwei Zhang
Fnu Suya
Rishi Jha
Collin Zhang
Vitaly Shmatikov
AAML
83
1
0
18 Dec 2024
Pay Attention to the Robustness of Chinese Minority Language Models! Syllable-level Textual Adversarial Attack on Tibetan Script
Xi Cao
Dolma Dawa
Nuo Qun
Trashi Nyima
AAML
91
3
0
03 Dec 2024
The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims
Shu Zhou
Xin Wang
Zhengda Zhou
Haohan Yi
Xuhui Zheng
Hao Wan
77
1
0
21 Nov 2024
IAE: Irony-based Adversarial Examples for Sentiment Analysis Systems
Xiaoyin Yi
Jiacheng Huang
AAML
59
0
0
12 Nov 2024
Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw
Megan Hong-Thanh Chung
Maria Mahbub
Amir Sadovnik
AAML
39
0
0
16 Oct 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Jing Jiang
Min-Bin Lin
41
8
0
09 Oct 2024
TaeBench: Improving Quality of Toxic Adversarial Examples
Xuan Zhu
Dmitriy Bespalov
Liwen You
Ninad Kulkarni
Yanjun Qi
AAML
63
0
0
08 Oct 2024
ECon: On the Detection and Resolution of Evidence Conflicts
Cheng Jiayang
Chunkit Chan
Qianqian Zhuang
Lin Qiu
Tianhang Zhang
Tengxiao Liu
Yangqiu Song
Yue Zhang
Pengfei Liu
Zheng Zhang
38
1
0
05 Oct 2024
Gamified crowd-sourcing of high-quality data for visual fine-tuning
Shashank Yadav
Rohan Tomar
Garvit Jain
Chirag Ahooja
Shubham Chaudhary
Charles Elkan
33
0
0
05 Oct 2024
Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology
Son Quoc Tran
Matt Kretchmar
OOD
19
0
0
29 Sep 2024
Responsible AI in Open Ecosystems: Reconciling Innovation with Risk Assessment and Disclosure
Mahasweta Chakraborti
Bert Joseph Prestoza
Nicholas Vincent
Seth Frey
39
1
0
27 Sep 2024
DARE: Diverse Visual Question Answering with Robustness Evaluation
Hannah Sterz
Jonas Pfeiffer
Ivan Vulić
OOD
VLM
21
2
0
26 Sep 2024
Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models
Yuqing Zhou
Ruixiang Tang
Ziyu Yao
Ziwei Zhu
36
2
0
26 Sep 2024
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses
Hung-Ting Su
Ya-Ching Hsu
Xudong Lin
Xiang Qian Shi
Yulei Niu
Han-Yuan Hsu
Hung-yi Lee
Winston H. Hsu
LRM
36
0
0
22 Sep 2024
Contextual Breach: Assessing the Robustness of Transformer-based QA Models
Asir Saadat
Nahian Ibn Asad
Md Farhan Ishmam
AAML
43
0
0
17 Sep 2024
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
Nathaniel Li
Ziwen Han
Ian Steneker
Willow Primack
Riley Goodside
Hugh Zhang
Zifan Wang
Cristina Menghini
Summer Yue
AAML
MU
46
40
0
27 Aug 2024
Adversarial Attack for Explanation Robustness of Rationalization Models
Yuankai Zhang
Lingxiao Kong
Haozhao Wang
Ruixuan Li
Jun Wang
Yuhua Li
Wei Liu
AAML
35
1
0
20 Aug 2024
Investigating a Benchmark for Training-set free Evaluation of Linguistic Capabilities in Machine Reading Comprehension
Viktor Schlegel
Goran Nenadic
R. Batista-Navarro
ELM
32
0
0
09 Aug 2024
Optimal and efficient text counterfactuals using Graph Neural Networks
Dimitris Lymperopoulos
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
31
1
0
04 Aug 2024
Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent
Hetvi Waghela
Jaydip Sen
Sneha Rakshit
AAML
SILM
25
2
0
29 Jul 2024
Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models
Zhuo Chen
Jiawei Liu
Haotan Liu
Qikai Cheng
Wei Lu
Xiaozhong Liu
AAML
36
6
0
18 Jul 2024
AutoBencher: Towards Declarative Benchmark Construction
Xiang Lisa Li
E. Liu
Percy Liang
Tatsunori Hashimoto
48
2
0
11 Jul 2024
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective
Yu-An Liu
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
35
6
0
09 Jul 2024
The Art of Saying No: Contextual Noncompliance in Language Models
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
...
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
72
20
0
02 Jul 2024
ViANLI: Adversarial Natural Language Inference for Vietnamese
Tin Van Huynh
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
22
0
0
25 Jun 2024
It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension
Sagi Shaier
Lawrence E Hunter
K. Wense
44
3
0
24 Jun 2024
First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning
Yoichi Aoki
Keito Kudo
Tatsuki Kuribayashi
Shusaku Sone
Masaya Taniguchi
Keisuke Sakaguchi
Kentaro Inui
LRM
29
1
0
23 Jun 2024
Saliency Attention and Semantic Similarity-Driven Adversarial Perturbation
Hetvi Waghela
Jaydip Sen
Sneha Rakshit
AAML
28
4
0
18 Jun 2024
People will agree what I think: Investigating LLM's False Consensus Effect
Junhyuk Choi
Yeseon Hong
Bugeun Kim
54
0
0
16 Jun 2024
RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation
Kiseung Kim
Jay-Yoon Lee
RALM
40
5
0
09 Jun 2024