A Framework for Real-time Safeguarding the Text Generation of Large Language Model

29 April 2024
Ximing Dong
Dayi Lin
Shaowei Wang
Ahmed E. Hassan

Papers citing "A Framework for Real-time Safeguarding the Text Generation of Large Language Model"

31 / 31 papers shown
CARE: Decoding Time Safety Alignment via Rollback and Introspection Intervention
Xiaomeng Hu
Fei Huang
Chenhan Yuan
Junyang Lin
Tsung-Yi Ho
116
1
0
01 Sep 2025
LLM-Agent-UMF: LLM-based Agent Unified Modeling Framework for Seamless Design of Multi Active/Passive Core-Agent Architectures
Information Fusion (Inf. Fusion), 2024
Amine B. Hassouna
Hana Chaari
Ines Belhaj
LLMAG
271
8
0
17 Sep 2024
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
Yueqi Xie
Minghong Fang
Renjie Pi
Neil Zhenqiang Gong
246
60
0
21 Feb 2024
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Hakan Inan
Kartikeya Upasani
Jianfeng Chi
Rashi Rungta
Krithika Iyer
...
Michael Tontchev
Qing Hu
Brian Fuller
Davide Testuggine
Madian Khabsa
AI4MH
393
706
0
07 Dec 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM, HILM
338
1,778
0
09 Nov 2023
Copyright Violations and Large Language Models
Antonia Karamolegkou
Jiaang Li
Li Zhou
Anders Sogaard
186
104
0
20 Oct 2023
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
International Conference on Learning Representations (ICLR), 2023
Xiangyu Qi
Yi Zeng
Tinghao Xie
Pin-Yu Chen
Ruoxi Jia
Prateek Mittal
Peter Henderson
SILM
297
878
0
05 Oct 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Computational Linguistics (CL), 2023
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
Anh Tuan Luu
Freda Shi
Shuming Shi
LRM, RALM, HILM
626
784
0
03 Sep 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM, LRM
544
1,382
0
17 May 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
3.6K
20,141
0
15 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM, PILM
2.7K
17,255
0
27 Feb 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Yejin Bang
Samuel Cahyawijaya
Nayeon Lee
Wenliang Dai
Jane Polak Scowcroft
...
Tiezheng Yu
Willy Chung
Quyet V. Do
Yan Xu
Pascale Fung
ReLM, LRM
614
1,591
0
08 Feb 2023
Critic-Guided Decoding for Controlled Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Minbeom Kim
Hwanhee Lee
Kang Min Yoo
Joonsuk Park
Hwaran Lee
Kyomin Jung
304
43
0
21 Dec 2022
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa, MoMe
614
2,205
0
15 Dec 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
International Conference on Learning Representations (ICLR), 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM, BDL, LRM, AI4CE
1.6K
5,270
0
21 Mar 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
2.0K
16,987
0
04 Mar 2022
Controllable Natural Language Generation with Contrastive Prefixes
Findings (Findings), 2022
Jing Qian
Li Dong
Yelong Shen
Furu Wei
Weizhu Chen
173
111
0
27 Feb 2022
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
ACM Computing Surveys (ACM CSUR), 2022
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
446
282
0
14 Jan 2022
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
ACM Computing Surveys (CSUR), 2021
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA, VLM, AI4CE
386
1,333
0
01 Nov 2021
A Survey of Data Augmentation Approaches for NLP
Findings (Findings), 2021
Steven Y. Feng
Varun Gangal
Jason W. Wei
Sarath Chandar
Soroush Vosoughi
Teruko Mitamura
Eduard H. Hovy
AIMat
596
897
0
07 May 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Alisa Liu
Maarten Sap
Ximing Lu
Swabha Swayamdipta
Chandra Bhagavatula
Noah A. Smith
Yejin Choi
MU
465
431
0
07 May 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
278
269
0
18 Apr 2021
FUDGE: Controlled Text Generation With Future Discriminators
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Kevin Kaichuang Yang
Dan Klein
247
381
0
12 Apr 2021
Measuring and Improving Consistency in Pretrained Language Models
Transactions of the Association for Computational Linguistics (TACL), 2021
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
554
425
0
01 Feb 2021
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ruibo Liu
Guangxuan Xu
Chenyan Jia
Weicheng Ma
Lili Wang
Soroush Vosoughi
145
114
0
05 Dec 2020
GeDi: Generative Discriminator Guided Sequence Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ben Krause
Akhilesh Deepak Gotmare
Bryan McCann
N. Keskar
Shafiq Joty
R. Socher
Nazneen Rajani
342
451
0
14 Sep 2020
Best-First Beam Search
Transactions of the Association for Computational Linguistics (TACL), 2020
Clara Meister
Tim Vieira
Robert Bamler
340
79
0
08 Jul 2020
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
International Conference on Learning Representations (ICLR), 2019
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
327
1,077
0
04 Dec 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
1.4K
2,154
0
18 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
Lav Varshney
Caiming Xiong
R. Socher
AI4CE
581
1,351
0
11 Sep 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Nils Reimers
Iryna Gurevych
1.8K
15,116
0
27 Aug 2019