ResearchTrend.AI

OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs (arXiv:2309.03876)

7 September 2023
Patrick Haller, Ansar Aynetdinov, A. Akbik

Papers citing "OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs"

23 / 23 papers shown
SubData: A Python Library to Collect and Combine Datasets for Evaluating LLM Alignment on Downstream Tasks
Leon Fröhling, Pietro Bernardelle, Gianluca Demartini
ALM · 74 · 0 · 0 · 21 Dec 2024
Bias Amplification: Large Language Models as Increasingly Biased Media
Ze Wang, Zekun Wu, Jeremy Zhang, Navya Jain, Xin Guan, Skylar Lu, Saloni Gupta, Adriano Soares Koshiyama
37 · 0 · 0 · 19 Oct 2024
LM-PUB-QUIZ: A Comprehensive Framework for Zero-Shot Evaluation of Relational Knowledge in Language Models
Max Ploner, Jacek Wiland, Sebastian Pohl, Alan Akbik
KELM · 28 · 1 · 0 · 28 Aug 2024
GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy
Jan Batzner, Volker Stocker, Stefan Schmid, Gjergji Kasneci
20 · 1 · 0 · 25 Jul 2024
When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions
Jérémy Perez, Corentin Léger, Grgur Kovač, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier
38 · 5 · 0 · 05 Jul 2024
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
Rima Hazra, Sayan Layek, Somnath Banerjee, Soujanya Poria
KELM, LLMSV · 29 · 6 · 0 · 17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma, Xinpeng Wang, Tiancheng Hu, Anna Haensch, Michael A. Hedderich, Barbara Plank, Frauke Kreuter
ALM · 28 · 2 · 0 · 16 Jun 2024
Navigating LLM Ethics: Advancements, Challenges, and Future Directions
Junfeng Jiao, S. Afroogh, Yiming Xu, Connor Phillips
AILaw · 58 · 19 · 0 · 14 May 2024
Fundus: A Simple-to-Use News Scraper Optimized for High Quality Extractions
Max Dallabetta, Conrad Dobberstein, Adrian Breiding, Alan Akbik
19 · 2 · 0 · 22 Mar 2024
Large-Scale Label Interpretation Learning for Few-Shot Named Entity Recognition
Jonas Golde, Felix Hamborg, Alan Akbik
19 · 1 · 0 · 21 Mar 2024
Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs
Ilias Chalkidis, Stephanie Brandl
26 · 6 · 0 · 20 Mar 2024
Are LLMs Rational Investors? A Study on Detecting and Reducing the Financial Bias in LLMs
Yuhang Zhou, Yuchen Ni, Yu Gan, Zhangyue Yin, Xiang Liu, Jian Zhang, Sen Liu, Xipeng Qiu, Guangnan Ye, Hongfeng Chai
AIFin · 41 · 4 · 0 · 20 Feb 2024
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Rishabh Bhardwaj, Do Duc Anh, Soujanya Poria
MoMe · 48 · 35 · 0 · 19 Feb 2024
The Political Preferences of LLMs
David Rozado
30 · 35 · 0 · 02 Feb 2024
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Ansar Aynetdinov, Alan Akbik
ALM · 34 · 3 · 0 · 30 Jan 2024
Analyzing the Inherent Response Tendency of LLMs: Real-World Instructions-Driven Jailbreak
Yanrui Du, Sendong Zhao, Ming Ma, Yuhan Chen, Bing Qin
20 · 15 · 0 · 07 Dec 2023
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Xianjun Yang, Xiao Wang, Qi Zhang, Linda R. Petzold, William Yang Wang, Xun Zhao, Dahua Lin
18 · 159 · 0 · 04 Oct 2023
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
37 · 2 · 0 · 03 Oct 2023
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Jonas Golde, Patrick Haller, Felix Hamborg, Julian Risch, A. Akbik
38 · 8 · 0 · 18 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM · 303 · 11,730 · 0 · 04 Mar 2022
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick, Sahana Udupa, Hinrich Schütze
257 · 374 · 0 · 28 Feb 2021
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko, Danushka Bollegala
210 · 138 · 0 · 23 Jan 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
206 · 607 · 0 · 03 Sep 2019