ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2212.03827
  4. Cited By
Discovering Latent Knowledge in Language Models Without Supervision

Discovering Latent Knowledge in Language Models Without Supervision

7 December 2022
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
ArXivPDFHTML

Papers citing "Discovering Latent Knowledge in Language Models Without Supervision"

17 / 267 papers shown
Title
Eight Things to Know about Large Language Models
Eight Things to Know about Large Language Models
Sam Bowman
ALM
15
108
0
02 Apr 2023
Large Language Model Instruction Following: A Survey of Progresses and
  Challenges
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Renze Lou
Kai Zhang
Wenpeng Yin
ALM
LRM
21
19
0
18 Mar 2023
Analyzing And Editing Inner Mechanisms Of Backdoored Language Models
Analyzing And Editing Inner Mechanisms Of Backdoored Language Models
Max Lamparth
Anka Reuel
KELM
22
10
0
24 Feb 2023
Task-Specific Skill Localization in Fine-tuned Language Models
Task-Specific Skill Localization in Fine-tuned Language Models
A. Panigrahi
Nikunj Saunshi
Haoyu Zhao
Sanjeev Arora
MoMe
21
66
0
13 Feb 2023
Truth Machines: Synthesizing Veracity in AI Language Models
Truth Machines: Synthesizing Veracity in AI Language Models
Luke Munn
Liam Magee
Vanicka Arora
SyDa
HILM
8
27
0
28 Jan 2023
The Alignment Problem from a Deep Learning Perspective
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
29
180
0
30 Aug 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of
  Deep Neural Networks
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Raukur
A. Ho
Stephen Casper
Dylan Hadfield-Menell
AAML
AI4CE
18
123
0
27 Jul 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive
  Explanations
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM
LRM
206
189
0
24 May 2022
Teaching language models to support answers with verified quotes
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
226
255
0
21 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
Truthful AI: Developing and governing AI that does not lie
Truthful AI: Developing and governing AI that does not lie
Owain Evans
Owen Cotton-Barratt
Lukas Finnveden
Adam Bales
Avital Balwit
Peter Wills
Luca Righetti
William Saunders
HILM
220
107
0
13 Oct 2021
Unsolved Problems in ML Safety
Unsolved Problems in ML Safety
Dan Hendrycks
Nicholas Carlini
John Schulman
Jacob Steinhardt
164
268
0
28 Sep 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming
  Few-Shot Prompt Order Sensitivity
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
1,114
0
18 Apr 2021
AI safety via debate
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
199
199
0
02 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
Previous
123456