2501.00418
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
3 January 2025
Martin Pawelczyk
Lillian Sun
Zhenting Qi
Aounon Kumar
Himabindu Lakkaraju
Papers citing "Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models"
1 / 1 papers shown
Title
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
99
1
0
03 Feb 2025