ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2501.00418
  4. Cited By

Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models

3 January 2025
Martin Pawelczyk
Lillian Sun
Zhenting Qi
Aounon Kumar
Himabindu Lakkaraju
ArXivPDFHTML

Papers citing "Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models"

1 / 1 papers shown
Title
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
99
1
0
03 Feb 2025
1