
Title |
|---|
The AI Productivity Index (APEX). Bertie Vidgen, Abby Fennelly, Evan Pinnix, Chirag Mahapatra, Zach Richards, ..., Eric Topol, Osvald Nitski, Brendan Foody |
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code. International Conference on Learning Representations (ICLR), 2024 |
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? International Conference on Learning Representations (ICLR), 2023 |
Goal Driven Discovery of Distributional Differences via Language Descriptions. Neural Information Processing Systems (NeurIPS), 2023 |