Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models. International Conference on Machine Learning (ICML), 2024.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Neural Information Processing Systems (NeurIPS), 2023.
Tab-CoT: Zero-shot Tabular Chain of Thought. Ziqi Jin, Wei Lu. Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
Large Language Models Can Self-Improve. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Language Models are Multilingual Chain-of-Thought Reasoners. International Conference on Learning Representations (ICLR), 2022.
Large Language Models are Zero-Shot Reasoners. Neural Information Processing Systems (NeurIPS), 2022.
Self-Consistency Improves Chain of Thought Reasoning in Language Models. International Conference on Learning Representations (ICLR), 2022.
Training language models to follow instructions with human feedback. Neural Information Processing Systems (NeurIPS), 2022.
GLM: General Language Model Pretraining with Autoregressive Blank Infilling. Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
Are NLP Models really able to Solve Simple Math Word Problems? North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies. Transactions of the Association for Computational Linguistics (TACL), 2021.
Language Models are Few-Shot Learners. Neural Information Processing Systems (NeurIPS), 2020.