BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset | Neural Information Processing Systems (NeurIPS), 2023 |
Attention Paper: How Generative AI Reshapes Digital Shadow Industry? | ACM Turing Celebration Conference (TC), 2023 |
Editing Large Language Models: Problems, Methods, and Opportunities | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |