When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers. International Conference on Learning Representations (ICLR), 2025
Can In-context Learning Really Generalize to Out-of-distribution Tasks? International Conference on Learning Representations (ICLR), 2024
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis. International Conference on Learning Representations (ICLR), 2024