Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers
Findings of the Association for Computational Linguistics (Findings of ACL), 2021