Speeding up Deep Model Training by Sharing Weights and Then Unsharing

8 October 2021

Papers citing "Speeding up Deep Model Training by Sharing Weights and Then Unsharing"


Title | Authors | Tag | Counts | Date
A multilevel approach to accelerate the training of Transformers | Guillaume Lauga, Maël Chaumette, Edgar Desainte-Maréville, Étienne Lasalle, Arthur Lebeurrier | AI4CE | 29 0 0 | 24 Apr 2025
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 294 6,003 0 | 20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | Yonghui Wu, M. Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean | AIMat | 716 6,435 0 | 26 Sep 2016