ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.05497
  4. Cited By
Improving Sharpness-Aware Minimization with Fisher Mask for Better
  Generalization on Language Models

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

11 October 2022
Qihuang Zhong
Liang Ding
Li Shen
Peng Mi
Juhua Liu
Bo Du
Dacheng Tao
    AAML
ArXivPDFHTML

Papers citing "Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models"

13 / 13 papers shown
Title
Diversifying the Mixture-of-Experts Representation for Language Models
  with Orthogonal Optimizer
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
Boan Liu
Liang Ding
Li Shen
Keqin Peng
Yu Cao
Dazhao Cheng
Dacheng Tao
MoE
34
7
0
15 Oct 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong
Liang Ding
Juhua Liu
Xuebo Liu
Min Zhang
Bo Du
Dacheng Tao
VLM
27
9
0
24 May 2023
Dynamic Regularized Sharpness Aware Minimization in Federated Learning:
  Approaching Global Consistency and Smooth Landscape
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
Yan Sun
Li Shen
Shi-Yong Chen
Liang Ding
Dacheng Tao
FedML
14
33
0
19 May 2023
Robust Generalization against Photon-Limited Corruptions via Worst-Case
  Sharpness Minimization
Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
Zhuo Huang
Miaoxi Zhu
Xiaobo Xia
Li Shen
Jun Yu
Chen Gong
Bo Han
Bo Du
Tongliang Liu
30
30
0
23 Mar 2023
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and
  Fine-tuned BERT
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
AI4MH
47
231
0
19 Feb 2023
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language
  Understanding and Generation
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
29
26
0
30 May 2022
Sharpness-Aware Minimization Improves Language Model Generalization
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
119
97
0
16 Oct 2021
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
Tu Vu
Brian Lester
Noah Constant
Rami Al-Rfou
Daniel Matthew Cer
VLM
LRM
134
276
0
15 Oct 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural
  Networks
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Joey Tianyi Zhou
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
102
132
0
07 Oct 2021
Raise a Child in Large Language Model: Towards Effective and
  Generalizable Fine-tuning
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu
Fuli Luo
Zhiyuan Zhang
Chuanqi Tan
Baobao Chang
Songfang Huang
Fei Huang
LRM
136
177
0
13 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016
Teaching Machines to Read and Comprehend
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
170
3,504
0
10 Jun 2015
1