1711.05101
Cited By
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
Papers citing
"Decoupled Weight Decay Regularization"
50 / 308 papers shown
Title
ExEBench: Benchmarking Foundation Models on Extreme Earth Events
Shan Zhao
Zhitong Xiong
Jie Zhao
Xiao Xiang Zhu
40
0
0
13 May 2025
Bi-directional Self-Registration for Misaligned Infrared-Visible Image Fusion
Timing Li
Bing Cao
Pengfei Zhu
Bin Xiao
Qinghua Hu
34
0
0
11 May 2025
Building-Guided Pseudo-Label Learning for Cross-Modal Building Damage Mapping
Jiepan Li
He Huang
Yu Sheng
Y. Guo
Wei He
46
0
0
08 May 2025
Quiet Feature Learning in Algorithmic Tasks
Prudhviraj Naidu
Zixian Wang
Leon Bergen
R. Paturi
VLM
54
0
0
06 May 2025
MISE: Meta-knowledge Inheritance for Social Media-Based Stressor Estimation
Xin Wang
Ling Feng
Huijun Zhang
Lei Cao
Kaisheng Zeng
Qi Li
Yang Ding
Yi Dai
David A. Clifton
36
0
0
03 May 2025
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li
Jinkun Cao
Haotian Zhang
Davis Rempe
Jan Kautz
Umar Iqbal
Ye Yuan
DiffM
VGen
51
1
0
02 May 2025
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
135
0
0
29 Apr 2025
MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis
Jiahao Lu
Chong Yin
Silvia Ingala
Kenny Erleben
M. Nielsen
S. Darkner
51
0
0
27 Apr 2025
HFBRI-MAE: Handcrafted Feature Based Rotation-Invariant Masked Autoencoder for 3D Point Cloud Analysis
Xuanhua Yin
Dingxin Zhang
Jianhui Yu
Weidong Cai
25
0
0
19 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
42
1
0
14 Apr 2025
Charm: The Missing Piece in ViT fine-tuning for Image Aesthetic Assessment
Fatemeh Behrad
Tinne Tuytelaars
Johan Wagemans
ViT
30
0
0
03 Apr 2025
LeanStereo: A Leaner Backbone based Stereo Network
Rafia Rahim
Samuel Woerz
A. Zell
3DV
42
0
0
24 Mar 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering
Md. Mohaiminul Islam
Tushar Nagarajan
Huiyu Wang
Gedas Bertasius
Lorenzo Torresani
153
0
0
12 Mar 2025
The R2D2 Deep Neural Network Series for Scalable Non-Cartesian Magnetic Resonance Imaging
Yiwei Chen
Amir Aghabiglou
Shijie Chen
Motahare Torki
Chao Tang
Ruud B. van Heeswijk
Yves Wiaux
56
0
0
12 Mar 2025
Improving SAM for Camouflaged Object Detection via Dual Stream Adapters
Jiaming Liu
Linghe Kong
Guihai Chen
73
0
0
08 Mar 2025
Trustworthy Answers, Messier Data: Bridging the Gap in Low-Resource Retrieval-Augmented Generation for Domain Expert Systems
Nayoung Choi
Grace Byun
Andrew Chung
Ellie S. Paek
S. Lee
Jinho D. Choi
RALM
86
1
0
26 Feb 2025
FLINT: Learning-based Flow Estimation and Temporal Interpolation for Scientific Ensemble Visualization
Hamid Gadirov
Jos B. T. M. Roerdink
Steffen Frey
AI4CE
65
1
0
24 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu-Xi Cheng
KELM
75
3
0
19 Feb 2025
ALGEN: Few-shot Inversion Attacks on Textual Embeddings using Alignment and Generation
Yiyi Chen
Qiongkai Xu
Johannes Bjerva
44
0
0
16 Feb 2025
HiPoNet: A Topology-Preserving Multi-View Neural Network For High Dimensional Point Cloud and Single-Cell Data
Siddharth Viswanath
Hiren Madhu
Dhananjay Bhaskar
Jake Kovalic
Dave Johnson
Rex Ying
Christopher J. Tape
Ian M. Adelstein
Michael Perlmutter
Smita Krishnaswamy
3DPC
84
1
0
11 Feb 2025
Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation
Palaash Goel
Dushyant Singh Chauhan
Md. Shad Akhtar
LRM
50
0
0
11 Feb 2025
MatSwap: Light-aware material transfers in images
Ivan Lopes
Valentin Deschaintre
Yannick Hold-Geoffroy
Raoul de Charette
DiffM
84
0
0
11 Feb 2025
deCIFer: Crystal Structure Prediction from Powder Diffraction Data using Autoregressive Language Models
Frederik L. Johansen
Ulrik Friis-Jensen
Erik B. Dam
Kirsten M. Ø. Jensen
Rocío Mercado
Raghavendra Selvan
83
0
0
04 Feb 2025
Physics-Grounded Differentiable Simulation for Soft Growing Robots
Lucas Chen
Yitian Gao
Sicheng Wang
Francesco Fuentes
Laura H. Blumenschein
Zachary Kingston
38
0
0
29 Jan 2025
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
Hamed Firooz
Maziar Sanjabi
Adrian Englhardt
Aman Gupta
Ben Levine
...
Xiaoling Zhai
Ya Xu
Yu Wang
Yun Dai
ALM
42
3
0
27 Jan 2025
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing Yang
Alexander Sax
Kevin J Liang
Mikael Henaff
Hao Tang
Ang Cao
J. Chai
Franziska Meier
Matt Feiszli
3DGS
73
16
0
23 Jan 2025
Continuous Urban Change Detection from Satellite Image Time Series with Temporal Feature Refinement and Multi-Task Integration
Sebastian Hafner
Heng Fang
Hossein Azizpour
Y. Ban
54
1
0
20 Jan 2025
Positive2Negative: Breaking the Information-Lossy Barrier in Self-Supervised Single Image Denoising
Tong Li
Lizhi Wang
Zhiyuan Xu
Lin Zhu
Wanxuan Lu
Hua Huang
90
2
0
21 Dec 2024
Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces
Nianze Tao
OOD
OODD
BDL
97
0
0
16 Dec 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
108
5
0
25 Nov 2024
RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
Nazia Tasnim
Bryan A. Plummer
CLL
OffRL
74
0
0
25 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
49
1
0
07 Nov 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
30
4
0
24 Oct 2024
Lightweight Neural App Control
Filippos Christianos
Georgios Papoudakis
Thomas Coste
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
52
4
0
23 Oct 2024
Physically Guided Deep Unsupervised Inversion for 1D Magnetotelluric Models
Paul Goyes-Peñafiel
Umair bin Waheed
Henry Arguello
16
0
0
20 Oct 2024
Cliqueformer: Model-Based Optimization with Structured Transformers
J. Kuba
Pieter Abbeel
Sergey Levine
OffRL
AI4CE
52
2
0
17 Oct 2024
Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation
Guozhi Liu
Weiwei Lin
Tiansheng Huang
Ruichao Mo
Qi Mu
Li Shen
AAML
58
10
0
13 Oct 2024
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
131
1
0
11 Oct 2024
Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery
Pratinav Seth
Michelle Lin
Brefo Dwamena Yaw
Jade Boutot
Mary Kang
David Rolnick
33
0
0
11 Oct 2024
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Haoyi Zhu
Honghui Yang
Yating Wang
Jiange Yang
Limin Wang
Tong He
3DH
51
6
0
10 Oct 2024
Surgical Depth Anything: Depth Estimation for Surgical Scenes using Foundation Models
Ange Lou
Yamin Li
Yike Zhang
Jack Noble
MedIm
24
4
0
09 Oct 2024
Continuous Ensemble Weather Forecasting with Diffusion models
Martin Andrae
Tomas Landelius
Joel Oskarsson
Fredrik Lindsten
AI4Cl
35
2
0
07 Oct 2024
Can Transformers Learn n-gram Language Models?
Anej Svete
Nadav Borenstein
M. Zhou
Isabelle Augenstein
Ryan Cotterell
33
6
0
03 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu
Issei Sato
37
3
0
02 Oct 2024
The Conformer Encoder May Reverse the Time Dimension
Robin Schmitt
Albert Zeyer
Mohammad Zeineldeen
Ralf Schluter
Hermann Ney
31
0
0
01 Oct 2024
Semantic Parsing with Candidate Expressions for Knowledge Base Question Answering
Daehwan Nam
Gary Geunbae Lee
38
0
0
01 Oct 2024
Sequential Classification of Misinformation
Daniel Toma
Wasim Huleihel
30
0
0
07 Sep 2024
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
50
3
0
16 Aug 2024
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer
Yihong Lin
Zhaoxin Fan
Lingyu Xiong
Liang Peng
Xiandong Li
Wenxiong Kang
Xianjia Wu
Songju Lei
Huang Xu
36
3
0
03 Aug 2024
Meltemi: The first open Large Language Model for Greek
Leon Voukoutis
Dimitris Roussis
Georgios Paraskevopoulos
Sokratis Sofianopoulos
Prokopis Prokopidis
Vassilis Papavasileiou
Athanasios Katsamanis
Stelios Piperidis
V. Katsouros
VLM
33
7
0
30 Jul 2024