
Learning Rate Scaling at Initialization is All You Need



Paper: No More Adam: Learning Rate Scaling at Initialization is All You Need, by Minghao Xu and 3 other authors


Abstract: In this work, we question the necessity of adaptive gradient methods for training deep neural networks. SGD-SaI is a simple yet effective enhancement to stochastic gradient descent with momentum (SGDM). SGD-SaI performs learning rate Scaling at Initialization (SaI) for distinct parameter groups, guided by their respective gradient signal-to-noise ratios (g-SNR). By adjusting learning rates without relying on adaptive second-order momentum, SGD-SaI helps prevent training imbalances from the very first iteration and cuts the optimizer's memory usage by half compared to AdamW. Despite its simplicity and efficiency, SGD-SaI consistently matches or outperforms AdamW in training a variety of Transformer-based tasks, effectively overcoming a long-standing challenge of using SGD for training Transformers. SGD-SaI excels in ImageNet-1K classification with Vision Transformers (ViT) and GPT-2 pretraining for large language models (LLMs, transformer decoder-only), demonstrating robustness to hyperparameter variations and practicality for diverse applications. We further tested its robustness on tasks like LoRA fine-tuning for LLMs and diffusion models, where it consistently outperforms state-of-the-art optimizers. From a memory efficiency perspective, SGD-SaI achieves substantial memory savings for optimizer states, reducing memory usage by 5.93 GB for GPT-2 (1.5B parameters) and 25.15 GB for Llama2-7B compared to AdamW in full-precision training settings.
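
To make the idea concrete, here is a minimal pure-Python sketch of SGD with momentum where each parameter group receives a fixed learning-rate scale computed once, from the first gradients it sees. It is not the authors' implementation. The abstract does not define g-SNR, so the formula used here (gradient L2 norm divided by the standard deviation of the gradient's elements), the normalization of scales by the largest g-SNR, and the names `g_snr` and `SGDSaISketch` are all assumptions made for illustration. Weight decay is omitted.

```python
import math


def g_snr(grad, eps=1e-8):
    """ASSUMED definition: L2 norm of a gradient over the std of its elements."""
    n = len(grad)
    mean = sum(grad) / n
    var = sum((g - mean) ** 2 for g in grad) / n
    norm = math.sqrt(sum(g * g for g in grad))
    return norm / (math.sqrt(var) + eps)


class SGDSaISketch:
    """Hypothetical SGDM variant with per-group scales frozen at step one."""

    def __init__(self, params, lr=0.1, momentum=0.9):
        self.params = params        # dict: group name -> list of floats
        self.lr = lr
        self.momentum = momentum
        self.scales = None          # filled in on the first call to step()
        # One momentum buffer per parameter. AdamW would keep two buffers
        # (first and second moment), hence the halved optimizer state.
        self.buf = {name: [0.0] * len(p) for name, p in params.items()}

    def step(self, grads):
        if self.scales is None:
            # Scaling at initialization: computed once, never updated.
            snrs = {name: g_snr(g) for name, g in grads.items()}
            top = max(snrs.values())
            # ASSUMPTION: normalize so the largest scale equals 1.
            self.scales = {name: s / top for name, s in snrs.items()}
        for name, p in self.params.items():
            step_size = self.lr * self.scales[name]
            buf = self.buf[name]
            for i, g in enumerate(grads[name]):
                buf[i] = self.momentum * buf[i] + g
                p[i] -= step_size * buf[i]


params = {"w": [0.0, 0.0], "b": [0.0, 0.0]}
opt = SGDSaISketch(params)
opt.step({"w": [1.0, 3.0], "b": [2.0, 2.5]})
print(opt.scales)   # per-group scales, fixed for all later steps
print(params)
```

Because the scales are frozen after the first step, the only per-parameter state carried through training in this sketch is the momentum buffer, which is the source of the memory saving over AdamW that the abstract describes.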

Submission history

From: Minghao Xu [see email]
[v1]
Mon, 16 Dec 2024 13:41:37 UTC (1,216 KB)
[v2]
Tue, 17 Dec 2024 09:30:44 UTC (1,216 KB)


