Linking pages
- Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments - Lightning AI https://lightning.ai/pages/community/lora-insights/ 39 comments
- Understanding the scaling of L² regularization in the context of neural networks | by Shay Palachy | Towards Data Science https://medium.com/@shay.palachy/understanding-the-scaling-of-l%C2%B2-regularization-in-the-context-of-neural-networks-e3d25f8b50db 15 comments
- GitHub - nicklashansen/neural-net-optimization: PyTorch implementations of recent optimization algorithms for deep learning. https://github.com/nicklashansen/neural-net-optimization 10 comments
- What is Llama 2? Meta’s large language model explained | InfoWorld https://www.infoworld.com/article/3706470/what-is-llama-2-metas-large-language-model-explained.html 6 comments
- AdamW — PyTorch 2.3 documentation https://pytorch.org/docs/stable/generated/torch.optim.AdamW.html 3 comments
- How I got into deep learning - Vikas Paruchuri https://www.vikas.sh/post/how-i-got-into-deep-learning 2 comments
- minGPT in Julia using Flux! | Can Candan https://cancandan.github.io/julia/flux/machine-learning/2022/03/30/mingpt-julia.html 1 comment
- A deep multi-stream model for robust prediction of left ventricular ejection fraction in 2D echocardiography | Scientific Reports https://www.nature.com/articles/s41598-024-52480-y 1 comment
- GitHub - amrzv/awesome-colab-notebooks: Collection of google colaboratory notebooks for fast and easy experiments https://github.com/amrzv/awesome-colab-notebooks 0 comments
- weight decay vs L2 regularization https://bbabenko.github.io/weight-decay 0 comments
- Assembly AI | Comet ML https://www.comet.ml/site/customer-case-study-building-an-end-to-end-speech-recognition-model-in-pytorch-with-assemblyai/ 0 comments
- GitHub - zoq/Awesome-Optimizer: Collect optimizer related papers, data, repositories https://github.com/zoq/Awesome-Optimizer 0 comments
- How to train a model to generate Movie/T.V show plot from a poster | by Deepak | Medium https://medium.com/@dsr.ai/how-to-train-a-model-to-generate-movie-t-v-show-plot-from-a-poster-eec6aea454ca 0 comments
- Deepfake Detection Based on Original Human Biometric Traits - Unite.AI https://www.unite.ai/deepfake-detection-based-on-original-human-biometric-traits/ 0 comments
- Ahead of AI #4: A Big Year For AI - by Sebastian Raschka https://magazine.sebastianraschka.com/p/ahead-of-ai-4-a-big-year-for-ai 0 comments
- Influential Machine Learning Papers Of 2022 https://sebastianraschka.com/blog/2023/top10-papers-2022.html 0 comments
- Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution | BMC Medical Informatics and Decision Making | Full Text https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-023-02129-z#auth-Wei-Yang 0 comments
- The mathematics of optimization for deep learning https://thepalindrome.org/p/the-math-of-optimization-for-deep-learning 0 comments
Related searches:
Search title: [1711.05101] Decoupled Weight Decay Regularization