Linking pages
- How to Backdoor Large Language Models - by Shrivu Shankar https://blog.sshh.io/p/how-to-backdoor-large-language-models
- Mixture-of-Experts (MoE): The Birth and Rise of Conditional Computation https://cameronrwolfe.substack.com/p/conditional-computation-the-birth
- Model Merging: A Survey - by Cameron R. Wolfe, Ph.D. https://cameronrwolfe.substack.com/p/model-merging
- Scaling Laws for LLMs: From GPT-3 to o3 https://cameronrwolfe.substack.com/p/llm-scaling-laws
- Mixture-of-Experts (MoE) LLMs - by Cameron R. Wolfe, Ph.D. https://cameronrwolfe.substack.com/p/moe-llms
Linked pages
- GitHub: Build and ship software on a single, collaborative platform https://github.com
- H200 Tensor Core GPU | NVIDIA https://www.nvidia.com/en-gb/data-center/h200/
- Neural Networks and Deep Learning, Chapter 2 (the four fundamental equations behind backpropagation) http://neuralnetworksanddeeplearning.com/chap2.html#the_four_fundamental_equations_behind_backpropagation
- Let's build the GPT Tokenizer - YouTube https://www.youtube.com/watch?v=zduSFxRajkE
- Rotation matrix - Wikipedia https://en.wikipedia.org/wiki/Rotation_matrix#Rotation_matrix_from_axis_and_angle
- Neural Networks and Deep Learning, Chapter 5 http://neuralnetworksanddeeplearning.com/chap5.html
- Linear - PyTorch documentation https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear
- [1512.03385] Deep Residual Learning for Image Recognition http://arxiv.org/abs/1512.03385
- Standard score - Wikipedia https://en.wikipedia.org/wiki/Standard_score#/media/File:Normal_distribution_and_scales.gif
- CrossEntropyLoss - PyTorch documentation https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
- [2001.04451] Reformer: The Efficient Transformer https://arxiv.org/abs/2001.04451
- Paper Summary #8 - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Shreyansh Singh https://shreyansh26.github.io/post/2023-03-26_flash-attention/
- mosaicml/mpt-7b-storywriter · Hugging Face https://huggingface.co/mosaicml/mpt-7b-storywriter
- LLaMA-2 from the Ground Up - by Cameron R. Wolfe, Ph.D. https://cameronrwolfe.substack.com/p/llama-2-from-the-ground-up
- Flash-Decoding for long-context inference | PyTorch https://pytorch.org/blog/flash-decoding/
- Gemini 1 technical report (PDF) https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf
- Model merging lessons in The Waifu Research Department https://www.interconnects.ai/p/model-merging
- Dolma, OLMo, and the Future of Open-Source LLMs https://cameronrwolfe.substack.com/p/dolma-olmo-and-the-future-of-open
- Maxime Labonne - Decoding Strategies in Large Language Models https://mlabonne.github.io/blog/posts/2023-06-07-Decoding_strategies.html
Article: Decoder-Only Transformers: The Workhorse of Generative LLMs (cameronrwolfe.substack.com)