Hacker News
Linked pages
- [2205.01068] OPT: Open Pre-trained Transformer Language Models https://arxiv.org/abs/2205.01068 318 comments
- Real number - Wikipedia http://en.wikipedia.org/wiki/Real_number 89 comments
- [1503.02531] Distilling the Knowledge in a Neural Network https://arxiv.org/abs/1503.02531 5 comments
- [2212.09720] The case for 4-bit precision: k-bit Inference Scaling Laws https://arxiv.org/abs/2212.09720 2 comments
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- Large language models aren't trained enough. https://finbarr.ca/llms-not-trained-enough/ 0 comments
- Replit - A Recap of Replit Developer Day https://blog.replit.com/replit-developer-day-recap 0 comments
- [2210.17323] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers https://arxiv.org/abs/2210.17323 0 comments