Hacker News
- ROCm LLM inference gives 7900XTX 80% speed of a 4090 https://github.com/mlc-ai/mlc-llm/ 121 comments
Linking pages
- How to run Llama 2 on your laptop and your phone - Replicate – Replicate https://replicate.com/blog/run-llama-locally 171 comments
- Accelerating Generative AI with PyTorch II: GPT, Fast | PyTorch https://pytorch.org/blog/accelerating-generative-ai-2/ 69 comments
- GitHub - operand/agency: A fast and minimal foundation for unifying human, AI, and other computing systems, in python https://github.com/operand/agency 15 comments
- No Cloud Required: Chatbot Runs Locally on iPhones, Old PCs | Tom's Hardware https://www.tomshardware.com/news/mlc-ai-lightweight-chatbot 11 comments
- GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
- Web LLM lets you run LLMs natively in your frontend using the new WebGPU standard. | Monarch Wadia https://www.monarchwadia.com/2024/02/23/running-llms-in-the-browser.html 4 comments
- LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML https://www.latent.space/p/llms-everywhere#details 1 comment
- GitHub - oscinis-com/Awesome-LLM-Productization: Awesome-LLM-Productization: a curated list of tools/tricks/news/regulations about AI and Large Language Model (LLM) productization https://github.com/oscinis-com/Awesome-LLM-Productization 1 comment
- GitHub - vince-lam/awesome-local-llms: Identify popular and active GitHub repos for hosting local LLMs https://github.com/vince-lam/awesome-local-llms 1 comment
- @mlc-ai/web-llm - npm https://www.npmjs.com/package/@mlc-ai/web-llm 0 comments
- GitHub - kayvr/token-hawk: WebGPU LLM inference tuned by hand https://github.com/kayvr/token-hawk 0 comments
- A practical guide to deploying Large Language Models Cheap, Good *and* Fast https://askdala.substack.com/p/a-pratical-guide-to-deploying-llms 0 comments
- Hardware requirements for LLM's in production https://bionic-gpt.com/blog/llm-hardware/ 0 comments
- GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments
- Cascade Inference: Memory Bandwidth Efficient Shared Prefix Batch Decoding | FlashInfer https://flashinfer.ai/2024/02/02/cascade-inference.html 0 comments
- GitHub - NexaAI/Awesome-LLMs-on-device: Awesome LLMs on Device: A Comprehensive Survey https://github.com/NexaAI/Awesome-LLMs-on-device 0 comments
GitHub - mlc-ai/mlc-llm: Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.