Hacker News
- ROCm LLM inference gives 7900XTX 80% speed of a 4090 https://github.com/mlc-ai/mlc-llm/ 121 comments
Linking pages
- How to run Llama 2 on your laptop and your phone - Replicate – Replicate https://replicate.com/blog/run-llama-locally 171 comments
- Accelerating Generative AI with PyTorch II: GPT, Fast | PyTorch https://pytorch.org/blog/accelerating-generative-ai-2/ 69 comments
- GitHub - operand/agency: A fast and minimal foundation for unifying human, AI, and other computing systems, in python https://github.com/operand/agency 15 comments
- No Cloud Required: Chatbot Runs Locally on iPhones, Old PCs | Tom's Hardware https://www.tomshardware.com/news/mlc-ai-lightweight-chatbot 11 comments
- GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
- Web LLM lets you run LLMs natively in your frontend using the new WebGPU standard. | Monarch Wadia https://www.monarchwadia.com/2024/02/23/running-llms-in-the-browser.html 4 comments
- LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML https://www.latent.space/p/llms-everywhere#details 1 comment
- GitHub - oscinis-com/Awesome-LLM-Productization: Awesome-LLM-Productization: a curated list of tools/tricks/news/regulations about AI and Large Language Model (LLM) productization https://github.com/oscinis-com/Awesome-LLM-Productization 1 comment
- GitHub - vince-lam/awesome-local-llms: Identify popular and active GitHub repos for hosting local LLMs https://github.com/vince-lam/awesome-local-llms 1 comment
- @mlc-ai/web-llm - npm https://www.npmjs.com/package/@mlc-ai/web-llm 0 comments
- GitHub - kayvr/token-hawk: WebGPU LLM inference tuned by hand https://github.com/kayvr/token-hawk 0 comments
- A practical guide to deploying Large Language Models Cheap, Good *and* Fast https://askdala.substack.com/p/a-pratical-guide-to-deploying-llms 0 comments
- Hardware requirements for LLM's in production https://bionic-gpt.com/blog/llm-hardware/ 0 comments
- GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments
- Cascade Inference: Memory Bandwidth Efficient Shared Prefix Batch Decoding | FlashInfer https://flashinfer.ai/2024/02/02/cascade-inference.html 0 comments
- GitHub - NexaAI/Awesome-LLMs-on-device: Awesome LLMs on Device: A Comprehensive Survey https://github.com/NexaAI/Awesome-LLMs-on-device 0 comments
GitHub - mlc-ai/mlc-llm: Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.