GitHub - MDK8888/GPTFast: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch. - discu.eu

Hacker News

A Python Library to 6-7x the inference speed of your HF models https://github.com/MDK8888/GPTFast 15 comments 22/2/2024

Reddit

[N] LLM models up to 7 times acceleration. http://github.com/MDK8888/GPTFast 3 comments 3/3/2024 machinelearning

Linked pages

[2211.17192] Fast Inference from Transformers via Speculative Decoding https://arxiv.org/abs/2211.17192 2 comments
GitHub - pytorch-labs/gpt-fast: Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. https://github.com/pytorch-labs/gpt-fast 1 comment