GitHub - Ying1123/FlexGen: Running large language models like ChatGPT/GPT-3/OPT-175B on a single GPU. Up to 100x faster than other offloading systems.