Hacker News
Linking pages
Linked pages
- GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
- [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
- GitHub - NVIDIA/FasterTransformer: Transformer related optimization, including BERT, GPT https://github.com/NVIDIA/FasterTransformer/ 1 comment
- Dissecting Batching Effects in GPT Inference https://le.qun.ch/en/blog/2023/05/13/transformer-batching/ 0 comments
- GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs https://github.com/vllm-project/vllm 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - punica-ai/punica: Serving multiple LoRA finetuned LLM as one
See how to search.