Linking pages
- Hello Qwen2 | Qwen https://qwenlm.github.io/blog/qwen2/ 130 comments
- GitHub - 01-ai/Yi-1.5: Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability. https://github.com/01-ai/Yi-1.5 67 comments
- Qwen2.5: A Party of Foundation Models! | Qwen https://qwenlm.github.io/blog/qwen2.5/ 38 comments
- Introducing Qwen1.5 | Qwen https://qwenlm.github.io/blog/qwen1.5/ 3 comments
- Aman's AI Journal • Primers • Overview of Large Language Models https://aman.ai/primers/ai/LLM/ 1 comment
- Qwen2-VL: To See the World More Clearly | Qwen https://qwenlm.github.io/blog/qwen2-vl/ 1 comment
- GitHub - google-gemini/gemma-cookbook: A collection of guides and examples for the Gemma open models from Google. https://github.com/google-gemini/gemma-cookbook 0 comments
Linked pages
- Hello Qwen2 | Qwen https://qwenlm.github.io/blog/qwen2/ 130 comments
- GitHub - unslothai/unsloth: 2x faster 50% less memory LLM finetuning https://github.com/unslothai/unsloth 122 comments
- [2404.02258] Mixture-of-Depths: Dynamically allocating compute in transformer-based language models https://arxiv.org/abs/2404.02258 103 comments
- GitHub - InternLM/InternLM: Official release of InternLM2.5 base and chat models. 1M context support https://github.com/InternLM/InternLM 89 comments
- GitHub - gradio-app/gradio: Create UIs for your machine learning model in Python in 3 minutes https://github.com/gradio-app/gradio 48 comments
- [2402.12354] LoRA+: Efficient Low Rank Adaptation of Large Models https://arxiv.org/abs/2402.12354 47 comments
- bigcode (BigCode) https://huggingface.co/bigcode 37 comments
- Weights & Biases: The AI Developer Platform http://wandb.ai/ 23 comments
- [2406.14546] Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data https://arxiv.org/abs/2406.14546 13 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- bigcode/the-stack · Datasets at Hugging Face https://huggingface.co/datasets/bigcode/the-stack 5 comments
- GitHub - artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs https://github.com/artidoro/qlora 5 comments
- GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. https://github.com/lm-sys/FastChat 4 comments
- HuggingFaceFW/fineweb · Datasets at Hugging Face https://huggingface.co/datasets/HuggingFaceFW/fineweb 4 comments
- Introducing Qwen1.5 | Qwen https://qwenlm.github.io/blog/qwen1.5/ 3 comments
- google (Google AI) https://huggingface.co/google 2 comments
- GitHub - tatsu-lab/stanford_alpaca https://github.com/tatsu-lab/stanford_alpaca 2 comments
- meta-llama (Meta Llama 2) https://huggingface.co/meta-llama 1 comment
- https://platform.openai.com/docs/api-reference/chat/create#chat-create-logprobs 1 comment
- [2405.05378] "They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations https://arxiv.org/abs/2405.05378 1 comment