- From Sparse to Soft Mixtures of Experts [R] https://arxiv.org/abs/2308.00951 2 comments machinelearning
Linking pages
- Non-determinism in GPT-4 is caused by Sparse MoE - 152334H https://152334h.github.io/blog/non-determinism-in-gpt-4/ 181 comments
- Mixtures of Experts - Javid Lakha https://blog.javid.io/p/mixtures-of-experts 2 comments
- Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 - 152334H https://152334h.github.io/blog/knowing-enough-about-moe/ 1 comment
- Prompt engineering: is being an AI 'whisperer' the job of the future or a short-lived fad? https://theconversation.com/prompt-engineering-is-being-an-ai-whisperer-the-job-of-the-future-or-a-short-lived-fad-211833 0 comments
- Making Peace with LLM Non-determinism https://barryzhang.substack.com/p/making-peace-with-llm-non-determinism 0 comments