Hacker News
- Non-determinism in GPT-4 is caused by Sparse MoE https://152334h.github.io/blog/non-determinism-in-gpt-4/ 181 comments
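The headline claim from the linked post is easy to see in miniature: in a sparse MoE layer with hard per-expert capacity limits, routing is resolved per batch, so a token's output can depend on which other requests happened to be batched with it. Below is a minimal sketch of that mechanism, assuming a toy top-1 router with a capacity cap; all shapes, names, and the overflow rule are illustrative assumptions, not the post's or OpenAI's actual implementation.

```python
import numpy as np

# Toy sparse-MoE layer: top-1 routing with a hard per-expert capacity.
# When a token's preferred expert is full, it falls through to its next
# choice -- so the *batch composition* decides which expert a token gets.
# Everything here (shapes, capacity rule) is an illustrative assumption.
rng = np.random.default_rng(0)
N_EXPERTS, DIM, CAPACITY = 4, 8, 1

router_w = rng.normal(size=(DIM, N_EXPERTS))            # shared router weights
experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]

def moe_layer(batch):
    prefs = np.argsort(-(batch @ router_w), axis=1)     # expert ranking per token
    load = np.zeros(N_EXPERTS, dtype=int)
    out = np.zeros_like(batch)
    for i, ranking in enumerate(prefs):
        for e in ranking:                               # first expert with room
            if load[e] < CAPACITY:
                load[e] += 1
                out[i] = batch[i] @ experts[e]
                break
    return out

# Same probe token, same weights, fully deterministic math -- yet pairing it
# with different batch-mates can change its routing, and therefore its output.
probe = rng.normal(size=(1, DIM))
changed = 0
for _ in range(100):
    out_a = moe_layer(np.vstack([rng.normal(size=(3, DIM)), probe]))[-1]
    out_b = moe_layer(np.vstack([rng.normal(size=(3, DIM)), probe]))[-1]
    changed += not np.allclose(out_a, out_b)
print(f"probe output differed in {changed}/100 batch pairings")
```

In a real serving stack the batch is whatever requests arrive together, which is why the post argues this kind of non-determinism is outside the API caller's control even at temperature 0.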
Linking pages
- Extracting Hacker News Book Recommendations with the ChatGPT API - reyem.dev blog https://blog.reyem.dev/post/extracting_hn_book_recommendations_with_chatgpt_api/ 206 comments
- Why you should host your LLM? From LLMOps perspective https://tulip4attoo.substack.com/p/why-you-should-host-your-llm-from 1 comment
- Making Peace with LLM Non-determinism https://barryzhang.substack.com/p/making-peace-with-llm-non-determinism 0 comments
Linked pages
- GPT-4 API general availability and deprecation of older models in the Completions API https://openai.com/blog/gpt-4-api-general-availability 546 comments
- OpenAI's plans according to Sam Altman https://humanloop.com/blog/openai-plans 210 comments
- Manifold https://manifold.markets 77 comments
- [2306.02707] Orca: Progressive Learning from Complex Explanation Traces of GPT-4 https://arxiv.org/abs/2306.02707 33 comments
- How continuous batching enables 23x throughput in LLM inference while reducing p50 latency | Anyscale https://www.anyscale.com/blog/continuous-batching-llm-inference 20 comments
- GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE https://www.semianalysis.com/p/gpt-4-architecture-infrastructure 10 comments
- giscus https://giscus.app/ 6 comments
- [2101.03961] Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity https://arxiv.org/abs/2101.03961 4 comments
- [2308.00951] From Sparse to Soft Mixtures of Experts https://arxiv.org/abs/2308.00951 3 comments
- GPT-4 Technical Report https://cdn.openai.com/papers/gpt-4.pdf 1 comment
- Ilya Sutskever (OpenAI Chief Scientist) - Building AGI, Alignment, Spies, Microsoft, & Enlightenment - YouTube https://www.youtube.com/watch?v=Yf1o0TQzry8 0 comments
- [2307.09288] Llama 2: Open Foundation and Fine-Tuned Chat Models https://arxiv.org/abs/2307.09288 0 comments
- Boris Power on X: "@minafahmi_ @OpenAI @goodside This happens with all the models in our API when there’s a tiny difference (<1%) in probability between the two top tokens, due to non determinism. Once you get one different token then the completions might start to diverge more" / X https://twitter.com/BorisMPower/status/1608522707372740609 0 comments