[1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - discu.eu

Hacker News

BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 5 comments 12/10/2018

Reddit

BERT and Earnies, SNX, outsourcing java and hooking up your router since 1980 https://arxiv.org/abs/1810.04805 9 comments 10/7/2020 wallstreetbets
BERT and earnies (the Versace edition) https://arxiv.org/abs/1810.04805 4 comments 9/7/2020 wallstreetbets

Linking pages

Understanding ChatGPT - Atmosera https://www.atmosera.com/ai/understanding-chatgpt/ 232 comments
AI Canon | Andreessen Horowitz https://a16z.com/2023/05/25/ai-canon/ 219 comments
What We Know About LLMs (Primer) https://willthompson.name/what-we-know-about-llms-primer 164 comments
CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics | Oxford Protein Informatics Group https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/ 159 comments
Dear OpenAI: Please Open Source Your Language Model https://thegradient.pub/openai-please-open-source-your-language-model/ 124 comments
Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes – Google Research Blog https://blog.research.google/2023/09/distilling-step-by-step-outperforming.html 123 comments
It takes a lot of energy for machines to learn – here's why AI is so power-hungry https://theconversation.com/it-takes-a-lot-of-energy-for-machines-to-learn-heres-why-ai-is-so-power-hungry-151825 119 comments
Minerva: Solving Quantitative Reasoning Problems with Language Models – Google AI Blog http://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html 103 comments
Natural language instructions induce compositional generalization in networks of neurons | Nature Neuroscience https://www.nature.com/articles/s41593-024-01607-5 89 comments
Finetuning Large Language Models - by Sebastian Raschka https://magazine.sebastianraschka.com/p/finetuning-large-language-models 72 comments
Notes on training BERT from scratch on an 8GB consumer GPU | sidsite https://sidsite.com/posts/bert-from-scratch/ 67 comments
GitHub - xenova/transformers.js: State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! https://github.com/xenova/transformers.js 55 comments
Understanding Large Language Models - by Sebastian Raschka https://magazine.sebastianraschka.com/p/understanding-large-language-models 53 comments
Normcore LLM Reads · GitHub https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e 52 comments
Deep language algorithms predict semantic comprehension from brain activity | Scientific Reports https://www.nature.com/articles/s41598-022-20460-9 46 comments
Transformers from scratch | peterbloem.nl http://peterbloem.nl/blog/transformers 40 comments
Language-Agnostic BERT Sentence Embedding – Google AI Blog https://ai.googleblog.com/2020/08/language-agnostic-bert-sentence.html 35 comments
Ideas for Programmers Looking Beyond Web Development | kipply's blog https://carolchen.me/blog/past-webdev/ 30 comments
GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
Understanding Large Language Models -- A Transformative Reading List https://sebastianraschka.com/blog/2023/llm-reading-list.html 26 comments
