Hacker News
- Show HN: HuggingFace – Fast tokenization library for deep-learning NLP pipelines https://github.com/huggingface/tokenizers 42 comments
- Huggingface, a well-known NLP library, releases tokenizers in Rust for order of magnitude speed improvement https://github.com/huggingface/tokenizers 5 comments rust
Linking pages
- GitHub - huggingface/candle: Minimalist ML framework for Rust https://github.com/huggingface/candle 205 comments
- GitHub - rust-unofficial/awesome-rust: A curated list of Rust code and resources. https://github.com/rust-unofficial/awesome-rust 178 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- GitHub - mlc-ai/web-stable-diffusion: Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support. https://github.com/mlc-ai/web-stable-diffusion 42 comments
- GitHub - rust-unofficial/awesome-rust: A curated list of Rust code and resources. https://github.com/kud1ing/awesome-rust 30 comments
- arl/README-Rust.md at master · kaxap/arl · GitHub https://github.com/kaxap/arl/blob/master/README-Rust.md 14 comments
- Preprocessing trillions of tokens with Rust | Work | Mainmatter https://mainmatter.com/cases/aleph-alpha/ 12 comments
- GitHub - huggingface/node-question-answering: Fast and production-ready question answering in Node.js https://github.com/huggingface/node-question-answering 11 comments
- How the BPE tokenization algorithm used by large language models works. | sidsite https://sidsite.com/posts/bpe/ 11 comments
- This Week in Rust 328 · This Week in Rust https://this-week-in-rust.org/blog/2020/03/03/this-week-in-rust-328/ 9 comments
- GitHub - rust-unofficial/awesome-rust: A curated list of Rust code and resources. https://github.com/rust-unofficial/awesome-rust?tab=readme-ov-file#database 9 comments
- GitHub - VKCOM/YouTokenToMe: Unsupervised text tokenizer focused on computational efficiency https://github.com/VKCOM/YouTokenToMe 3 comments
- GitHub - Anush008/fastembed-rs: Library for generating vector embeddings, reranking in Rust https://github.com/anush008/fastembed-rs 3 comments
- Neural text generation from 1 million Belgian real estate deeds | by Anna Krogager | ML6team https://blog.ml6.eu/neural-text-generation-from-1-million-belgian-real-estate-deeds-9230c940432c?sk=62f009f9f9fad83edc75f9adb8b04069&source=friends_link 1 comment
- Hands-on with Hugging Face’s new tokenizers library | by Omar M’Haimdat | Heartbeat https://heartbeat.fritz.ai/hands-on-with-hugging-faces-new-tokenizers-library-baff35d7b465?source=post_stats_page--------------------------- 0 comments
- GitHub - ml-tooling/best-of-ml-python: 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly. https://github.com/ml-tooling/best-of-ml-python 0 comments
- Hands-on with Hugging Face’s new tokenizers library | by Omar M’Haimdat | Heartbeat https://heartbeat.fritz.ai/hands-on-with-hugging-faces-new-tokenizers-library-baff35d7b465 0 comments
- AWS and Hugging Face collaborate to simplify and accelerate adoption of Natural Language Processing models | AWS Machine Learning Blog https://aws.amazon.com/blogs/machine-learning/aws-and-hugging-face-collaborate-to-simplify-and-accelerate-adoption-of-natural-language-processing-models/ 0 comments
- Driving efficiencies in your AI process | by Georgian | Georgian Impact Blog | Medium https://medium.com/georgian-impact-blog/driving-efficiencies-in-your-ai-process-31f23d331fe7 0 comments
- A Fast WordPiece Tokenization System – Google AI Blog https://ai.googleblog.com/2021/12/a-fast-wordpiece-tokenization-system.html 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - huggingface/tokenizers: 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production