Linking pages
- Advanced Web Scraping: Bypassing "403 Forbidden," captchas, and more | sangaline.com http://sangaline.com/post/advanced-web-scraping-tutorial/ 238 comments
- GitHub - dbohdan/structured-text-tools: A list of command line tools for manipulating structured text data https://github.com/dbohdan/structured-text-tools 106 comments
- GitHub - EntilZha/PyFunctional: Python library for creating data pipelines with chain functional programming https://github.com/EntilZha/PyFunctional 97 comments
- GitHub - tailscale/golink: A private shortlink service for tailnets https://github.com/tailscale/golink 90 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- GitHub - tidwall/gjson: Get JSON values quickly - JSON parser for Go https://github.com/tidwall/gjson/blob/master/readme.md 66 comments
- Reducing Memory Usage in Ruby | Tenderlove Making https://tenderlovemaking.com/2018/01/23/reducing-memory-usage-in-ruby.html 63 comments
- Parsing 18 billion JSON lines with Go | by Roffe | ITNEXT https://medium.com/@roffe/parsing-18-billion-lines-json-with-go-738be6ee5ed2?amp%3Bsk=0a57d3811168ab4d48c37387f69bb92c&source=friends_link 55 comments
- 2.0 · asciinema blog http://blog.asciinema.org/post/two-point-o/ 46 comments
- GitHub - cicada-lang/whereabouts: Logic programming with JSON http://github.com/cicada-lang/cicada-whereabouts 44 comments
- GitHub - kellyjonbrazil/jc: CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts. https://github.com/kellyjonbrazil/jc 38 comments
- GitHub - asg017/sqlite-lines: A SQLite extension for reading large files line-by-line (NDJSON, logs, txt, etc.) https://github.com/asg017/sqlite-lines 29 comments
- GitHub - tidwall/gjson.rs: Get JSON values quickly - JSON parser for Rust https://github.com/tidwall/gjson.rs 24 comments
- GitHub - egladman/herodotus: An IRC bot written in node.JS that logs a channel's activity and saves it to JSON, JSONL, Markdown, or CSV https://github.com/egladman/herodotus 22 comments
- GitHub - kellyjonbrazil/jello: CLI tool to filter JSON and JSON Lines data with Python syntax. (Similar to jq) https://github.com/kellyjonbrazil/jello 18 comments
- GitHub - sgreben/jp: dead simple terminal plots from JSON data. single binary, no dependencies. linux, osx, windows. https://github.com/sgreben/jp 9 comments
- Extracting Data from Invoices with Google AutoML Natural Language https://www.arthurkoziel.com/automl-invoice-data-extraction/ 7 comments
- Parsing 18 billion JSON lines with Go | by Roffe | ITNEXT https://itnext.io/parsing-18-billion-lines-json-with-go-738be6ee5ed2?amp%3Bsk=0a57d3811168ab4d48c37387f69bb92c&source=friends_link 4 comments
- Parallel stream processing with Rayon | More Stina Blog! https://morestina.net/blog/1432/parallel-stream-processing-with-rayon 4 comments
- Preparing your training data | AutoML Natural Language Documentation | Google Cloud https://cloud.google.com/natural-language/automl/entity-analysis/docs/prepare?_ga=2.192825642.-100249489.1558650056 3 comments