Hacker News
- The State of Web Scraping 2022 https://scrapeops.io/blog/the-state-of-web-scraping-2022/ 144 comments
Linked pages
- GitHub - microsoft/playwright: Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API. https://github.com/microsoft/playwright 239 comments
- Symfony, High Performance PHP Framework for Web Development https://symfony.com/ 187 comments
- jsoup: Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety https://jsoup.org/ 42 comments
- GitHub - binux/pyspider: A Powerful Spider(Web Crawler) System in Python. https://github.com/binux/pyspider 40 comments
- GitHub - gocolly/colly: Elegant Scraper and Crawler Framework for Golang https://github.com/gocolly/colly/tree/master 39 comments
- PhantomJS - Scriptable Headless Browser http://www.phantomjs.org/ 35 comments
- Apify: Full-stack web scraping and data extraction platform https://www.apify.com/ 21 comments
- Diffbot | Knowledge Graph, AI Web Data Extraction and Crawling http://diffbot.com/ 5 comments
- ScrapeOps - The DevOps Tool For Web Scraping. | ScrapeOps https://scrapeops.io/ 2 comments
- GitHub - alirezamika/autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python https://github.com/alirezamika/autoscraper 1 comment
- GitHub - PuerkitoBio/goquery: A little like that j-thing, only in Go. https://github.com/PuerkitoBio/goquery 1 comment
- GitHub - FriendsOfPHP/Goutte: Goutte, a simple PHP Web Scraper https://github.com/FriendsOfPHP/Goutte 0 comments
- HtmlUnit – Welcome to HtmlUnit https://htmlunit.sourceforge.io/ 0 comments
- GitHub - vifreefly/kimuraframework: Kimurai is a modern web scraping framework written in Ruby which works out of box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows to scrape and interact with JavaScript rendered websites https://github.com/vifreefly/kimuraframework 0 comments
- Nokogiri http://nokogiri.org/ 0 comments
- Apache Nutch™ http://nutch.apache.org/#08+June+2013+-+Apache+Nutch+v2.2+Released 0 comments
- GitHub - segmentio/nightmare: A high-level browser automation library. https://github.com/segmentio/nightmare 0 comments
- GitHub - rushter/selectolax: Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors). https://github.com/rushter/selectolax 0 comments
- Jaunt - Java Web Scraping & JSON Querying http://jaunt-api.com/ 0 comments
- Apify Documentation · Apify Documentation | Apify Documentation https://sdk.apify.com/ 0 comments