Tools & Techniques

Web scraping tools and techniques quick reference

E-commerce Web Scraping Best Practice
TABLE OF CONTENTS Why Using This Best Practice 1. Preliminary Study 1.1. Technology Stack 1.2. API search 1.3. JSON Search 1.3. Pagination 2. Cod...
Thu, 31 Mar, 2022 at 6:58 AM
Interesting links
This article is a melting pot of interesting material we come across that looks at common issues or simply tries to sketch the future. Developing AI-B...
Tue, 19 Apr, 2022 at 6:39 PM
Playwright
TABLE OF CONTENTS What is Playwright? Our View on Playwright Usage Rating of Playwright Configuration When to use Playwright Reference and document...
Wed, 6 Apr, 2022 at 3:04 PM
Playwright - playwright_stealth
What is playwright_stealth? A Playwright module useful for obfuscating the bot and making it look like a regular navigation session. Our View on playwright_stealt...
Wed, 6 Apr, 2022 at 2:49 PM
Puppeteer
What is Puppeteer? Puppeteer is a browser automation tool useful for web scraping.  Our View on Puppeteer Usage Rating of Puppeteer 2. SECOND BEST: ...
Thu, 7 Apr, 2022 at 9:10 AM
Scrapy
TABLE OF CONTENTS What is Scrapy? Our View on Scrapy Usage Rating of Scrapy Configurations Our standard and best practices Please read our standa...
Thu, 31 Mar, 2022 at 1:13 AM
Scrapy_proxies (Scrapy module)
TABLE OF CONTENTS What is scrapy_proxies? Our View on scrapy_proxies Usage Rating of scrapy_proxies Settings When to use scrapy_proxies Reference a...
Thu, 31 Mar, 2022 at 1:14 AM
Scrapy_splash (Scrapy module)
TABLE OF CONTENTS What is scrapy_splash? Our View on scrapy_splash Usage Rating of scrapy_splash Configuration When to use scrapy_splash Reference ...
Thu, 31 Mar, 2022 at 1:15 AM
Selenium Webdriver
What is Selenium Webdriver? Selenium Webdriver is a web application testing suite that is also used for web scraping. Our View on Selenium Webdriver Usage Ra...
Sat, 2 Apr, 2022 at 11:20 AM
Wappalyzer Chrome Extension
What is Wappalyzer Chrome Extension? Wappalyzer is a browser extension that uncovers the technologies used on websites. It detects content management syste...
Tue, 29 Mar, 2022 at 5:27 PM