WeSearch
TAG · #SCRAPING

Scraping coverage.

The 51 stories in the WeSearch catalog tagged with #scraping, in publish-time order, with view counts. Tag pages update as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RELATED TAGS
#webscraping (20), #python (11), #web-scraping (9), #ai (9), #technology (8), #data (6), #automation (5), #data-scraping (4), #javascript (3), #webdev (3), #dataengineering (3), #cybersecurity (2)
INCLUDE SECURITY RESEARCH BLOG

The smart TV in your living room is a node in the AI scraping economy

In this post we look under the hood of BrightData's SDK and how it turns ordinary consumer TVs into exit nodes of an enormous commercial, residential proxy network leveraged by the…

9 views ·
#technology#privacy#artificial intelligence
DEV.TO (TOP)

Twitch Chat Scraper: export any VOD's full chat replay for $1.05/1K

Quick answer: Twitch stores a complete timestamped chat replay for every public VOD but exposes no...…

12 views ·
#twitch#webscraping#data
DEV.TO (TOP)

Threads Reply Scraper: export the full conversation tree of any public post

Quick answer: Meta's official Threads API is gated behind a developer-account review and refuses...…

11 views ·
#webscraping#data#socialmedia
DEV.TO (TOP)

Steam Regional Price Data: fetch 60 regions in one run for $1.05/1K

Quick answer: Steam publishes regional prices on the public store.steampowered.com/api/appdetails...…

9 views ·
#gaming#data#webscraping
DEV.TO (TOP)

When scraping orchestration is the wrong abstraction for LLM workflows

LLM apps often need structured web data, not a scraping platform. Here's how to choose between orchestration and a simple extraction API.…

9 views ·
#llm#webscraping#api
DEV.TO (TOP)

How I Built a Google Shopping Scraper with Python & Playwright

Why I Built This: I wanted to compare prices across Google Shopping without clicking through 100 tabs...…

9 views ·
#python#playwright#webscraping
DEV.TO (TOP)

HTTP 200 Is a Lie: A 30-Line Schema Canary for Source Drift

A scraper that returns HTTP 200 is not a scraper that returns good data. Those are two different...…

8 views ·
#webscraping#dataengineering#api
DEV.TO (TOP)

I Tested Every Web Scraping Tool Against Lazada — Here's What Actually Works (May 2026)

Installing Scrapling on a 4GB VPS: pitfalls, wiring into an AI agent via MCP, browser selection benchmarks, the Camoufox surprise, and a priority ladder validated against Lazada's …

11 views ·
#webscraping#python#ai
R/SAAS

finally launched my tool to find content gaps on google (stop manual serp scraping lol)

15 views ·
BAILII

AI and the Courts – A Cautionary Tale

16 views ·
#ai#technology#web scraping
DEV.TO (TOP)

The Invisible Digital Footprint: Finding Your Face Without Scraping the Web

Every time you walk through a crowded tourist spot or attend a stadium concert, you become an extra...…

8 views ·
#webdev#privacy#programming
DEV.TO (TOP)

How I scraped the CQC Care Register without hitting the API auth wall

The Care Quality Commission regulates 56,000+ healthcare and social care locations in England — care...…

11 views ·
#data scraping#healthcare#technology
DEV.TO (TOP)

How I built an Ofsted school data API on Apify (without scraping a single webpage)

Most scraping projects start by finding a website to scrape. This one started from the opposite...…

14 views ·
#webdev#datascience#opensource
GOOGLE NEWS

VerticalScope sues OpenAI, claims AI giant infringed copyright by scraping content to train GPT models - Law Times

12 views ·
DEV.TO (TOP)

How I Built an AI-Powered Google Maps Scraper for Lead Generation

The Problem: Every sales team needs local business leads, but manually searching Google...…

13 views ·
#webdev#ai#lead generation
DEV.TO (TOP)

SiteRows example #1:

Hello world! I'm starting this series of examples/use-cases of siterows.com, the new app I recently...…

8 views ·
#webscraping#python#sql
DEV.TO (TOP)

Data Normalization Across Dublin Rental Portals: How to Make Listings Comparable

Dublin...…

11 views ·
#data#dataengineering#webscraping
DEV.TO (TOP)

How to Build Token-Efficient Web Scraping Pipelines for AI Agents Using n8n

TL;DR: Building token-efficient scraping pipelines for AI agents requires stripping heavy...…

12 views ·
#webscraping#automation#ai
DEV.TO (TOP)

Best Java Web Scraping Libraries

TL;DR: Pick Java web scraping libraries based on the target page structure, not on...…

13 views ·
#java#web scraping#libraries
LWN.NET

Fighting the AI Scraperbot Scourge

There are many challenges involved with running a web site like LWN. Some of them, such as fin [...]…

18 views ·
#technology#ai
DEV.TO (TOP)

Optimizing Stealth Browser Fingerprint Integrity and Session Auth

Maintaining execution stealth requires strict alignment between browser fingerprint headers and...…

14 views ·
#cybersecurity#webscraping#python
REAL PYTHON

Quiz: Python Web Scraping

Revisit Requests, Beautiful Soup, Scrapy, and Selenium in this wrap-up quiz covering the Python Web Scraping learning path.…

12 views ·
#python#web scraping#quiz
R/WEBDEV

google news api vs scraping results directly

12 views ·
DEV.TO (TOP)

Why Your Requests + BeautifulSoup Stack Will Fail in Production

TL;DR: requests plus BeautifulSoup is the right tool for tutorials, side projects, and one-off...…

18 views ·
#webscraping#automation#softwareengineering
DEV.TO (TOP)

Why Real Browser Automation Is Replacing Simple HTTP Scraping

The production problem: Simple HTTP scraping still works for a lot of pages. If a site returns...…

13 views ·
#webscraping#automation#technology
DEV.TO (TOP)

Your recurring scraper is re-downloading data that didn't change. Here's the 15-line fix (conditional GET)

Note: This is a cross-post. Canonical version (full long-form) lives on my blog:...…

20 views ·
#webscraping#python#ai
DEV.TO (TOP)

How to know if you actually need mobile proxies (without buying any)

Every scraping project I start, the same question comes up: do I actually need mobile proxies for...…

14 views ·
#webscraping#proxies#technology
DEV.TO (TOP)

BeautifulSoup and Requests for Web Scraping With Python: When Simple Still Works

Not every data collection workflow requires browser automation or complex network impersonation. For...…

16 views ·
#webscraping#python#backend
DEV.TO (TOP)

Open-source Playwright wrapper that passes bot.sannysoft.com, pixelscan, and CreepJS in headless mode

Been scraping for a while and got tired of getting blocked the moment a page loads. Standard...…

10 views ·
#automation#opensource#python
DEV.TO (TOP)

The End of Web Scraping: Introducing WebMCP & Chrome DevTools for Agents

A raw, developer-first look at Google’s proposed WebMCP open standard and Chrome DevTools for Agents...…

12 views ·
#webmcp#chrome#devtools
DEV.TO (TOP)

Stop Fighting the DOM. Selector-First Thinking Will Save Your Scraper.

Most broken scrapers I see have the same shape: someone wrote the extraction logic first and the...…

12 views ·
#webdev#javascript#programming
DEV.TO (TOP)

Advanced Web Scraping with Power Query: Automating Data Extraction for SEO and Analytics

A technical guide to building robust data extraction pipelines using Power Query to automate your SEO auditing and analytics dashboards.…

13 views ·
#web scraping#data extraction#power query
RAILWAY

CodeShot – Web screenshots, scraping, and link previews for AI agents

14 views ·
PACKETSTREAM

AI companies use malware proxies to mount DDoS attacks on web sites

Affordable web scraping proxies for AI startups. Cut costs with $1/GB residential IPs. Scalable, no minimums, fast setup with PacketStream.…

11 views ·
#ai#web scraping#startups
R/WEBDEV

[Showoff Saturday] Web scraping for LLMs was driving us insane, so we built our own Search API with native MCP support

21 views ·
DEV.TO (TOP)

Google Maps Scraper: Build Local Data Pipelines That Actually Run

You do not need another CSV export that works once and quietly dies three days later. A Google Maps...…

14 views ·
#dataengineering#webscraping#automation
DEV.TO (TOP)

HTML Tables with Hidden Data: Scraping What You Can't See

The table shows 10 columns. You export it. The CSV has 10 columns. But the page has 15 columns of...…

10 views ·
#webdev#javascript#data-scraping
WORLD NEWS | THE GUARDIAN

Trump’s allies in danger of scraping false hope from Maga victory in Kentucky primary

US president, like a cult leader whose commune keeps getting smaller, commands fierce loyalty from a shrinking base “Thomas Massie caught in a throuple!” screamed the AI-generated …

17 views ·
#politics#elections#trump
LKML

Linus Torvalds on the continued flood of AI bug reports

14 views ·
#technology#ai#web scraping
R/SELFHOSTED

Maxun v0.0.40: Support for PDF extraction & parsing in our open-source, self-hostable no-code scraping platform!

12 views ·
FOURA BLOG

Post-quantum TLS rolled out last January and broke most open-source scrapers

Your User-Agent header doesn't matter anymore. JA4 fingerprints classify bots at 98.6% accuracy before headers are even read. Here's what shifted in 2026.…

13 views ·
#web scraping#tls#bot detection
REVOLUTION-NETWORK

Show HN: Decentralized compute API on DePIN – scraping, OCR, JavaScript sandbox

Scraping, OCR, code execution and data processing via one API. 3 GB free/week.…

13 views ·
DEV.TO (TOP)

Three memory-leak patterns in long-running scrapers (and how I caught them after 968 Trustpilot runs)

Memory leaks in scrapers do not crash the run. They quietly bump the Apify Memory limit from 1 GB to...…

20 views ·
#webscraping#python#memoryleaks
DEV.TO (TOP)

Scraping dynamic pages with Python, Playwright and AWS Lambda

A practical guide to scraping dynamic JavaScript-heavy pages with Python, Playwright, and AWS Lambda, then saving scheduled parquet snapshots to S3.…

14 views ·
#data#python
KERNEL

Killswitch: Add per-function short-circuit mitigation primitive

13 views ·
#web security#ai scraping#proof-of-work
YCOMBINATOR

Tell HN: Mindie.dev is scraping emails from profiles to send spam

13 views ·
#spam#email#privacy
GITHUB

It's set up, not setup: Scraping GitHub for grammar errors

An exploration of how often the noun 'setup' is used where the verb 'set up' belongs, across 30,000+ public GitHub repositories.…

11 views ·
DEV.TO (TOP)

I built 5 single-platform scrapers. The one that sells fastest is the aggregator that wraps them.

I run a small portfolio of public scrapers on the Apify Store. Most of them are single-platform — one...…

12 views ·
#web scraping#python#indie hacking
R/PROMPTENGINEERING

I've been scraping viral image-gen prompts off X for weeks — here's what I learned about why most "copy this prompt" promises fail, and the tool I built to fix it

9 views ·
VULPINECITRUS

The Day I Logged 1 In Every 2000 Public IPv4: Visualizing The AI Scraper DDoS

In an attempt to grasp the magnitude of web scraper attacks against my websites, I went the way of visualizing.…

13 views ·
#web scraping#ddos#cybersecurity
YCOMBINATOR

Scraping 241 UK council planning portals – 2.6M decisions so far

7 views ·
#data scraping#planning#local government