WeSearch

Notes on respectfully getting a personal copy of a website's contents

⚡ TL;DR · AI summary

The article explains why some visitors are blocked from a personal blog (Wandering Thoughts): its anti-crawler precautions treat certain browser versions as suspicious, most often because they are too old, and most often versions of Chrome. The blocking is a response to high-volume crawlers that present a variety of old browser user agents. The author provides guidance for users who run into the block and suggests alternatives for archiving content.

Original article
Utoronto
Opening excerpt (first ~120 words)

You're using a suspiciously old browser

You're probably reading this page because you've attempted to access some part of my blog (Wandering Thoughts) or CSpace, the wiki thing it's part of. Unfortunately you're using a browser version that my anti-crawler precautions consider suspicious, most often because it's too old (most often this applies to versions of Chrome). Unfortunately, as of early 2025 there's a plague of high volume crawlers (apparently in part to gather data for LLM training) that use a variety of old browser user agents, especially Chrome user agents. To reduce the load on Wandering Thoughts I'm experimenting with (attempting to) block all of them, and you've run into this.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Utoronto.
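
As a rough illustration of the kind of check the excerpt describes, the sketch below pulls the Chrome major version out of a User-Agent string and flags it when it falls below a cutoff. This is not the author's actual implementation, which the excerpt does not show. The function names and the cutoff value `MIN_CHROME_MAJOR` are hypothetical and chosen only for the example.

```python
import re
from typing import Optional

# Hypothetical cutoff, for illustration only. The excerpt does not say
# which versions its precautions treat as too old.
MIN_CHROME_MAJOR = 100

# Matches the "Chrome/<major>." token found in Chrome-style User-Agent strings.
_CHROME_RE = re.compile(r"\bChrome/(\d+)\.")


def chrome_major_version(user_agent: str) -> Optional[int]:
    """Return the Chrome major version claimed by the User-Agent, or None."""
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def looks_suspiciously_old(user_agent: str,
                           min_major: int = MIN_CHROME_MAJOR) -> bool:
    """True when the User-Agent claims a Chrome version below the cutoff."""
    major = chrome_major_version(user_agent)
    return major is not None and major < min_major
```

A check like this only sees what the client claims to be, so it cannot tell a crawler from a real visitor on an outdated browser. That is consistent with the excerpt's point that ordinary readers can run into the block.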
