I scraped 1.94M Airbnb photos for opium dens, pet cameos, and messy kitchens
119 cities x 4 quarterly snapshots of Inside Airbnb data. 1.7M photos through CLIP, the most suspicious shortlists double-checked with Claude Haiku Vision, 50.7M reviews scored through Haiku, all parallelized on Burla.
Opening excerpt (first ~120 words)
Burla demo · April 2026
Every Airbnb, looked at all at once.
Every public listing in Inside Airbnb's open dump: 119 cities, 4 quarterly snapshots. We scored 1.7M photos with CLIP (a model that turns an image into a vector you can compare to a text prompt), shortlisted the most suspicious ones, and had Claude Haiku Vision double-check each shortlist. We also scored every review and reranked the weirdest 12K with Haiku. Everything was parallelized on Burla, on a single dynamic cluster that scaled to ~1.7K CPU workers for photo download and CLIP, with 20 A100 GPUs running embedding clusters in parallel on the same cluster.
Listings, reviews, and calendars come straight from public Inside Airbnb dumps.
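To make the "compare an image vector to a text prompt" step concrete, here is a minimal sketch of scoring and shortlisting. It assumes the comparison is cosine similarity, which is the usual way to compare CLIP embeddings but is not confirmed by the excerpt. The three-dimensional vectors, the photo IDs, and the `shortlist` helper are all hypothetical stand-ins; real CLIP embeddings have hundreds of dimensions and come out of the model, not a hand-written dict.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def shortlist(photo_vectors, prompt_vector, k):
    """Return the k (score, photo_id) pairs most similar to the prompt, best first."""
    scored = [
        (cosine_similarity(vec, prompt_vector), photo_id)
        for photo_id, vec in photo_vectors.items()
    ]
    return heapq.nlargest(k, scored)


# Toy embeddings (hypothetical): one vector per photo, one for the text prompt.
photo_vectors = {
    "photo_a": [0.9, 0.1, 0.0],
    "photo_b": [0.0, 1.0, 0.0],
    "photo_c": [0.7, 0.6, 0.1],
}
prompt_vector = [1.0, 0.0, 0.0]  # stand-in for an embedded prompt such as "a messy kitchen"

top = shortlist(photo_vectors, prompt_vector, k=2)
for score, photo_id in top:
    print(f"{photo_id}: {score:.3f}")
```

In a pipeline like the one described, only the shortlisted photos would then go to a slower, more expensive vision model for a second opinion, which keeps that second pass small relative to the 1.7M photos scored.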
…
Excerpt limited to ~120 words for fair-use compliance. The full article is on GitHub.