WeSearch

Show HN: Unsiloed AI – #1 on olmOCR-Bench, Beats Reducto, LlamaParse and GPT-5.5

⚡ TL;DR · AI summary

Unsiloed AI has built a document parser aimed at the inputs where most parsers fail: complex tables, handwriting, historical scans, equations, and multi-column layouts. Its latest version, v3.1, ranked #1 on olmOCR-Bench with a strict pass rate of 88.0, ahead of 18 other OCR services. The evaluation covered 1,403 PDFs and 8,413 unit tests, and a review of the failure cases showed that many were formatting differences rather than OCR errors.

Original article
Ycombinator
Opening excerpt (first ~120 words)

Most document parsers fail on real-world challenges such as complex tables, handwritten documents, historical document scans, equations, multi-column layouts, and complex reading order. We built Unsiloed Parser to handle exactly these cases.

Our latest parser, v3.1, ranked #1 on olmOCR-Bench with a strict pass rate of 88.0. We ran the evaluation across 1,403 PDFs and 8,413 unit tests using the unmodified upstream Allen AI scorer (olmocr==0.4.27) and found that Unsiloed beats 18 other OCR services, including GPT-5.5, Claude Opus 4.7, LlamaParse, Reducto, Azure Document Intelligence, AWS Textract, and Unstructured.

When we dug deeper into the failure cases, we found that many errors were not OCR errors but differences such as \frac vs \dfrac, whitespace, or equivalent LaTeX renderings.
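To make the failure cases above concrete, here is a minimal Python sketch of how a strict string comparison can reject output that differs only cosmetically. It is an illustration only: it is not the olmOCR scorer's logic and not Unsiloed's method, and the names `normalize_latex`, `strict_match`, `lenient_match`, and the `EQUIVALENT` table are hypothetical.

```python
import re

# Hypothetical tokenizer: a control word (\frac), a control symbol (\{),
# or any single non-whitespace character. Whitespace is discarded.
TOKEN = re.compile(r"\\[A-Za-z]+|\\.|\S")

# Assumed equivalence table for this sketch: fraction variants that differ
# only in display style are mapped to plain \frac.
EQUIVALENT = {r"\dfrac": r"\frac", r"\tfrac": r"\frac"}


def normalize_latex(expr: str) -> list[str]:
    """Return a token list with cosmetic differences removed."""
    return [EQUIVALENT.get(tok, tok) for tok in TOKEN.findall(expr)]


def strict_match(expected: str, actual: str) -> bool:
    """Exact string equality: any formatting difference is a failure."""
    return expected == actual


def lenient_match(expected: str, actual: str) -> bool:
    """Equality after normalization: only token-level differences fail."""
    return normalize_latex(expected) == normalize_latex(actual)


expected = r"\frac{a}{b} + c"
actual = r"\dfrac{a}{b}+c"

print(strict_match(expected, actual))   # False
print(lenient_match(expected, actual))  # True
```

Comparing token lists rather than stripping all whitespace keeps `\alpha x` distinct from `\alphax`, so the lenient check ignores spacing without merging a command with the letter that follows it.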

Excerpt limited to ~120 words for fair-use compliance. The full article is at Ycombinator.
