WeSearch

Kore: Binary File Format Optimized for Modern Data Systems (Open Source)

#data #technology #open source #Kore #Spark #Rust
⚡ TL;DR · AI summary

Kore is a new open-source binary file format for analytical workloads, currently at v0.1.0. Its README reports a 38% compression ratio (versus 63% for Parquet) and a 131x query speedup from column pruning and predicate pushdown. It also reports zero-data-loss verification on more than 400,000 tested cells, native Spark integration for reading and writing with PySpark, and a Rust library with simple read and write functions.
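To put the two compression figures side by side, here is a small arithmetic sketch. It assumes "compression ratio" means compressed size divided by original size, which the excerpt does not state; the function names and the 100 MB input are illustrative only and are not part of Kore.

```rust
/// Compressed size in bytes, ASSUMING ratio = compressed size / original size.
fn compressed_size(original_bytes: f64, ratio: f64) -> f64 {
    original_bytes * ratio
}

/// Fraction by which a file at `ratio_a` is smaller than a file at `ratio_b`
/// for the same input data.
fn relative_saving(ratio_a: f64, ratio_b: f64) -> f64 {
    1.0 - ratio_a / ratio_b
}

fn main() {
    // Hypothetical 100 MB input; the ratios are the figures quoted in the README.
    let original = 100.0 * 1024.0 * 1024.0;
    let kore = compressed_size(original, 0.38);
    let parquet = compressed_size(original, 0.63);

    println!("Kore:    {:.1} MB", kore / (1024.0 * 1024.0));
    println!("Parquet: {:.1} MB", parquet / (1024.0 * 1024.0));
    println!(
        "Kore is {:.1}% smaller than Parquet",
        relative_saving(0.38, 0.63) * 100.0
    );
}
```

Under that assumption, the quoted figures would make a Kore file roughly 40% smaller than the equivalent Parquet file. If the README uses a different definition of the ratio, the comparison changes accordingly.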

Original article: GitHub
Opening excerpt (first ~120 words)

🚀 Kore: Killer Optimized Record Exchange
The fastest, most compressed columnar format for big data | v0.1.0

KORE is a high-performance binary file format optimized for analytical workloads. It provides:

- 38% compression ratio (vs 63% for Parquet)
- 131x query speedup with column pruning & predicate pushdown
- Zero data loss verification (400K+ cells tested)
- Native Spark integration: read/write with PySpark

Quick Start

Rust Library

Add this crate as a dependency (when published) or include from path:

    use kore_fileformat::*;

    // Write data
    kore_write_simple("output.kore", schema_json, data_json)?;

    // Read data
    let data = kore_read_simple("output.kore")?;

    // Read specific column
    let col = kore_read_col_simple("output.kore", "column_name")?;

    // Get file info
    let info =…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
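
The excerpt credits the query speedup to column pruning and predicate pushdown without showing how they work. The sketch below illustrates the general idea on a toy in-memory table. It is not Kore's on-disk layout or API: `Block`, `Table`, `scan_greater_than`, and `sample_table` are hypothetical names invented for this example, and per-block min/max statistics are an assumption borrowed from how columnar formats commonly implement pushdown.

```rust
use std::collections::HashMap;

/// One block of a single integer column, with min/max statistics.
/// Hypothetical structure for illustration only.
struct Block {
    min: i64,
    max: i64,
    values: Vec<i64>,
}

impl Block {
    fn new(values: Vec<i64>) -> Self {
        let min = *values.iter().min().expect("block must not be empty");
        let max = *values.iter().max().expect("block must not be empty");
        Block { min, max, values }
    }
}

/// Toy columnar table: column name -> list of blocks.
struct Table {
    columns: HashMap<String, Vec<Block>>,
}

/// Returns the values in `column` greater than `threshold`, plus the number
/// of blocks that contributed values.
/// Column pruning: only the requested column is ever touched.
/// Predicate pushdown: a block whose max is <= threshold is skipped outright.
fn scan_greater_than(table: &Table, column: &str, threshold: i64) -> (Vec<i64>, usize) {
    let mut out = Vec::new();
    let mut scanned = 0;
    if let Some(blocks) = table.columns.get(column) {
        for block in blocks {
            if block.max <= threshold {
                continue; // statistics prove that no value can match
            }
            scanned += 1;
            if block.min > threshold {
                // Statistics prove that every value matches; no per-value check.
                out.extend_from_slice(&block.values);
            } else {
                out.extend(block.values.iter().copied().filter(|v| *v > threshold));
            }
        }
    }
    (out, scanned)
}

/// Small made-up data set with two columns.
fn sample_table() -> Table {
    let mut columns = HashMap::new();
    columns.insert(
        "price".to_string(),
        vec![
            Block::new(vec![1, 5, 9]),
            Block::new(vec![20, 35, 50]),
            Block::new(vec![7, 8, 12]),
        ],
    );
    columns.insert("quantity".to_string(), vec![Block::new(vec![100, 200, 300])]);
    Table { columns }
}

fn main() {
    let table = sample_table();
    let (rows, scanned) = scan_greater_than(&table, "price", 10);
    println!("matches: {:?}, blocks scanned: {}", rows, scanned);
}
```

In this toy run the `quantity` column is never read, and the first `price` block is skipped on its statistics alone. A real format gains the same way, except that the skipped work is disk I/O and decompression rather than an in-memory loop.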

