Kore: Binary File Format Optimized for Modern Data Systems (Open Source)

May 30, 2026 · 8:54 PM UTC · 1 min read

#data #technology #open source #Kore #Spark #Rust

⚡ TL;DR · AI summary

Kore is a new open-source binary columnar file format (v0.1.0) aimed at analytical workloads. Its README reports a 38% compression ratio, versus 63% for Parquet, and a 131x query speedup when column pruning and predicate pushdown apply. The project also advertises native Spark integration through PySpark and a Rust library with simple read and write functions.
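
The README excerpt does not define "compression ratio". As a rough sketch, assuming it means compressed size as a percentage of the original size (so lower is better), the reported 38% for Kore and 63% for Parquet can be compared as follows. The function names are hypothetical and are not part of the Kore crate.

```rust
/// Compressed size in bytes, ASSUMING "ratio" means
/// compressed size / original size, expressed in percent.
fn compressed_size(original_bytes: u64, ratio_percent: u64) -> u64 {
    original_bytes * ratio_percent / 100
}

/// Size of format A's output relative to format B's output,
/// given both ratios in percent under the same assumption.
fn relative_size(ratio_a_percent: u64, ratio_b_percent: u64) -> f64 {
    ratio_a_percent as f64 / ratio_b_percent as f64
}
```

Under that assumption, 1,000,000 bytes of raw data would occupy about 380,000 bytes in Kore and 630,000 bytes in Parquet, so the Kore file would be roughly 60% of the Parquet file's size. If the project uses a different definition, these numbers do not apply.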

Key facts

▪ Kore reports a 38% compression ratio, compared to 63% for Parquet.
▪ It reports a 131x query speedup from column pruning and predicate pushdown.
▪ The format includes native Spark integration for reading and writing with PySpark.
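
Column pruning and predicate pushdown, named in the facts above, are general columnar-format techniques. The following is a minimal in-memory sketch of both, not Kore's implementation or on-disk layout: the `ColumnChunk`, `RowGroup`, `scan`, and `sample` names are hypothetical, and the per-chunk min/max statistics are an assumption borrowed from how columnar formats commonly work.

```rust
use std::collections::HashMap;

/// One column's values within a row group, plus min/max statistics.
/// Hypothetical layout for illustration only.
struct ColumnChunk {
    values: Vec<i64>,
    min: i64,
    max: i64,
}

impl ColumnChunk {
    fn new(values: Vec<i64>) -> Self {
        let min = *values.iter().min().expect("chunk must be non-empty");
        let max = *values.iter().max().expect("chunk must be non-empty");
        ColumnChunk { values, min, max }
    }
}

/// A row group maps column names to chunks of equal length.
type RowGroup = HashMap<String, ColumnChunk>;

/// Return `project` values for rows where `lo <= filter_col <= hi`,
/// plus the number of row groups that actually had to be read.
/// Panics if a named column is missing from a group.
fn scan(
    groups: &[RowGroup],
    project: &str,
    filter_col: &str,
    lo: i64,
    hi: i64,
) -> (Vec<i64>, usize) {
    let mut out = Vec::new();
    let mut groups_read = 0;
    for group in groups {
        let filter = &group[filter_col];
        // Predicate pushdown: the statistics prove no row can match,
        // so the whole group is skipped without reading its values.
        if filter.max < lo || filter.min > hi {
            continue;
        }
        groups_read += 1;
        // Column pruning: only the filter column and the projected
        // column are touched; every other column is ignored.
        let projected = &group[project];
        for (f, p) in filter.values.iter().zip(&projected.values) {
            if *f >= lo && *f <= hi {
                out.push(*p);
            }
        }
    }
    (out, groups_read)
}

/// Build two row groups with three columns each.
fn sample() -> Vec<RowGroup> {
    let mut g1 = RowGroup::new();
    g1.insert("id".to_string(), ColumnChunk::new(vec![1, 2, 3]));
    g1.insert("price".to_string(), ColumnChunk::new(vec![5, 8, 9]));
    g1.insert("qty".to_string(), ColumnChunk::new(vec![10, 20, 30]));
    let mut g2 = RowGroup::new();
    g2.insert("id".to_string(), ColumnChunk::new(vec![4, 5, 6]));
    g2.insert("price".to_string(), ColumnChunk::new(vec![50, 60, 70]));
    g2.insert("qty".to_string(), ColumnChunk::new(vec![40, 50, 60]));
    vec![g1, g2]
}
```

Calling `scan(&sample(), "id", "price", 55, 100)` skips the first row group entirely because its maximum price is below 55, and never touches the `qty` column in either group. How much a real query gains from these two techniques depends on how selective the filter is and how many columns the query leaves out.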

Original article

GitHub

Opening excerpt (first ~120 words)

🚀 Kore: Killer Optimized Record Exchange

The fastest, most compressed columnar format for big data | v0.1.0

KORE is a high-performance binary file format optimized for analytical workloads. It provides:

▪ 38% compression ratio (vs 63% for Parquet)
▪ 131x query speedup with column pruning & predicate pushdown
▪ Zero data loss verification (400K+ cells tested)
▪ Native Spark integration: read/write with PySpark

Quick Start

Rust Library

Add this crate as a dependency (when published) or include from path:

    use kore_fileformat::*;

    // Write data
    kore_write_simple("output.kore", schema_json, data_json)?;

    // Read data
    let data = kore_read_simple("output.kore")?;

    // Read specific column
    let col = kore_read_col_simple("output.kore", "column_name")?;

    // Get file info
    let info =…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
