Scalar and Binary Quantization for Pgvector Vector Search and Storage (2024)
The article discusses scalar and binary quantization techniques in pgvector 0.7.0 to reduce storage and memory usage for high-dimensional vectors in PostgreSQL. These methods compress vector data by reducing dimension precision or converting values to binary, enabling more efficient indexing and search at the cost of potential accuracy tradeoffs. The upcoming release supports 2-byte floats and bit vectors, allowing larger vector dimensions and improved scalability for AI/ML workloads.
- A 1,536-dimensional fp32 vector requires 6 KiB of storage (1,536 × 4 bytes), and one million such vectors consume about 5.7 GiB before any index is built.
- Scalar quantization shrinks the size of each dimension (e.g., from a 4-byte float to a 1-byte integer), while binary quantization converts each value to a single bit based on its sign.
- pgvector 0.7.0 introduces support for halfvec (2-byte floats) and bit vectors, enabling indexing of vectors with up to 4,000 and 64,000 dimensions, respectively.
- Quantization reduces storage and memory needs but may hurt search relevancy or performance because information is lost.
- The techniques use PostgreSQL expression indexes and HNSW to balance efficiency, scalability, and accuracy in vector search workloads.
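
The arithmetic and the two quantization ideas in the points above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not pgvector's implementation: the function names are hypothetical, the sizes ignore any per-row header or index overhead, and the linear min/max int8 scheme is just one assumed way to map floats onto 1-byte integers.

```python
import struct


def payload_bytes(dims: int, bits_per_dim: int) -> int:
    """Raw vector payload in bytes, rounded up; ignores row and index overhead."""
    return (dims * bits_per_dim + 7) // 8


def to_half(values):
    """Scalar quantization to 2-byte floats: round-trip through IEEE 754 half precision."""
    return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]


def to_int8(values):
    """Scalar quantization to 1-byte integers (assumed scheme: linear scaling over min..max)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # avoid dividing by zero for constant vectors
    return [round((v - lo) / scale) - 128 for v in values]


def to_bits(values):
    """Binary quantization: 1 for positive values, 0 otherwise."""
    return [1 if v > 0 else 0 for v in values]


if __name__ == "__main__":
    dims = 1536
    for label, bits in (("fp32", 32), ("2-byte float", 16), ("bit", 1)):
        size = payload_bytes(dims, bits)
        total_gib = 1_000_000 * size / 2**30
        print(f"{label:>12}: {size} bytes per vector, {total_gib:.2f} GiB per million")

    sample = [0.75, -0.5, 0.001, -2.25]
    print(to_half(sample))
    print(to_int8(sample))
    print(to_bits(sample))
```

Under these assumptions, going from 4-byte floats to 2-byte floats halves the payload, and binary quantization shrinks it by a factor of 32. The price is visible in `to_bits`, where only the sign of each value survives.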
Opening excerpt (first ~120 words)
Tue, Apr 9, 2024. 21-minute read.

While many AI/ML embedding models generate vectors that provide large amounts of information by using high dimensionality, this can come at the cost of using more memory for searches and more overall storage. Both of these can have an impact on the cost and performance of a system that’s storing vectors, including when using PostgreSQL with pgvector for these use cases.

When I talk about vector search in PostgreSQL, I have a slide that I like to call “no shortcuts without tradeoffs” that calls out the different challenges around searching vectors in a database.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is available on Jonathan Katz's site.