Exact UNORM8 to Float

Jun 3, 2026 · 8:29 AM UTC · 5 min read

TL;DR · WeSearch summary

The article covers converting 8-bit UNORM values to floating-point numbers on GPUs. It explains why the exact conversion is costly and looks at how to get the same result without using doubles or division. The author proposes a geometric series to perform the conversion exactly.

Key facts

▪GPU UNORM8 formats store a number in [0,1] as an 8-bit unsigned integer; in exact arithmetic, the conversion to floating point is a division by 255.
▪A proper float32 divide gives the correctly rounded result but is considered expensive, so alternative methods are explored.
▪The author suggests using a geometric series to achieve an exact conversion without relying on doubles or division.
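
For orientation, the identity that a geometric-series approach to dividing by 255 usually rests on is written out below. This is the standard series for 1/255, not a transcription of the author's derivation, which the summary does not reproduce; the symbol x for the 8-bit input is an assumption for illustration.

```latex
\frac{1}{255}
  = \frac{1}{256}\cdot\frac{1}{1 - 2^{-8}}
  = \sum_{k=1}^{\infty} 2^{-8k},
\qquad
\frac{x}{255} = x\left(2^{-8} + 2^{-16} + 2^{-24} + \cdots\right),
\quad x \in \{0, 1, \dots, 255\}.
```

In binary, x/255 is therefore the 8-bit pattern of x repeated indefinitely after the binary point, which is why a float32 result generally has to be rounded.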

Original article

The ryg blog


Opening excerpt (first ~120 words)

GPUs support UNORM formats that represent a number inside [0,1] as an 8-bit unsigned integer. In exact arithmetic, the conversion to a floating-point number is straightforward: take the integer and divide it by 255. 8-bit integers are for sure machine numbers (exactly represented) in float32 and so is 255, so if you’re willing to do a “proper” divide, that’s the end of it; both inputs are exact, so the result of the division is the same as the result of the computation in exact arithmetic rounded to the nearest float32 (as per active rounding mode anyway), which is the best we can hope for.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at The ryg blog.
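
The "proper" divide described in the excerpt can be sketched in C as follows. This is a minimal illustration, assuming IEEE 754 float32 arithmetic and default compiler settings (no fast-math style flags); the function names `unorm8_to_float_div` and `unorm8_count_mismatches` are hypothetical and do not come from the article.

```c
#include <stdint.h>

/* Reference conversion: both operands are exactly representable in
 * float32, so a single IEEE 754 divide returns x/255 rounded once. */
static float unorm8_to_float_div(uint8_t x)
{
    return (float)x / 255.0f;
}

/* Hypothetical self-check (not from the article): count the inputs for
 * which the float32 divide disagrees with a double-precision divide
 * that is afterwards rounded to float32. */
static int unorm8_count_mismatches(void)
{
    int mismatches = 0;
    for (int i = 0; i < 256; ++i) {
        float via_float = unorm8_to_float_div((uint8_t)i);
        float via_double = (float)((double)i / 255.0);
        if (via_float != via_double)
            ++mismatches;
    }
    return mismatches;
}
```

Under those assumptions, the two routes are expected to agree for all 256 inputs, so the float32 divide serves as a baseline that any cheaper scheme can be compared against.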
