Exact UNORM8 to Float

TL;DR (AI summary)

The article discusses converting UNORM8 values (8-bit unsigned integers that represent numbers in [0,1]) to float32 on GPUs. The goal is a result identical to the correctly rounded quotient of the integer divided by 255, obtained without doubles and without a division. The author proposes an approach based on a geometric series for 1/255.
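
The summary does not reproduce the series itself. As a hedged illustration only (the full article may set it up differently), one standard identity that a geometric-series approach to dividing by 255 can build on is shown below; here x is an assumed name for the 8-bit input value.

```latex
\frac{1}{255} \;=\; \frac{2^{-8}}{1 - 2^{-8}} \;=\; \sum_{k=1}^{\infty} 2^{-8k}
\qquad\Longrightarrow\qquad
\frac{x}{255} \;=\; \sum_{k=1}^{\infty} x \cdot 2^{-8k},
\quad x \in \{0, 1, \dots, 255\}
```

Read in binary, this says that for x < 255 the expansion of x/255 is the 8-bit pattern of x repeated indefinitely after the binary point, which is why the quotient is never exactly representable in float32 except at the endpoints 0 and 1.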

Original article: The ryg blog
Opening excerpt (first ~120 words)

GPUs support UNORM formats that represent a number inside [0,1] as an 8-bit unsigned integer. In exact arithmetic, the conversion to a floating-point number is straightforward: take the integer and divide it by 255. 8-bit integers are for sure machine numbers (exactly represented) in float32 and so is 255, so if you’re willing to do a “proper” divide, that’s the end of it; both inputs are exact, so the result of the division is the same as the result of the computation in exact arithmetic rounded to the nearest float32 (as per active rounding mode anyway), which is the best we can hope for.

Excerpt limited to ~120 words for fair-use compliance. The full article is at The ryg blog.
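
The "proper" divide described in the excerpt can serve as a reference to test any cheaper conversion against, since there are only 256 inputs. The sketch below is not taken from the article: the helper names (to_f32, unorm8_to_float_ref, unorm8_to_float_mul, mismatches) are made up for illustration, float32 is emulated by rounding Python's binary64 floats through struct, and the reciprocal-multiply candidate is just an obvious thing to try, not the author's method.

```python
import struct


def to_f32(v: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def unorm8_to_float_ref(x: int) -> float:
    """Reference: x / 255 rounded to float32 (the "proper" divide).

    Assumption: rounding the binary64 quotient to float32 gives the same
    result as a direct float32 divide for these 256 inputs.
    """
    return to_f32(x / 255)


# float32 value nearest to 1/255 (hypothetical constant name)
RCP_255 = to_f32(1 / 255)


def unorm8_to_float_mul(x: int) -> float:
    """Candidate: a single float32 multiply by the rounded reciprocal."""
    # An 8-bit x times a 24-bit significand is exact in binary64, so one
    # rounding to float32 models what a float32 multiply would return.
    return to_f32(x * RCP_255)


def mismatches(candidate):
    """Return every input where candidate disagrees with the reference."""
    return [x for x in range(256) if candidate(x) != unorm8_to_float_ref(x)]


if __name__ == "__main__":
    bad = mismatches(unorm8_to_float_mul)
    print(f"{len(bad)} of 256 inputs differ from the divide: {bad}")
```

Running the script lists the inputs, if any, where the single multiply fails to reproduce the divide. Any other candidate conversion can be passed to mismatches in the same way; an empty list means it matches the reference bit for bit.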
