We added W8A8 activation quantization to MLX — prefill went from 2.84s to 2.52s on M5 Pro
r/LocalLLaMA