I built a Mamba1 variant I call SM1 with d_state=1 that runs on Blackwell in pure PyTorch [P]
r/MachineLearning