The original was posted on /r/machinelearning by /u/evilevidenz on 2024-09-06 06:52:34+00:00.
Usually people respond with "Because NVIDIA had more time and more money." However, why can't AMD catch up? What exactly makes optimizing ROCm so hard?

It would be helpful if you could point to some resources, or make your answer as detailed as possible about the implementation of specific kernels and data structures, and about how CUDA calls are actually made and optimized from Triton or XLA. Thx :)
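For context on the Triton part of the question, here is a minimal sketch (based on the standard Triton vector-add example; the names `add_kernel` and `add` are just illustrative). The kernel source itself is backend-agnostic: Triton lowers it through LLVM to PTX on NVIDIA GPUs and to AMDGCN on AMD GPUs, so the gap the question asks about lives in that lowering and in the hand-tuned libraries underneath, not in user-facing code like this.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)


def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    # Launch one program per block of BLOCK_SIZE elements.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out


if __name__ == "__main__":
    # "cuda" is also the device string used by ROCm builds of PyTorch.
    x = torch.rand(4096, device="cuda")
    y = torch.rand(4096, device="cuda")
    print(torch.allclose(add(x, y), x + y))
```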