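Remarkably, johnnytshi ported an entire CUDA backend to AMD ROCm in just 30 minutes, without using any translation layer. Another benefit, of course, is that there is no need to set up a complex translation environment such as Hipify; the work can be done directly from the command line (CLI).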
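A recent post set the tech community abuzz: on Reddit, developer johnnytshi described how Claude Code took only 30 minutes to port a complete CUDA backend to AMD's ROCm platform, with no intermediate translation layer required.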
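The current competition among GPU programming languages is, at its core, a fight for control of the ecosystem. CUDA Tile builds a technical moat by integrating NVIDIA's hardware resources, while TileLang reshapes the development paradigm through its openness. This shift not only affects the market positions of hardware vendors but will also determine whether AI developers can escape the passive position in which "the shovel dictates the model". As cross-platform compilation technology continues to evolve, the GPU computing ecosystem may become multipolar, and developers' freedom to choose their toolchains will become a key factor shaping the industry's direction.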
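"Matrix multiplication is one of the most central moats of NVIDIA's CUDA ecosystem. In large-scale, systematic evaluations, the CUDA-L2 we built outperforms NVIDIA's closed-source optimizations for this core operator. We have not only surpassed them but also open-sourced the method, which is a milestone in breaking down technical barriers." DeepReinforce ...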
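DeepSeek-R1 generates custom CUDA kernels and leads in performance for optimized GPU programming. [Overview] Researchers at Stanford and Princeton found that the custom CUDA kernels generated by DeepSeek-R1 soundly beat those from o1 and Claude 3.5 ...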
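DeepSeek's recent moves in AI technology have drawn wide attention. According to Tom's Hardware and other foreign media outlets, the company is working intensively on a large language model, and notably the project has already bypassed NVIDIA's widely used CUDA framework. The industry sees this technical choice as a forward-looking step by DeepSeek toward future compatibility with domestic GPU chips ...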
Back in 2020, AMD announced it was splitting its post-GCN architecture into RDNA for gaming, with CDNA for its data center GPUs, with CDNA later being the architecture of its Radeon Instinct AI ...
TL;DR: NVIDIA is celebrating CES 2025 with GeForce Giveaways, featuring the GeForce 8800 Ultra, the first CUDA GPU, signed by CEO Jensen Huang. Participants can enter by commenting on a post on X. The ...
Enterprises Can Use Microservices to Accelerate Data Processing, LLM Customization, Inference, Retrieval-Augmented Generation and Guardrails Adopted by Broad AI Ecosystem, Including Leading ...
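The capital market is enthusiastically backing the domestic GPU story. Having missed the last one, investors are determined not to miss this one. This time, the spotlight belongs to MetaX (沐曦股份). Today (November 17), MetaX listed on the A-share market, opening 568.83% higher on its first day at 700 yuan; at the opening price, a single winning allotment would earn 297,700 yuan. That exceeds Moore Threads' 267,900 yuan and sets a new A-share record for profit per winning allotment.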
We all know CUDA is currently king of the hill when it comes to GPGPU & ML in particular, and that CUDA is an NVIDIA product limited to NVIDIA hardware, and that Apple & NVIDIA “don’t get along” i.e.