GPU Kernel Programming

14 hours ago

32B Overtakes GPT-5.2: StitchCUDA, the First End-to-End GPU Programming Agent Framework, Debuts

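The results are significant: StitchCUDA cuts the hacking rate from Kevin-32B's 52% to 16%, and hacking from 4 instances to 0. In the StitchCUDA-A variant, which removes the Rubric, the hacking rate climbs back to 32%, further confirming the causal effect of the Rubric Reward.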

Tencent News (腾讯网)

Toward Programmable Observability: Building eBPF-Style Performance Probes in GPU Kernels

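This article traces the author's learning path and walks readers through the evolution of GPU kernel performance analysis from the macro level to the micro level. Introduction: As an engineer who uses eBPF for CPU performance analysis, when I turned to learning GPU performance optimization and analysis, I kept wondering whether any technology on the GPU also supports user-defined, probe-based performance analysis. Learning NVIDIA Nsight ...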

Sina (新浪网)

DR. KERNEL from HKUST and Collaborators: A Reinforcement Learning Training System That Enables AI to Generate High-Performance GPU Code

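This research, conducted jointly by the Hong Kong University of Science and Technology, ByteDance, The Chinese University of Hong Kong (Shenzhen), and Nanyang Technological University, was published in 2026. The team developed a complete training system that teaches large language models to write high-performance GPU kernel code. This breakthrough work is the first to systematically address using reinforcement learning to train AI models to write kernels ...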

Nature

Performance Tuning and Auto-Tuning of Algorithms for GPU Kernels

The optimisation of GPU kernels through performance tuning and auto-tuning approaches has become essential in maximising computational efficiency on modern heterogeneous architectures. Researchers ...

Tencent News (腾讯网)

How Does a GPU Actually Work? This AI Infra Primer Tells You Everything

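With AI booming, have you ever wondered how a large-model inference service actually runs? While such a service is running, which tasks do the CPU and the GPU each handle? Is a GPU always faster than a CPU? Which scenarios call for a GPU? The GPU's original mission was to accelerate graphics rendering. Rendering a single frame essentially means taking millions of pixels and performing similar ...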

Linux Journal

Intel Expands Linux Graphics Team to Boost Drivers and Gaming Support

Intel’s renewed effort to hire Linux graphics developers is a positive sign for the open-source community. By expanding its Linux GPU team and focusing on both gaming and high-performance workloads, ...

VentureBeat

TTT-Discover optimizes GPU kernels 2x faster than human experts — by training during ...

Researchers from Stanford, Nvidia, and Together AI have developed a new technique that can discover new solutions to very complex problems. For example, they managed to optimize a critical GPU kernel ...

Electronic Design

Programming The CUDA Architecture: A Look At GPU Computing

Graphics processing units (GPUs) were originally designed to perform the highly parallel computations required for graphics rendering. But over the last couple of years, they’ve proven to be powerful ...
