HeadlinesBriefing.com

AMD Unveils ROCm 7.2 for AI Workloads

TechPowerUp

AMD rolled out ROCm 7.2 on Jan 23, targeting AI and HPC workloads. The update improves GEMM tuning in hipBLASLt, adds FP8/FP4 support in rocMLIR and MIGraphX, and introduces topology‑aware communication via GDA and RCCL. These changes target higher throughput and lower latency for data‑center AI workloads.
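To give a sense of what the new FP8 support means in practice, the sketch below round‑trips values through the e4m3 encoding commonly used for FP8 inference (1 sign bit, 4 exponent bits, 3 mantissa bits). This is a standalone illustration of the number format, not ROCm, rocMLIR, or MIGraphX API code; the function name is hypothetical.

```python
# Illustrative sketch: nearest-value quantization to the FP8 e4m3fn format
# (1 sign bit, 4 exponent bits, 3 mantissa bits, max finite value 448).
# Real FP8 kernels live inside rocMLIR/MIGraphX; this only shows the precision.

def _e4m3_values():
    vals = [0.0]
    # Subnormals: exponent field 0, value = m * 2^-9 for m in 1..7
    for m in range(1, 8):
        vals.append(m * 2.0 ** -9)
    # Normals: exponent field 1..15; e4m3fn reserves only exponent 15
    # with mantissa 7 for NaN, so the max finite value is 2^8 * 1.75 = 448
    for e in range(1, 16):
        for m in range(8):
            if e == 15 and m == 7:
                continue  # NaN encoding in e4m3fn
            vals.append(2.0 ** (e - 7) * (1 + m / 8))
    return sorted(vals)

_E4M3 = _e4m3_values()

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest representable e4m3 value, clamping to +/-448."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)
    nearest = min(_E4M3, key=lambda v: abs(v - mag))
    return sign * nearest

print(quantize_e4m3(0.1))    # -> 0.1015625 (about 1.5% error at this magnitude)
print(quantize_e4m3(500.0))  # -> 448.0 (clamped to the e4m3 max)
```

The coarse 3‑bit mantissa is why FP8 paths pair quantization with per‑tensor scaling in practice: the format trades precision for throughput and memory bandwidth, which is the appeal for inference on MI300‑class hardware.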

Optimizations focus on MI300X and MI350 GPUs, delivering measurable gains over ROCm 7.1. New SR‑IOV and RAS features harden multi‑tenant deployments, while Node Power Management improves multi‑GPU efficiency. Together, these enhancements aim to make AMD's stack more production‑ready for cloud and enterprise AI inference at scale.

With these updates, developers can train models like GLM‑4.6 and Llama 2 faster, and distributed training scales across 4‑NIC topologies. AMD's push toward low‑precision formats and smarter communication positions its stack against NVIDIA's CUDA ecosystem. Next steps include deeper compiler optimizations and broader support for emerging AI workloads, including generative models.