THANK YOU to the developers, engineers, educators, and attendees who visited our booth at PyTorch Conference 2025 this week. We showcased new open-source AI tools that are expanding possibilities in a multi-accelerator world: https://ibm.co/6041BzG07

▶️ A key highlight was our work with AMD and Red Hat to build Triton-based kernels for vLLM, enabling efficient, hardware-agnostic inference across GPU platforms without proprietary libraries. This effort, presented by Thomas Parnell and Aleksandr M., strengthens support for open, extensible systems in the PyTorch and vLLM communities.
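For readers curious what "Triton-based" means in practice: Triton kernels are written once in Python and JIT-compiled for whatever GPU is present, NVIDIA or AMD alike, with no vendor-specific libraries in the source. Below is a minimal illustrative sketch (a toy vector add in the standard Triton style, not one of the actual vLLM kernels):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the inputs.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The same source compiles and runs on both CUDA and ROCm devices, which is what makes kernels like these a path to hardware-agnostic vLLM inference.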
▶️ Researcher Linsong Chu and his team shared a major training milestone: using torchtitan, a PyTorch-native training framework, they successfully trained one of the first Llama 3-70B-derived models from an open repository, achieving comparable quality with just one-third of the original training budget, half the token count, and FP8 low-precision quantization.
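On the FP8 piece: torchtitan's low-precision training path builds on PyTorch-native float8 support, such as the torchao APIs. Here is a hedged sketch of the core idea; the exact torchtitan configuration differs, and FP8 matmuls require recent hardware (for example, H100-class GPUs):

```python
import torch
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

# Toy stand-in for a transformer block; dims kept multiples of 16, as FP8 requires.
model = nn.Sequential(
    nn.Linear(4096, 4096, bias=False),
    nn.GELU(),
    nn.Linear(4096, 4096, bias=False),
).to("cuda")

# Swap eligible nn.Linear modules for float8 variants: matmuls execute in FP8
# with dynamic scaling, while weights and optimizer state stay in higher precision.
convert_to_float8_training(model)
model = torch.compile(model)  # compiling fuses away much of the casting overhead

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
x = torch.randn(8, 4096, device="cuda")
loss = model(x).float().pow(2).mean()  # dummy objective for illustration
loss.backward()
optimizer.step()
```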
▶️ Last, but certainly not least, IBM Research distinguished engineer Mudhakar Srivatsa spotlighted our adoption of vLLM and torch.compile to integrate emerging accelerators like the IBM Spyre AI accelerator. IBM Research is working on a Spyre backend compiler and a vLLM plugin with paged attention, with the goal of boosting memory efficiency and scalability for LLM inference.
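On the plugin mechanism: vLLM can discover out-of-tree hardware backends through Python entry points, so a new accelerator can slot in without forking the engine. A hedged sketch of that pattern is below; spyre_plugin and SpyrePlatform are hypothetical stand-ins for illustration, not IBM's actual package:

```python
# In the plugin package's pyproject.toml, an entry point in the
# "vllm.platform_plugins" group points vLLM at a register() function:
#
#   [project.entry-points."vllm.platform_plugins"]
#   spyre = "spyre_plugin:register"   # hypothetical package name

def register() -> str | None:
    """Called by vLLM at startup; returns the platform class to activate."""
    try:
        import spyre_plugin.platform  # noqa: F401  (hypothetical module)
        # Fully qualified name of a Platform subclass implementing the backend.
        return "spyre_plugin.platform.SpyrePlatform"
    except ImportError:
        # Hardware or runtime not available: return None and vLLM will fall
        # back to its other supported platforms.
        return None
```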
Follow the link above for a full recap of #PyTorchCon and how we're expanding AI model training and inference for the open-source community.