December 19, 2023
Understanding GPU Memory 2: Finding and Removing Reference Cycles
This is part 2 of the Understanding GPU Memory blog series. Our first post, Understanding GPU Memory 1: Visualizing All Allocations over Time, shows how to use the Memory Snapshot tool. In this part, we use the Memory Snapshot to visualize a GPU memory leak caused by reference cycles, then locate and remove those cycles in our code using the Reference Cycle Detector.
December 15, 2023
Empowering Models with Performance: The Art of Generalized Model Transformation Approach
December 14, 2023
Understanding GPU Memory 1: Visualizing All Allocations over Time
During your time with PyTorch on GPUs, you may be familiar with this common error message:
November 30, 2023
Accelerating Generative AI with PyTorch II: GPT, Fast
This post is the second part of a multi-part blog series focused on how to accelerate generative AI models with pure, native PyTorch. We are excited to share a breadth of newly released PyTorch performance features alongside practical examples that show how far we can push PyTorch native performance. In part one, we showed how to accelerate Segment Anything by over 8x using only pure, native PyTorch. In this post, we’ll focus on LLM optimization.