Developer Guides

Originally posted: April 15, 2020

Last updated: October 2, 2023

Technical blogs

Looking for knowledge beyond our software?

Explore our continually-growing library of technical blogs, written by AMD engineers and guest game developers. Benefit from their valuable experience covering general development techniques, developing with AMD hardware, ray tracing, HPC, ML, Vulkan®, DirectX®, Unreal Engine®, and lots more.

Don’t forget – you can find blog posts related specifically to our tools, SDKs, and effects in our software blogs.

Browse through all our technical blogs in one place

Page ID	Blog title	Description	Originally posted	Author	Category
~ID-000140	Optimized Reversible Tonemapper for Resolve	Optimized tonemapper form of the technique Brian Karis talks about on Graphics Rants: Tone mapping. Replace the luma computation with max3(red,green,blue).	26th January 2016	Timothy Lottes	Developer guides
~ID-000237	Getting the Most Out of Delta Color Compression	DCC is a domain-specific compression that tries to take advantage of data coherence. It’s lossless, and adapted for 3D rendering. The key idea is to process whole blocks instead of individual pixels.	14th March 2016	Chris Brennan	Developer guides
~ID-001014	Fetching From Cubes and Octahedrons	For GPU-side dynamically generated data structures which need 3D spherical mappings, two of the most useful mappings are cubemaps and octahedral maps. This post explores the overhead of both mappings.	4th February 2016	Timothy Lottes	Developer guides
~ID-001211	Maxing Out GPU usage in nBodyGravity	Asynchronous compute can help you to get the maximum GPU usage. I’ll be explaining the details based on the nBodyGravity sample from Microsoft.	26th January 2016	Matthäus Chajdas	Developer guides
~ID-001861	Understanding Memory Coalescing on GCN	An explanation of how GCN hardware coalesces memory operations to minimize traffic throughout the memory hierarchy.	21st March 2016	Timothy Lottes	Developer guides
~ID-002113	Vulkan® Renderpasses	Renderpasses are objects designed to allow an application to communicate the high-level structure of a frame to the driver.	16th February 2016	Graham Sellers	Developer guides
~ID-002362	Using the Vulkan® Validation Layers	Vulkan validation layers make it easier to catch any mistakes, provide useful information beyond basic errors and minimize portability issues.	9th March 2016	Daniel Rakos	Developer guides
~ID-002779	Unlock the Rasterizer with Out-of-Order Rasterization	GCN hardware supports a special out-of-order rasterization mode which relaxes the ordering guarantee, and allows fragments to be produced out-of-order.	17th May 2016	Matthäus Chajdas	Developer guides
~ID-002814	Using Vulkan® Device Memory	This post serves as a guide on how to best use the various Memory Heaps & Memory Types exposed in Vulkan on AMD drivers, starting with some high-level tips.	21st July 2016	Timothy Lottes	Developer guides
~ID-002901	GCN Shader Extensions for Direct3D® and Vulkan®	One of the mandates of GPUOpen is to give developers better access to the hardware, and this post details extensions for Vulkan and Direct3D12 that expose additional GCN features to developers.	24th May 2016	Matthäus Chajdas	Developer guides
~ID-002904	Fast Compaction with mbcnt	With shader extensions, we provide access to a much better tool to get compaction done: GCN provides a special op-code for compaction within a wavefront.	20th May 2016	Matthäus Chajdas	Developer guides
~ID-003532	The Art of AMDGCN Assembly: How to Bend the Machine to Your Will	This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture.	29th June 2016	Ben Sander	Developer guides
~ID-003720	Texel Shading	Game engines do most of their shading work per-pixel or per-fragment. But there is another alternative that has been popular in film for decades…	21st July 2016	Karl Hillesland	Developer guides
~ID-003855	Anatomy Of The Total War Engine: Part I	Tamas Rabel, Lead Graphics Programmer on the Total War series, provides a detailed look at the Total War renderer and digs deep into some of the optimizations that the team at Creative Assembly made for the brilliant Total War: Warhammer.	27th July 2016	Tamas Rabel	Developer guides
~ID-003859	Vulkan® and DOOM	This post takes a look at the interesting bits of helping id Software with their DOOM Vulkan effort, from the perspective of AMD’s Game Engineering Team.	10th November 2016	Timothy Lottes	Developer guides
~ID-003919	Anatomy Of The Total War Engine: Part II	Tamas Rabel from Creative Assembly discusses how performance was measured with the Total War Engine.	3rd August 2016	Tamas Rabel	Developer guides
~ID-003953	AMD GCN Assembly: Cross-Lane Operations	Cross-lane operations are an efficient way to share data between wavefront lanes. This article covers in detail the cross-lane features that GCN3 offers.	10th August 2016	Ben Sander	Developer guides
~ID-004082	Anatomy Of The Total War Engine: Part III	Here’s Tamas Rabel again with some juicy details about how Creative Assembly brought Total War to DirectX® 12.	10th August 2016	Tamas Rabel	Developer guides
~ID-004145	Anatomy Of The Total War Engine: Part IV	Tamas Rabel talks about how Total War: Warhammer utilized asynchronous compute to extract some extra GPU performance in DirectX® 12 and delves into the process of moving some of the passes in the engine to asynchronous compute pipelines.	16th August 2016	Tamas Rabel	Developer guides
~ID-004230	Anatomy Of The Total War Engine: Part V	The final instalment in Tamas Rabel’s insight into developing the Total War engine looks at Multi-GPU.	22nd August 2016	Tamas Rabel	Developer guides
~ID-004290	Using RapidFire for Virtual Desktop and Cloud Gaming	RapidFire SDK captures and encodes the input images entirely on the GPU and then copies the encoded result into the system memory for processing on the CPU.	27th September 2016	Bruno Stefanizzi	Developer guides
~ID-004423	Vulkan® Barriers Explained	Barriers control resource and command synchronisation in Vulkan applications and are critical to performance and correctness. Learn more here.	18th October 2016	Matthäus Chajdas	Developer guides
~ID-004487	AMD Driver Symbol Server	How to set up the AMD Driver Symbol Server in Visual Studio.	27th October 2016	Gareth Thomas	Developer guides
~ID-004567	Selecting the Best Graphics Device to Run a 3D Intensive Application	3D intensive application performance may suffer greatly if the best graphics device is not selected. As a developer you can easily fix this problem by adding only one line to your executable’s source code.	16th November 2016	Ken Mitchell	Developer guides
~ID-004755	Leveraging Asynchronous Queues for Concurrent Execution	Understanding concurrency (and what breaks it) is extremely important when optimizing for modern GPUs.	1st December 2016	Stephan Hodes	Developer guides
~ID-004824	Optimizing Terrain Shadows	One thing which is often forgotten is shadow map rendering. As the tessellation level of the terrain is not optimized for the shadow camera, but for the primary camera, this often results in a very strong mismatch and shadow maps end up getting extremely over-tessellated.	15th December 2016	Matthäus Chajdas	Developer guides
~ID-004861	Profiling video memory with Windows® Performance Analyzer	A guide to using the Windows Performance Analyzer tool, with a focus on video resources.	9th February 2017	Cristian Cutocheras	Developer guides
~ID-005536	Using Sub DWord Addressing on AMD GPUs	Sub DWord Addressing is a feature of the AMD GCN architecture which allows the efficient extraction of 8-bit and 16-bit values from a 32-bit register.	24th February 2017	Aditya Atluri	Developer guides
~ID-005567	Live VGPR Analysis with Radeon™ GPU Analyzer	This tutorial explains how to use Radeon GPU Analyzer (RGA) to produce a live VGPR analysis report for your shaders and kernels. Basic RGA usage knowledge is assumed.	21st March 2017	Amit Ben-Moshe	Developer guides
~ID-005928	CPU Core Count Detection on Windows®	Due to architectural differences between Zen and our previous processor architecture, Bulldozer, developers need to take care when using the Windows® APIs for processor and core enumeration.	14th September 2017	Ken Mitchell	Developer guides
~ID-005948	Content Creation Tools and Multi-GPU	mGPU isn’t just for gamers – if you’re a developer working on a game, you should think of using mGPU to make your life easier.	5th May 2017	Matthäus Chajdas	Developer guides
~ID-006013	Optimizing GPU occupancy and resource usage with large thread groups	Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.	24th May 2017	Sebastian Aaltonen	Developer guides
~ID-006359	Understanding Vulkan® Objects	An important part of learning the Vulkan® API is to understand what types of objects are defined by it, what they represent and how they relate to each other.	7th August 2017	Adam Sawicki	Developer guides
~ID-006483	Stable Barycentric Coordinates	The AMD GCN Vulkan extensions allow developers to get access to the barycentric coordinates at the fragment-shader level.	30th August 2017	Rys Sommefeldt	Developer guides
~ID-006564	First Steps When Implementing FP16	Half-precision (FP16) computation is a performance-enhancing GPU technology long exploited in console and mobile devices, but not previously used or widely available in mainstream PC development.	20th April 2018	Tom Hammersley	Developer guides
~ID-006609	Deferred Path Tracing By Enscape	Insights from Enscape as to how they designed a renderer that produces path traced real time global illumination and can also converge to offline rendered image quality.	6th December 2017	Thomas Schander	Developer guides
~ID-006686	Understanding GPU context rolls	Learn what a context roll on our GPUs is, how they apply to the pipeline and how they’re managed, and what you can do to analyse them and find out if they’re a limiting factor in the performance of your game or application.	29th June 2018	Rys Sommefeldt	Developer guides
~ID-006834	Reducing Vulkan® API call overhead	This guest post, by Arseny Kapoulkine from Roblox, looks at the costs associated with calling various Vulkan functions tens or hundreds of thousands of times per frame, and ways to bring them down.	26th April 2018	Arseny Kapoulkine	Developer guides
~ID-007019	Decoding Radeon™ Vulkan® versions	A guide to using our machine-readable mapping that you can integrate into your software for decoding Radeon™ Vulkan® versions.	2nd August 2018	Rys Sommefeldt	Developer guides
~ID-007193	Using Ryzen™ Threadripper for Game Development – optimising UE4 build times	Guest post by Sebastian Aaltonen, co-founder of Second Order. It covers optimising building the engine and asset production when using AMD Ryzen Threadripper processors.	17th December 2018	Sebastian Aaltonen	Developer guides
~ID-008277	Integrating AMD FidelityFX™ into the Ego Engine	Tom Hammersley from Codemasters talks about integrating FidelityFX into the Ego Engine and implementing Contrast Adaptive Sharpening (CAS).	18th December 2019	Tom Hammersley	Developer guides
~ID-013579	Integrating RenderDoc for Unconventional Apps	One of our engineers explains a few small code changes that can help you integrate RenderDoc for more unconventional applications.	20th July 2020	Matthäus Chajdas	Developer guides
~ID-013823	Porting Detroit: Become Human from PlayStation® 4 to PC – Part 3	The final part of this joint series with Quantic Dream discusses shader scalarization, async compute, multithreaded render lists, memory management using our Vulkan Memory Allocator (VMA), and much more.	25th September 2020	Lou Kramer	Developer guides
~ID-013997	Porting Detroit: Become Human from PlayStation® 4 to PC – Part 1	Porting the PS4® game Detroit: Become Human to PC presented some interesting challenges. This first part of a joint collaboration from engineers at Quantic Dream and AMD discusses the decision to use Vulkan® and talks shader pipelines and descriptors.	21st September 2020	Lou Kramer	Developer guides
~ID-013998	Porting Detroit: Become Human from PlayStation® 4 to PC – Part 2	Part 2 of this joint post between Quantic Dream and AMD looks at non-uniform resource indexing on PC and for AMD cards specifically.	23rd September 2020	Lou Kramer	Developer guides
~ID-016632	AMD Ryzen CPU Performance Guide	Design faster. Render faster. Iterate faster. Our one-stop resource for getting great AMD Ryzen performance.	20th April 2021	GPUOpen	Developer guides
~ID-018196	How to get the most out of Smart Access Memory (SAM)	Smart Access Memory (SAM) provides the CPU with direct access to all video memory. These guidelines help you to improve CPU and GPU performance using SAM.	15th June 2021	Oskar Homburg	Developer guides
~ID-020439	Vulkan’s Best Practice layer now has AMD-specific checks	Introducing AMD checks for the Vulkan® Best Practice validation layer! Find out more about how it now incorporates many of our performance suggestions.	2nd September 2021	Nadav Geva	Developer guides
~ID-021100	Understanding Graphs in Radeon GPU Profiler and GPUView	Find out how to read and understand graphs in Radeon GPU Profiler and GPUView in order to optimize your game more effectively.	3rd December 2021	Adam Sawicki	Developer guides
~ID-024955	Integrating VRS in The Riftbreaker	EXOR Studios and AMD have collaborated to add Variable Rate Shading in The Riftbreaker. Read this guest blog to find out more!	13th May 2022	GPUOpen	Developer guides
~ID-025067	The “why” of multi-resolution geometric representation using Bounding Volume Hierarchy for ray tracing	The benefits of the level-of-detail technique for ray tracing are not trivial. This blog explores the issues, giving the rationale for our new technique.	9th May 2022	Takahiro Harada	Developer guides
~ID-030236	AMD matrix cores (amd-lab-notes)	This first post in the ‘AMD lab notes’ series takes a look at AMD’s Matrix Core technology and how best to use it to speed up your matrix operations.	14th November 2022	amd-lab-notes	Developer guides
~ID-030237	Finite difference method – Laplacian part 1 (amd-lab-notes)	The finite difference method is a powerful tool for computational physics. This post covers how to implement a GPU-accelerated finite difference code using AMD’s HIP API.	14th November 2022	amd-lab-notes	Developer guides
~ID-036749	Finite difference method – Laplacian part 2 (amd-lab-notes)	In this post we introduce two common optimizations that can be applied to the kernel to reduce data movement and bring us closer to the new peak: loop tiling to explicitly reduce memory loads and re-order the memory access pattern to improve caching.	4th January 2023	amd-lab-notes	Developer guides
~ID-037120	AMD RDNA™ Performance Guide	Our one-stop resource for getting great AMD RDNA™ performance on Vulkan® and DirectX®12 APIs!	22nd March 2023	RDNA Perf Guide	Developer guides
~ID-038825	AMD Instinct™ MI200 GPU memory space overview – amd-lab-notes	This post introduces commonly-used memory spaces, identifies what makes each memory space unique, and discusses some common use-cases for each space.	9th March 2023	amd-lab-notes	Developer guides
~ID-038937	Introduction to profiling tools for AMD hardware (amd-lab-notes)	This post gives an overview of AMD’s open source profiling tools, helping you diagnose bottlenecks and understand how your application is using the hardware.	12th April 2023	amd-lab-notes	Developer guides
~ID-039554	Finite Difference Method – Laplacian part 3 – amd-lab-notes	In this third part, we cover additional optimizations to fine tune the performance of the kernel, and introduce temporary files, register pressure, and occupancy.	11th May 2023	amd-lab-notes	Developer guides
~ID-039589	Register pressure in AMD CDNA2™ GPUs – amd-lab-notes	Register pressure of GPU kernels has a tremendous impact on performance. This post provides a practical demo on applying recommendations.	17th May 2023	amd-lab-notes	Developer guides
~ID-041858	Getting Started – GPU Work Graphs	Find out what you need to get started with Work Graphs for DirectX 12, including the software required, configuration, compiling, and more.	22nd June 2023	GPU Work Graphs	Developer guides
~ID-041859	GPU Work Graphs in Microsoft DirectX® 12	Our primer on GPU Work Graphs introduces this exciting new paradigm for graphics developers, which enables a live shader kernel to dispatch new workloads on-demand without needing to circle back around to the CPU first.	22nd June 2023	GPU Work Graphs	Developer guides
~ID-041860	Building Blocks – GPU Work Graphs	Build upon what you learned in Part 1 of Work Graphs with topics such as input and output records, SV_DispatchGrid, NodeLaunch modes, and recursion.	22nd June 2023	GPU Work Graphs	Developer guides
~ID-041861	Tips and Tricks – GPU Work Graphs	Our final part in our GPU Work Graphs primer shares tips and tricks, and links to where you can find out more.	22nd June 2023	GPU Work Graphs	Developer guides
~ID-043432	Pre-multiplication, left-handed coordinate system as in DirectX® 9 – Matrix Compendium	GPUOpen Matrix Compendium: This page shows a selection of matrices in the coordinate system expected by DirectX® 9.	5th April 2023	Matrix Compendium	Developer guides
~ID-043433	Introduction – Matrix Compendium	The GPUOpen Matrix Compendium covers how matrices are used in 3D graphics and implementations in host code and shading languages. It’s a growing guide, so keep checking back!	5th April 2023	Matrix Compendium	Developer guides
~ID-043434	Pre-multiplication, right-handed coordinate system – Matrix Compendium	GPUOpen Matrix Compendium: This page shows a selection of matrices in a pre-multiplication, right-handed coordinate system.	5th April 2023	Matrix Compendium	Developer guides
~ID-043435	Post-multiplication, right-handed coordinate system as in OpenGL® – Matrix Compendium	GPUOpen Matrix Compendium: This page shows a selection of matrices in the coordinate system expected by OpenGL®.	5th April 2023	Matrix Compendium	Developer guides
~ID-043436	Post-multiplication, left-handed coordinate system – Matrix Compendium	GPUOpen Matrix Compendium: This page shows a selection of matrices in a post-multiplication, left-handed coordinate system.	5th April 2023	Matrix Compendium	Developer guides
~ID-044800	Effective Use of the New D3D12_HEAP_TYPE_GPU_UPLOAD	The D3D12_HEAP_TYPE_GPU_UPLOAD flag in Direct3D 12 provides a good alternative to other ways of uploading data from the CPU to the GPU. Check out our quick guide to effective use of this flag.	17th July 2023	GPUOpen	Developer guides
~ID-044925	Finite difference method – Laplacian part 4 – AMD lab notes	In the fourth and final part of the Finite Difference Laplacian blog series, we cover scaling studies and cache size limitations.	18th July 2023	amd-lab-notes	Developer guides
~ID-045076	New Work Graphs sample & RGP support for GPU Work Graphs	“D3D12SimpleClassify” shows the use of a GPU Work Graph in a simple frame-based graphics application, plus learn about new RGP support.	29th August 2023	GPUOpen	Developer guides
~ID-045459	CPU profiling for Unity	This is a general guide focusing on CPU profiling for Unity, including which tools are useful for profiling and how to use these tools to find hotspots in your code.	5th January 2024	GPUOpen	Developer guides
~ID-046795	Creating a PyTorch/TensorFlow Code Environment on AMD GPUs – AMD lab notes	The machine learning ecosystem is growing rapidly, and this article is designed to help data scientists and ML practitioners get their machine learning environments up and running on AMD GPUs.	11th September 2023	amd-lab-notes	Developer guides
~ID-053039	Work graphs API – compute rasterizer learning sample	Learn more about the power of work graphs API in our detailed blog, taking you step-by-step through an example which implements a scanline rasterizer.	13th October 2023	GPUOpen	Developer guides
~ID-053450	Sparse matrix vector multiplication – part 1 – AMD lab notes	Sparse matrix vector multiplication (SpMV) is a core computational kernel of nearly every implicit sparse linear algebra solver. This is the first post in the series covering SpMV.	3rd November 2023	amd-lab-notes	Developer guides
~ID-053666	How do I become a graphics programmer? – A small guide from the AMD Game Engineering team	It is often difficult to know where to start when taking your first steps in the world of graphics. This guide is here to help with a discussion of first steps and a list of useful websites.	22nd November 2023	GPUOpen	Developer guides
~ID-053703	Occupancy explained	In this blog post we will try to demystify what exactly occupancy is, which factors limit occupancy, and how to use tools to identify occupancy-limited workloads.	20th December 2023	GPUOpen	Developer guides
~ID-055299	Unreal Engine performance guide	Our one-stop guide to performance with Unreal Engine.	14th December 2023	GPUOpen	Developer guides
~ID-055680	From vertex shader to mesh shader – Mesh shaders on AMD RDNA™ graphics cards	This post is the start of a new series which aims to demystify mesh shaders through examples and tutorials.	19th December 2023	mesh_shaders	Developer guides
~ID-055681	Optimization and best practices – Mesh shaders on AMD RDNA™ graphics cards	The second post in this series on mesh shaders covers best practices for writing mesh and amplification shaders, as well as how to use the AMD Radeon™ Developer Tool Suite to profile and optimize mesh shaders.	16th January 2024	mesh_shaders	Developer guides
~ID-055682	Mesh shaders on AMD RDNA™ graphics cards	This blog series provides detailed explanations, analysis, use-case examples, tutorials, and advice about mesh shading.	19th December 2023	mesh_shaders	Developer guides
~ID-056612	Font and vector-art rendering with mesh shaders – Mesh shaders on AMD RDNA™ graphics cards	The third post in our mesh shaders series covers how to use mesh shaders to simplify font rendering.	13th March 2024	mesh_shaders	Developer guides
~ID-056763	Procedural grass rendering – Mesh shaders on AMD RDNA™ graphics cards	The fourth post in our mesh shaders series takes a look at the specific example of rendering detailed vegetation.	20th March 2024	mesh_shaders	Developer guides
~ID-056991	Affinity part 1 – Affinity, placement, and order – AMD lab notes	This first part introduces the concept of affinity and why it’s important for achieving better performance on AMD GPU nodes.	17th April 2024	amd-lab-notes	Developer guides
~ID-056992	Affinity part 2 – System topology and controlling affinity – AMD lab notes	This second part introduces common tools to understand the topology of your system and to control affinity for different applications.	17th April 2024	amd-lab-notes	Developer guides
~ID-057250	Preface to the CPU performance optimization guide	This article starts a series of posts about CPU performance analysis and optimization methods.	18th June 2024	CPU Performance Optimization Guide	Developer guides
~ID-057251	CPU performance optimization guide – part 1 – branch prediction		18th June 2024	CPU Performance Optimization Guide	Developer guides
~ID-063162	CPU performance optimization guide – part 2 – cache invalidation		18th November 2024	CPU Performance Optimization Guide	Developer guides
~ID-064452	CPU performance optimization guide – part 4	Optimize CPU performance by manually writing x64 assembly code, offering a detailed comparison with compiler-generated instructions and achieving improved performance through streamlined instruction sets.	25th March 2025	CPU Performance Optimization Guide	Developer guides
~ID-064456	CPU performance optimization guide – part 3	We look at optimizing CPU performance by reducing the number of instructions, and highlight methods to enhance instruction efficiency and algorithm throughput.	25th March 2025	CPU Performance Optimization Guide	Developer guides

We have lots more documentation for you to discover!

Don’t miss our manual documentation! And if slide decks are what you’re after, you’ll find 100+ of our finest presentations here.

Browse all our useful samples. Perfect for when you need to get started, want to integrate one of our libraries, and much more.

Words not enough? How about pictures? How about moving pictures? We have some amazing videos to share with you!

The home of great performance and optimization advice for AMD RDNA™ 2 GPUs, AMD Ryzen™ CPUs, and so much more.

Our handy software release blogs will help you make good use of our tools, SDKs, and effects, as well as sharing the latest features with new releases.