AMD Heterogeneous CPU Design Topology Patches Coming For Linux 6.13 – Phoronix
AMD has been steadily pushing heterogeneous CPU design topologies, starting with Zen 3’s 3D V-Cache and culminating in the “Zen 4” Ryzen 7000 series, which introduced the “3D Stacked Die” and “Zen 4c” designs. To better handle these architectures, Linux 6.13 is gaining topology-aware scheduling improvements aimed at performance.
The initial Zen 3 3D V-Cache, while the first of its kind to reach mainstream availability, added relatively little topological complexity: it stacked extra L3 cache onto the existing chiplet. The “3D Stacked Die” technology, however, changes the system’s topology outright and requires a new approach from the Linux kernel. Pioneered by AMD, it stacks multiple processor cores directly atop one another on the same die to boost performance. That structure poses a distinct scheduling challenge: each core has its own timing properties, so more intricate scheduling strategies are needed to keep performance efficient.
To meet this challenge, the Linux kernel has been steadily gaining support for 3D Stacked Die systems, with a focus on their unique latency profile. Kernel developers have added a new scheduling topology called “amd_stack_die” within the “topology_domain_data” framework to explicitly handle the performance characteristics of the 3D Stacked Die architecture. The current approach groups processors into four “stacks”, one per layer within a single processor die, and each stack contains its own processor cores. The implementation exposes “topology_domain” information for identifying and handling those layers.
A dedicated scheduler class named “topology_domain_sched” has also been developed for 3D Stacked Die topologies. Its primary responsibility is tracking task migration based on the layer information associated with each core in the stacked die. Within the scheduler framework, new “topology_domain_cpu” functions extend the domain API to facilitate data sharing within a particular domain for more effective workload management. The current implementation prioritizes assigning processes to the highest-performing layer within the die in order to maximize system performance.
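As a rough illustration of the placement policy described above, here is a minimal Python sketch of four stacks, each holding its own cores, with new tasks going to the highest-performing layer that still has an idle core. This is a toy user-space model and not kernel code: the names Stack, build_example_stacks, pick_cpu and place_task, the two-CPUs-per-layer layout, and the performance ranking (layer 0 assumed fastest) are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Stack:
    """One layer of the stacked die (hypothetical model, not a kernel structure)."""
    layer: int
    perf_rank: int                      # higher means faster; values are assumed
    cpus: list[int]
    busy: set[int] = field(default_factory=set)

    def idle_cpus(self) -> list[int]:
        """CPUs in this layer that currently have no task assigned."""
        return [cpu for cpu in self.cpus if cpu not in self.busy]


def build_example_stacks() -> list[Stack]:
    """Four layers with two CPUs each; layer 0 is assumed to be the fastest."""
    return [
        Stack(layer=i, perf_rank=4 - i, cpus=[2 * i, 2 * i + 1])
        for i in range(4)
    ]


def pick_cpu(stacks: list[Stack]) -> Optional[int]:
    """Return an idle CPU from the best-ranked layer that has one, else None."""
    for stack in sorted(stacks, key=lambda s: s.perf_rank, reverse=True):
        idle = stack.idle_cpus()
        if idle:
            return idle[0]
    return None


def place_task(stacks: list[Stack]) -> Optional[int]:
    """Pick a CPU with pick_cpu() and mark it busy; None if every CPU is taken."""
    cpu = pick_cpu(stacks)
    if cpu is None:
        return None
    for stack in stacks:
        if cpu in stack.cpus:
            stack.busy.add(cpu)
    return cpu
```

In this model, tasks fill the fastest layer first and only spill over to slower layers once it is full. A real scheduler would also weigh load, cache locality and migration cost.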
A related set of changes coming in Linux 6.13 includes further scheduling framework optimizations that should also benefit performance on the AMD Zen 4 “3D Stacked Die” and Zen 4c designs. These changes focus on improving how task affinity settings are implemented and used by the Linux scheduler on such modern architectures.
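For context on what task affinity looks like from user space, the sketch below restricts a process to a chosen set of CPUs using Python's os.sched_getaffinity and os.sched_setaffinity, which wrap the Linux affinity system calls. It does not show the Linux 6.13 changes themselves, only the kind of affinity mask the scheduler has to honor. The helper name pin_to_cpus is hypothetical, and the calls are only available on Linux.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict process `pid` (0 = the calling process) to the given CPUs.

    Only CPUs the process is already allowed to run on are kept, so the
    request cannot widen the current affinity mask.
    """
    allowed = os.sched_getaffinity(pid)
    target = set(cpus) & allowed
    if not target:
        raise ValueError("none of the requested CPUs are available to this process")
    os.sched_setaffinity(pid, target)
    # Read the mask back so the caller sees what the kernel actually applied.
    return os.sched_getaffinity(pid)
```

A caller could, for example, pass the CPU numbers belonging to one layer of the die to keep a latency-sensitive process on that layer. Which CPU numbers map to which layer is system-specific and is not covered here.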
On the performance side, “topology_domain_sched” and the other Linux 6.13 enhancements are likely to improve single-threaded performance when a workload is spread across the different layers of the processor. Other performance changes are possible as well, and the outcome will ultimately come down to how applications behave on such a complex heterogeneous CPU architecture.
More comprehensive coverage will come after the upcoming Linux 6.13 kernel is released in early October and the new functionality is properly tested with different applications and workloads. The Linux 6.13 merge window for this development branch is slated to end in just over two weeks. It will then follow the normal development cadence of about six weeks until the next stable release of the kernel is available.

