China bypasses US GPU bans with 1.54-exaflops 'LineShine' supercomputer — CPU-only monster packs 2.4 million Huawei-designed Armv9 cores
…The paper notes that sustaining high utilization of the SME matrix engines required extensive co-design of kernels, runtime scheduling, cache residency management, and tensor placement across the HBM and DDR hierarchy…