Automatic Core Specialization for AVX-512 Applications
-
Authors:
Mathias Gottschlag, Peter Brantsch, Frank Bellosa
-
Source:
SYSTOR 2020, 13th ACM International Systems and Storage Conference, Haifa, Israel, October 13-15, 2020
-
Abstract:
Advanced Vector Extensions (AVX) instructions operate on wide SIMD vectors. Due to the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Non-AVX code is slowed down by this frequency reduction in two situations: when it executes in parallel on the sibling hyperthread of the same core, or, because restoring the non-AVX frequency is delayed, when it directly follows the AVX2/AVX-512 code. As a result, heterogeneous workloads consisting of AVX-512 and non-AVX code are frequently slowed down by 10% on average.
In this work, we describe a method to mitigate the frequency reduction slowdown for workloads involving AVX-512 instructions in both situations. Our approach employs core specialization: it partitions the CPU cores into AVX-512 cores and non-AVX-512 cores, and only the former execute AVX-512 instructions, so the impact of potential frequency reductions is limited to those cores. To migrate threads to AVX-512 cores, we configure the non-AVX-512 cores to raise an exception when executing AVX-512 instructions. We use a heuristic to determine when to migrate threads back to non-AVX-512 cores. Our approach is able to reduce the frequency reduction overhead by 70% for an assortment of common benchmarks.
-
BibTeX:
@inproceedings{gottschlag20AVX-512,
author = {Gottschlag, Mathias and Brantsch, Peter and Bellosa, Frank},
title = {Automatic Core Specialization for AVX-512 Applications},
booktitle = {Proceedings of the 13th ACM International Systems and Storage Conference},
year = 2020,
month = oct
}
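-
Illustration:
The trap-and-migrate mechanism described in the abstract can be sketched as a small user-space simulation. This is not the authors' kernel implementation: the `Thread` and `Scheduler` classes, the core numbering, and the fixed `idle_ticks` timeout used as the migrate-back heuristic are all assumptions made for illustration, since the abstract does not specify the actual heuristic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    name: str
    core: int
    last_avx512_tick: Optional[int] = None  # tick of the most recent AVX-512 use


class Scheduler:
    """Toy model of core specialization (all names are hypothetical)."""

    def __init__(self, avx512_cores, other_cores, idle_ticks=3):
        self.avx512_cores = list(avx512_cores)
        self.other_cores = list(other_cores)
        # Assumed heuristic: migrate back after this many ticks without AVX-512.
        self.idle_ticks = idle_ticks
        self.tick = 0

    def execute(self, thread, uses_avx512):
        """Run one time slice and return the core the thread ends up on."""
        self.tick += 1
        if uses_avx512:
            if thread.core not in self.avx512_cores:
                # Models the exception raised on a non-AVX-512 core:
                # the thread is migrated to an AVX-512 core.
                thread.core = self.avx512_cores[0]
            thread.last_avx512_tick = self.tick
        elif (
            thread.core in self.avx512_cores
            and thread.last_avx512_tick is not None
            and self.tick - thread.last_avx512_tick >= self.idle_ticks
        ):
            # Heuristic fired: no AVX-512 use for a while, so migrate back.
            thread.core = self.other_cores[0]
        return thread.core


sched = Scheduler(avx512_cores=[3], other_cores=[0, 1, 2], idle_ticks=2)
worker = Thread("worker", core=0)
trace = [sched.execute(worker, avx) for avx in (False, True, False, False)]
print(trace)
```

In this sketch the thread stays on core 0 until it first uses AVX-512, moves to the dedicated core 3, and returns to core 0 once two ticks have passed without further AVX-512 use. A real implementation would also have to choose among several cores and account for migration cost.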