Automatic Core Specialization for AVX-512 Applications
- Author: Mathias Gottschlag, Peter Brantsch, Frank Bellosa
- Source: SYSTOR 2020, 13th ACM International Systems and Storage Conference, Haifa, Israel, October 13-15, 2020
- Abstract: Advanced Vector Extensions (AVX) instructions operate on wide SIMD vectors. Due to the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Subsequent non-AVX code is slowed down by this frequency reduction in two situations: when it runs in parallel on the sibling hyperthread of the same core, or, because restoring the non-AVX frequency is delayed, when it directly follows the AVX2/AVX-512 code. As a result, heterogeneous workloads consisting of AVX-512 and non-AVX code are frequently slowed down by 10% on average.
In this work, we describe a method to mitigate the frequency reduction slowdown for workloads involving AVX-512 instructions in both situations. Our approach employs core specialization: it partitions the CPU cores into AVX-512 cores and non-AVX-512 cores, and only the former execute AVX-512 instructions, so that the impact of potential frequency reductions is limited to those cores. To migrate threads to AVX-512 cores, we configure the non-AVX-512 cores to raise an exception when executing AVX-512 instructions. We use a heuristic to determine when to migrate threads back to non-AVX-512 cores. Our approach is able to reduce the frequency reduction overhead by 70% for an assortment of common benchmarks.
- Bibtex:
@inproceedings{gottschlag20AVX-512,
 author = {Gottschlag, Mathias and Brantsch, Peter and Bellosa, Frank},
 title = {Automatic Core Specialization for AVX-512 Applications},
 booktitle = {Proceedings of the 13th ACM International Systems and Storage Conference},
 year = 2020,
 month = may#
 }
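
The migration policy outlined in the abstract can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how the exception is handled or what the migrate-back heuristic is, so the names `Thread`, `CoreSpecializer`, and `idle_ticks_before_return`, as well as the simple "no AVX-512 for N ticks" timeout, are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    name: str
    core: int
    # Tick of the most recent AVX-512 instruction, or None if none was seen yet.
    last_avx512_tick: Optional[int] = None


class CoreSpecializer:
    """Toy model: only cores in `avx_cores` may execute AVX-512 code."""

    def __init__(self, num_cores, avx_cores, idle_ticks_before_return=3):
        self.avx_cores = sorted(avx_cores)
        self.plain_cores = [c for c in range(num_cores) if c not in avx_cores]
        if not self.avx_cores or not self.plain_cores:
            raise ValueError("need at least one core of each kind")
        # Hypothetical heuristic parameter (not taken from the paper).
        self.idle_ticks_before_return = idle_ticks_before_return
        self.migrations = []  # (tick, thread name, from core, to core, reason)

    def _migrate(self, thread, target, tick, reason):
        self.migrations.append((tick, thread.name, thread.core, target, reason))
        thread.core = target

    def execute(self, thread, tick, uses_avx512):
        if uses_avx512:
            if thread.core not in self.avx_cores:
                # Models the exception raised on a non-AVX-512 core.
                self._migrate(thread, self.avx_cores[0], tick, "trap")
            thread.last_avx512_tick = tick
        elif (
            thread.core in self.avx_cores
            and thread.last_avx512_tick is not None
            and tick - thread.last_avx512_tick >= self.idle_ticks_before_return
        ):
            # Assumed heuristic: return after a run of AVX-512-free ticks.
            self._migrate(thread, self.plain_cores[0], tick, "timeout")


# One worker that issues AVX-512 instructions at ticks 1 and 2 only.
spec = CoreSpecializer(num_cores=4, avx_cores={3})
worker = Thread("worker", core=0)
for tick, uses_avx512 in enumerate([False, True, True, False, False, False, False]):
    spec.execute(worker, tick, uses_avx512)
```

In this run the worker moves to core 3 at its first AVX-512 instruction and returns to core 0 once three ticks have passed without one, so any frequency reduction in the model stays confined to core 3.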
