Automatic Core Specialization for AVX-512 Applications

  • Authors:

    Mathias Gottschlag, Peter Brantsch, Frank Bellosa

  • Source:

    SYSTOR 2020, 13th ACM International Systems and Storage Conference, Haifa, Israel, October 13-15, 2020

  • Abstract:

    Advanced Vector Extensions (AVX) instructions operate on wide SIMD vectors. Due to the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Subsequent non-AVX code is slowed down by this frequency reduction in two situations: when it executes on the sibling hyperthread of the same core in parallel or, because restoring the non-AVX frequency is delayed, when it directly follows the AVX2/AVX-512 code. As a result, heterogeneous workloads consisting of AVX-512 and non-AVX code are frequently slowed down by 10% on average.
    In this work, we describe a method to mitigate the frequency reduction slowdown for workloads involving AVX-512 instructions in both situations. Our approach employs core specialization and partitions the CPU cores into AVX-512 cores and non-AVX-512 cores, and only the former execute AVX-512 instructions so that the impact of potential frequency reductions is limited to those cores. To migrate threads to AVX-512 cores, we configure the non-AVX-512 cores to raise an exception when executing AVX-512 instructions. We use a heuristic to determine when to migrate threads back to non-AVX-512 cores. Our approach is able to reduce the frequency reduction overhead by 70% for an assortment of common benchmarks.
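
    The migration scheme described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the core numbering, the `Thread` class, the `step` and `simulate` functions, and the migrate-back threshold are all hypothetical, and the simple "N quiet steps" rule merely stands in for the heuristic the abstract mentions without specifying.

```python
from dataclasses import dataclass

# Hypothetical partition of a 4-core CPU (assumption, not from the paper).
AVX512_CORES = {3}
NON_AVX512_CORES = {0, 1, 2}
# Assumed stand-in heuristic: migrate back after this many steps without AVX-512.
MIGRATE_BACK_AFTER = 2


@dataclass
class Thread:
    core: int
    quiet_steps: int = 0  # consecutive non-AVX-512 steps spent on an AVX-512 core


def step(thread: Thread, uses_avx512: bool) -> str:
    """Advance one scheduling step and return the action taken."""
    if uses_avx512:
        thread.quiet_steps = 0
        if thread.core in NON_AVX512_CORES:
            # Models the exception raised on a non-AVX-512 core:
            # the thread is moved to an AVX-512 core.
            thread.core = min(AVX512_CORES)
            return "fault->migrate_to_avx512"
        return "run"
    if thread.core in AVX512_CORES:
        thread.quiet_steps += 1
        if thread.quiet_steps >= MIGRATE_BACK_AFTER:
            # The thread has stopped using AVX-512: return it to a regular core.
            thread.core = min(NON_AVX512_CORES)
            thread.quiet_steps = 0
            return "migrate_back"
    return "run"


def simulate(trace, start_core=0):
    """Run a trace of booleans (True = step uses AVX-512); return (action, core) pairs."""
    thread = Thread(core=start_core)
    return [(step(thread, uses), thread.core) for uses in trace]
```

    For the trace `[False, True, True, False, False, False]`, the simulated thread starts on core 0, is moved to core 3 at its first AVX-512 step, and returns to core 0 after two consecutive steps without AVX-512.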

  • BibTeX:

    @inproceedings{gottschlag20AVX-512,
      author = {Gottschlag, Mathias and Brantsch, Peter and Bellosa, Frank},
      title = {Automatic Core Specialization for AVX-512 Applications},
      booktitle = {Proceedings of the 13th ACM International Systems and Storage Conference},
      year = 2020,
      month = oct
    }