ITEC - Operating Systems Group

Techniques for building a fast threads package on NUMA architectures

  • Author:

    Frank Bellosa

  • Source:

    University of Erlangen, Technical Report TR-I4-94-06, February 1994

  • Date: 02.1994
  • Abstract:

    Operating system abstractions do not always reach high enough for direct use by a language or application designer. The gap is filled by application-specific runtime environments. Typical arguments for their use include complete user-level control over thread scheduling and the ability to customize thread synchronization or communication constructs. On NUMA architectures in particular, an interface between scheduler and application is essential to overlap computation with memory transfer.

    We consider a nonpreemptive user-space threads package with an application interface. The application should be able to obtain information about the scheduling decisions of the runtime system so that it can invoke prefetch operations. Furthermore, the runtime system has to provide efficient machine-dependent code for creating, running and stopping threads. By separating the notion of execution (starting and stopping threads) from thread allocation and scheduling, changing scheduling policies can be as simple as using different function pointers and can be done efficiently at runtime. Thus the details of the threads package are not fixed, but can instead be tuned to the needs of the application. To implement this package we want to follow a two-level approach: the lower level consists of assembler code for fast thread initialization and context switching; the upper level is a toolbox for building application-specific schedulers and synchronization operations. The kernel threads provided by the operating system represent the “virtual processors” of the runtime system. This kind of threads package can only work efficiently if we use gang-scheduled kernel threads in a multiuser environment, or individually scheduled kernel threads in an environment with just one running application on each processor set.
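
    The separation of execution from scheduling policy described above can be illustrated with a minimal sketch. It is not the report's implementation: Python generators stand in for the assembler-level context switch, and the names `Scheduler`, `fifo_policy`, `lifo_policy`, `on_decision`, `worker` and `demo` are hypothetical, chosen only for this illustration. The policy is an ordinary callable (the analogue of a function pointer), and the `on_decision` hook is one assumed way an application could learn which thread runs next, for example to start a prefetch.

```python
def fifo_policy(ready):
    """Return the index of the thread that has been ready longest."""
    return 0


def lifo_policy(ready):
    """Return the index of the thread that became ready most recently."""
    return len(ready) - 1


class Scheduler:
    """Toy nonpreemptive scheduler; generators stand in for user-level threads."""

    def __init__(self, policy, on_decision=None):
        self.policy = policy            # plain callable, replaceable at any time
        self.on_decision = on_decision  # hypothetical application hook
        self.ready = []                 # list of (name, generator) pairs

    def spawn(self, name, func, *args):
        """Create a thread (allocation) without deciding when it runs."""
        self.ready.append((name, func(name, *args)))

    def run(self):
        """Dispatch ready threads until all of them have finished."""
        while self.ready:
            name, thread = self.ready.pop(self.policy(self.ready))
            if self.on_decision is not None:
                self.on_decision(name)  # an application could prefetch here
            try:
                next(thread)            # run until the thread yields voluntarily
            except StopIteration:
                continue                # thread finished; do not requeue it
            self.ready.append((name, thread))


def worker(name, steps, log):
    """Example thread body: record one entry per step, then yield the processor."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def demo(policy):
    """Run two workers under the given policy; return the work log and decisions."""
    log, decisions = [], []
    sched = Scheduler(policy, on_decision=decisions.append)
    sched.spawn("A", worker, 2, log)
    sched.spawn("B", worker, 2, log)
    sched.run()
    return log, decisions


if __name__ == "__main__":
    print(demo(fifo_policy))
    print(demo(lifo_policy))
```

    Because the dispatch loop only calls `self.policy`, assigning a different function to `sched.policy` between two dispatches changes the policy at the next scheduling decision without touching the thread-switching code. In a real package the `next(thread)` call would correspond to the machine-dependent context switch of the lower level.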

    A fast threads package on NUMA architectures is the prerequisite for an easy implementation of adaptive numerical methods on unstructured grids. A first approach to an implementation is given in the next section.

    Last but not least, a fast threads package can serve as the support library for a compiler that performs automatic parallelization.


      author = {Frank Bellosa},
      title = {Techniques for Building a Fast Threads Package on NUMA Architectures},
      booktitle = {Technical Report},
      number = {TR-I4-94-06},
      month = feb,
      year = 1994,
      affiliation = {University of Erlangen, Germany},
      url = {}