Besides CPU time, memory is one of the most central resources in a computing device. It is thus no coincidence that memory management is a critical task in operating systems. With the increasing diversity of memory types, the convergence of volatile and non-volatile memory technologies, and sophisticated mapping and caching mechanisms at various levels of the memory hierarchy, this holds today more than ever.
Non-volatile memory (NVM) is an emerging type of memory that stores data persistently but at the same time is byte-addressable and fast enough to be connected directly to the memory controller. We are researching the efficient use of NVM in applications as an extension of main memory.
When an OS allocates memory to a process, it implicitly performs long-term scheduling on DRAM resources such as channels and banks. The OS should be able to choose between sharing or dedicating resources dynamically – yet it cannot do that on conventional systems. With DRAM mapping aliases, we enable the OS to choose between channel interleaving and partitioning at run-time, at the granularity of address space (AS) segments.
We present GPUswap, a novel approach to enabling oversubscription of GPU memory that does not rely on software scheduling of GPU kernels. GPUswap uses the GPU’s ability to access system RAM directly to extend the GPU’s own memory. In contrast to software scheduling, where applications suffer from permanent overhead even with sufficient GPU memory available, our approach executes GPU applications with native performance.
Limited memory capacity and latency have become a primary bottleneck. Prior work has shown significant potential for memory deduplication: virtual machines often contain large amounts of redundant data, e.g., when they run similar operating systems or applications. Memory can be freed by collapsing redundant pages into a single page and sharing it in a copy-on-write fashion.