Fast Persistent Memory Crash Consistency Analysis based on Virtual Machines

  • Type: Bachelor Thesis
  • Date: 03.11.2023
  • Supervisor:

    Prof. Dr. Frank Bellosa
    Lukas Werling

  • Graduand: Thomas-Christian Oder
  • Links: PDF
  • Abstract
    Non-volatile memory (NVM) is a new technology that is integrated directly into the processor’s memory system. While NVM has higher latency than DRAM and is slower overall, it performs far better than regular SSDs. NVM can be used as a DRAM replacement as well as regular storage. When used as regular storage, it can be accessed directly by a Persistent Memory (PM) compatible application, or it can be abstracted by a file system into a regular storage device with a generic standard interface such as POSIX. New file systems like Nova and PMFS were developed to take full advantage of NVM’s design. Because of this difference in design, existing crash consistency testers for file systems, like CrashMonkey or Hydra, cannot test those file systems, and new solutions like Vinter had to be developed.
    Vinter is a record-and-replay black-box approach for testing PM file systems using manually written tests. While it already provides a quick test heuristic, Mumak, a black-box system for analyzing performance and crash consistency of PM applications, presented a different approach that promises similar results with better runtime performance when applied to Vinter. Mumak uses each trace entry’s kernel stack trace to generate a failure point tree. This tree is used to deduplicate trace entries by their kernel stack trace and, later on, to generate possible crash images. Mumak extends this process with a pattern-based trace analyzer that discovers potential performance and design bugs in the tested application.
    In this thesis, we extend Vinter’s crash image generator with this approach so that it generates crash images from the failure point tree.
    We also extend Vinter with an improved and standalone version of Mumak’s trace analyzer to provide an additional way to find bugs.
    We verified that the failure point tree approach substantially improves runtime while delivering similar, although not always identical, results when running our existing tests against Nova, Nova-Protection and PMFS.
    We were further able to show that the trace analyzer finds some of the bugs missed by the failure point tree approach, as well as additional bugs that were not previously covered at all.
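
    The deduplication step described above can be illustrated with a minimal sketch. It is not the thesis's or Mumak's implementation: it only assumes that every trace entry carries its kernel stack trace as a sequence of frames, stores these stacks in a prefix tree, and keeps the first trace entry seen for each distinct stack as a candidate failure point. The names Node and build_failure_point_tree and all frame names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Node:
    """One stack frame in the (hypothetical) failure point tree."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Index of the first trace entry whose full stack ends at this node.
    first_entry: Optional[int] = None


def build_failure_point_tree(
    trace: Sequence[Tuple[str, ...]],
) -> Tuple[Node, List[int]]:
    """Insert each entry's stack into a prefix tree.

    Returns the tree root and the indices of the trace entries that
    introduced a stack trace not seen before (the deduplicated set).
    """
    root = Node()
    failure_points: List[int] = []
    for index, stack in enumerate(trace):
        node = root
        for frame in stack:  # frames ordered outermost to innermost
            node = node.children.setdefault(frame, Node())
        if node.first_entry is None:  # first entry with this exact stack
            node.first_entry = index
            failure_points.append(index)
    return root, failure_points


# Hypothetical trace: one stack trace per recorded PM write.
trace = [
    ("sys_write", "fs_write", "copy_to_pmem"),
    ("sys_write", "fs_write", "copy_to_pmem"),    # duplicate stack
    ("sys_write", "fs_write", "update_log_tail"),
    ("sys_unlink", "fs_unlink", "update_log_tail"),
    ("sys_write", "fs_write", "copy_to_pmem"),    # duplicate stack
]
root, failure_points = build_failure_point_tree(trace)
print(failure_points)  # [0, 2, 3]
```

    In this sketch, crash images would only be generated at the three retained entries instead of at all five, which is where the runtime saving comes from. It also hints at why results may differ: entries that share a stack trace but lead to different persistent states are collapsed into one.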

    BibTeX:

    @bachelorthesis{oder23virtualmachines,
      author = {Thomas-Christian Oder},
      title = {Fast Persistent Memory Crash Consistency Analysis based on Virtual Machines},
      type = {Bachelor Thesis},
      year = 2023,
      month = nov # "03",
      school = {Operating Systems Group, Karlsruhe Institute of Technology (KIT), Germany}
      }