Light-Weight Remote Communication for High-Performance Cloud Networks

  • Subject: Cloud Computing
  • Type: Diploma Thesis
  • Date: 23.05.2012
  • Supervisor: Prof. Dr. Frank Bellosa, Dr. Jan Stoess
  • Graduand: Jens Kehne
  • Links: PDF
  • Abstract:

    Over the last decade, customers have increasingly chosen to host their computing services in the cloud rather than buy and maintain their own hardware, in order to save costs. Due to cost and power constraints, the data centers hosting these clouds are moving towards more balanced hardware designs than those currently in use. These new architectures employ "weaker" but more cost- and power-efficient CPU cores, as well as interconnects that can independently carry out work delegated by the CPU through features such as transport layer offloading, user-level I/O and remote DMA.

    Most current cloud platforms are based on commodity operating systems such as Linux or Microsoft Windows. While these operating systems can likely be ported to the CPU architectures of future clouds, their network stacks typically do not support the offloading, user-level I/O or remote DMA features that may be present in the network hardware. In particular, the Berkeley socket interface used by most commodity operating systems is largely incompatible with these features: socket I/O is synchronous and stream-based, whereas most offloading mechanisms, as well as remote DMA, are asynchronous and message-based by nature.

    In this thesis, we introduce LibRIPC, a light-weight communication library for cloud applications. LibRIPC significantly improves the performance of cloud networking without sacrificing flexibility. Instead of sockets, LibRIPC provides a message-based interface consisting of a short and a long send operation. Short sends provide low latency at the cost of a limited message size, while long sends use the hardware's remote DMA features to provide high bandwidth. Both send functions implement zero-copy data transfer in order to reduce both latency and overhead. Hardware-specific details, such as connection establishment or memory registration, are hidden behind library functions, providing applications with both ease of integration and portability. The library provides the flexibility needed by cloud applications by using a hardware- and location-agnostic addressing scheme to address communication endpoints. It resolves hardware-specific addresses dynamically, thus supporting application restarts and migrations.
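
    The interface described above can be illustrated with a small in-process toy model. This is only a sketch under stated assumptions and not LibRIPC's actual API: the names Endpoint, Directory, Messenger, send_short, send_long and invalidate, the SHORT_MAX limit, and the explicit cache invalidation after a migration are all hypothetical. A memoryview stands in for zero-copy transfer, and a dictionary lookup stands in for resolving a hardware-specific address.

```python
from dataclasses import dataclass, field

SHORT_MAX = 2048  # assumed size limit for short sends, in bytes (hypothetical)


@dataclass
class Endpoint:
    """Hypothetical stand-in for a service reachable at a hardware address."""
    hw_address: str
    inbox: list = field(default_factory=list)


class Directory:
    """Maps location-agnostic service IDs to the endpoint currently serving them."""

    def __init__(self):
        self._endpoints = {}
        self.lookups = 0  # counts how often the (expensive) resolution runs

    def register(self, service_id, endpoint):
        self._endpoints[service_id] = endpoint

    def resolve(self, service_id):
        self.lookups += 1
        return self._endpoints[service_id]


class Messenger:
    """Message-based sender with a short and a long send operation."""

    def __init__(self, directory):
        self._directory = directory
        self._cache = {}  # service ID -> resolved endpoint

    def _lookup(self, service_id):
        # Resolve lazily and cache the result of the expensive operation.
        if service_id not in self._cache:
            self._cache[service_id] = self._directory.resolve(service_id)
        return self._cache[service_id]

    def send_short(self, service_id, payload):
        # Short sends trade a limited message size for low latency.
        if len(payload) > SHORT_MAX:
            raise ValueError("payload too large for a short send")
        self._lookup(service_id).inbox.append(("short", memoryview(payload)))

    def send_long(self, service_id, payload):
        # Long sends have no size limit here; the memoryview hands over
        # a reference to the buffer instead of copying it.
        self._lookup(service_id).inbox.append(("long", memoryview(payload)))

    def invalidate(self, service_id):
        # Assumed mechanism: drop the cached address so the next send re-resolves.
        self._cache.pop(service_id, None)


directory = Directory()
old_host = Endpoint("hw-addr-A")
directory.register(7, old_host)

messenger = Messenger(directory)
messenger.send_short(7, b"ping")
messenger.send_long(7, bytearray(1 << 20))  # reuses the cached address

# The service migrates: same service ID, new hardware address.
new_host = Endpoint("hw-addr-B")
directory.register(7, new_host)
messenger.invalidate(7)
messenger.send_short(7, b"ping again")
```

    In this toy model the sender only ever names service ID 7. The hardware address is resolved once, reused for the second send, and resolved again after the simulated migration, so the application code does not change when the service moves.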

    We evaluate our approach by means of an InfiniBand-based prototype implementation and the Jetty web server, which we integrated into the Hadoop map/reduce framework. Our prototype leverages InfiniBand's hardware semantics to implement efficient data transfer, while compensating for InfiniBand's drawbacks by caching the results of expensive operations. We were able to integrate our library into Jetty efficiently; however, Hadoop would have required extensive modifications to its code base in order to make full use of our library's zero-copy semantics. Nonetheless, our results indicate that our library can significantly improve network throughput, latency and overhead.

    BibTeX:

    @diplomathesis{kehne12remotecommunication,
     author = {Jens Kehne},
     title = {Light-Weight Remote Communication for High-Performance Cloud Networks},
     type = {Diploma Thesis},
     address = {System Architecture Group, Karlsruhe Institute of Technology (KIT), Germany},
     month = may # "23",
     year = 2012,
     url = {http://os.ibds.kit.edu/}
     }