This repo contains an implementation of the Homa transport protocol as a Linux kernel module.
For more information on Homa in general, see the Homa Wiki.
More information about this implementation and its performance is available in the paper "A Linux Kernel Implementation of the Homa Transport Protocol," which appeared in the USENIX Annual Technical Conference in July 2021.
A synopsis of the protocol implemented by this module is available in protocol.md.
As of August 2020, Homa has complete functionality for running real applications, and its tail latency is more than 10x better than TCP's for all workloads I have measured (Homa's 99th-percentile latency is usually better than TCP's mean latency). Here is a list of the most significant functionality that is still missing:
Please contact me if you have any problems using this repo; I'm happy to provide advice and support.
The head is known to work under Linux 6.13.9. In the past, Homa has run under several earlier versions of Linux. There is a separate branch for each of these older versions, with names such as linux_4.15.18. Older branches are out of date feature-wise: recent commits have not been back-ported to them. Other versions of Linux have not been tested and may require code changes (these upgrades rarely take long). If you get Homa working on some other version, please submit a pull request with the required code changes.
Related work that you may find useful:
To build the module, type `make all`; then type `sudo insmod homa.ko` to install it, and `sudo rmmod homa` to remove an installed module. In practice, though, you'll probably want to do several other things as part of installing Homa. I have created a Python script that I use for installing Homa on clusters managed by the CloudLab project; it's in `cloudlab/bin/config`. I normally invoke it with no parameters to install and configure Homa on the current machine.
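For reference, a minimal build-and-load session looks like this (run from the top of the repo; the only addition beyond the text above is using `lsmod` to verify the load):

```
make all                 # build homa.ko
sudo insmod homa.ko      # load the module
lsmod | grep homa        # verify that it is loaded

# When you are done:
sudo rmmod homa          # unload the module
```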
The script `cloudlab/bin/install_homa` will copy relevant Homa files across a cluster of machines and configure Homa on each node. It assumes that nodes have names `nodeN`, where `N` is a small integer, and it also assumes that you have already run `make` both in the top-level directory and in `util`.
For best Homa performance, you should also make the following configuration changes:
- Use `sysctl` to configure Homa's use of priorities (e.g., if you want it to use fewer than 8 levels). See the man page `homa.7` for more info.
- NIC support for TSO: Homa can use TCP Segmentation Offload (TSO) in order to send large messages more efficiently. To do this, it uses a header format that matches TCP's headers closely enough to take advantage of TSO support in NICs. It is not clear that this approach will work with all NICs, but the following NICs are known to work:

  There have been reports of problems with the following NICs (these have not yet been explored thoroughly enough to know whether the problems are insurmountable):
  Please let me know if you find other NICs that work (or NICs that don't work). If the NIC doesn't support TSO for Homa, then you can request that Homa perform segmentation in software by setting the `gso_force_software` parameter to a nonzero value using `sysctl` (see the sketch after this list). Unfortunately, software segmentation is inefficient because it has to copy the packet data. Alternatively, you can ensure that the `max_gso_size` parameter is the same as the maximum packet size, which eliminates GSO in any form. This is also inefficient because it requires more packets to traverse the Linux networking stack.
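As a sketch of these knobs (this assumes Homa registers its parameters under `net.homa` as described in `homa.7`; the parameter name `num_priorities` and the interface name `eth0` are illustrative assumptions):

```
# Check whether the NIC has TSO enabled at all:
ethtool -k eth0 | grep tcp-segmentation-offload

# Configure Homa to use fewer priority levels (name assumed; see homa.7):
sudo sysctl net.homa.num_priorities=4

# If the NIC's TSO doesn't work with Homa, fall back to software segmentation:
sudo sysctl net.homa.gso_force_software=1

# Or eliminate GSO entirely by capping max_gso_size at the maximum
# packet size (1500 here is a placeholder for your MTU):
sudo sysctl net.homa.max_gso_size=1500
```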
A collection of man pages is available in the `man` subdirectory. The API for Homa is different from that of TCP sockets.
The subdirectory "test" contains unit tests, which you can run by typing "make" in that subdirectory.
The subdirectory "util" contains an assortment of utility programs that you may find useful in exercising and benchmarking Homa. Compile them by typing make
in that subdirectory. Here are some examples of benchmarks you might find useful:
- The `cp_node` program can be run stand-alone on clients and servers to run simple benchmarks. For a simple latency test, run `cp_node server` on node1 of the cluster, then run `cp_node client` on node0. The client will send continuous back-to-back short requests to the server and output timing information. Or, run `cp_node client --workload 500000` on the client: this will send continuous 500 KB messages for a simple throughput test. Type `cp_node --help` to learn about other ways you can use this program (a combined sketch appears after this list).
- The `cp_vs_tcp` script uses `cp_node` to run cluster-wide tests comparing Homa with TCP (and/or DCTCP); it was used to generate the data for Figures 3 and 4 in the Homa ATC paper. Here is an example command:

  `cp_vs_tcp -n 10 -w w4 -b 20`

  When invoked on node0, this will run a benchmark using the W4 workload from the ATC paper, running on 10 nodes and generating 20 Gbps of offered load (80% network load on a 25 Gbps network). Type `cp_vs_tcp --help` for information on all available options.
- Other `cp_` scripts can be used for different benchmarks. See `util/README.md` for more information.
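As promised above, here is a sketch of a simple benchmarking session (the node names follow the `nodeN` convention described earlier; everything else comes from the text above):

```
# On node1: start a cp_node server.
cp_node server

# On node0: latency test with continuous back-to-back short requests.
cp_node client

# On node0: throughput test with continuous 500 KB messages.
cp_node client --workload 500000

# On node0: cluster-wide Homa-vs-TCP comparison with the W4 workload,
# 10 nodes, and 20 Gbps of offered load.
cp_vs_tcp -n 10 -w w4 -b 20
```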
Some additional tools you might find useful:

- Homa's performance metrics are available in `/proc/net/homa_metrics`. The script `util/metrics.py` will collect metrics and print out all the numbers that have changed since its last run.
- Homa's configuration parameters can be read and set with `sysctl`; they are documented in the man page `homa.7`.
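A minimal sketch of the metrics tools (this assumes `util/metrics.py` can be invoked with no arguments, per the description above):

```
cat /proc/net/homa_metrics   # dump the raw metric counters
util/metrics.py              # print metrics that changed since the last run
```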