Over the last few weeks at Collabora, I have been working on putting D-Bus directly in the Linux kernel to improve its performance. I based my work on what Ian Molton did. For now it’s just a prototype, but the benchmarks I ran show a good performance improvement.
The cost of context switches
When an application sends a message on the bus to another application, the message is first sent to dbus-daemon through a Unix socket. The kernel copies the message to the receive queue of dbus-daemon, and dbus-daemon wakes up. Then dbus-daemon adds the sender field to the header of the message and sends it to the recipients according to the destination field and the match rules (usually one recipient, but there can be more for signals or in case of eavesdropping).
D-Bus message transmission with dbus-daemon
So dbus-daemon wakes up on every message, and each wake-up costs a context switch and a memory copy.
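The two hops described above can be sketched in a few lines of Python. This is only a toy model: the function name `relay_through_daemon` is hypothetical, the messages are JSON-encoded dictionaries rather than the real D-Bus wire format, and everything runs in a single process, so it shows the extra receive and send done by the intermediary but not an actual context switch.

```python
import json
import socket


def relay_through_daemon(message, sender_name):
    """Send `message` from one app to another via a stand-in for dbus-daemon.

    Hypothetical helper for illustration only: real D-Bus uses a binary
    wire format and a separate daemon process.
    """
    # One Unix socket pair per hop; datagrams preserve message boundaries.
    app_a, daemon_in = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    daemon_out, app_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Hop 1: the sending application writes to the daemon's socket.
        app_a.send(json.dumps(message).encode())

        # The daemon has to wake up, read the message and stamp the sender.
        received = json.loads(daemon_in.recv(65536).decode())
        received["sender"] = sender_name

        # Hop 2: the daemon forwards the message to the recipient.
        daemon_out.send(json.dumps(received).encode())
        return json.loads(app_b.recv(65536).decode())
    finally:
        for sock in (app_a, daemon_in, daemon_out, app_b):
            sock.close()
```

Every message is written twice and read twice here. The middle read and write are the work that a kernel-side implementation tries to avoid.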
Don’t wake up, dbus-daemon!
The idea of D-Bus in the kernel is to deliver the message directly to the right recipients without waking up any intermediary process. We added a new kind of socket, “AF_DBUS”, that behaves in a similar way to Unix sockets. Every application using D-Bus would need to use the new socket type, but that just means changing libdbus and the few other libraries that implement D-Bus.
The kdbus kernel module reads all the messages and checks for the “Hello”, “NameAcquired”, “NameLost” and “AddMatch” messages to learn the unique names, the well-known names and the match rules. It is then able to deliver messages only to the right applications and bypass dbus-daemon. If there are several recipients, the message is still copied only once thanks to skb_clone().
AF_DBUS does not wake up dbus-daemon unnecessarily
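To make the bookkeeping described above more concrete, here is a minimal Python sketch of the kind of state that can be learned by watching those four messages. It is not the kdbus code: the class `BusState`, its methods and the dictionary-based match rules are all assumptions made for illustration, and the real module works on the binary D-Bus wire format inside the kernel.

```python
from collections import defaultdict


class BusState:
    """Toy model of routing state learned by observing bus traffic."""

    def __init__(self):
        self.unique_names = set()             # e.g. ":1.42"
        self.owners = {}                      # well-known name -> unique name
        self.match_rules = defaultdict(list)  # unique name -> list of rules

    def observe(self, member, unique_name, arg=None):
        """Update the tables from one observed message (simplified)."""
        if member == "Hello":
            self.unique_names.add(unique_name)
        elif member == "NameAcquired":
            self.owners[arg] = unique_name
        elif member == "NameLost":
            if self.owners.get(arg) == unique_name:
                del self.owners[arg]
        elif member == "AddMatch":
            # A rule is modelled as a dict of header fields that must match.
            self.match_rules[unique_name].append(arg)

    def recipients(self, msg):
        """Return the unique names that should receive `msg`."""
        dest = msg.get("destination")
        if dest is not None:
            # Resolve a well-known name to its owner, if there is one.
            target = self.owners.get(dest, dest)
            return [target] if target in self.unique_names else []
        # No destination: deliver to every connection with a matching rule.
        return sorted(
            name
            for name, rules in self.match_rules.items()
            if any(all(msg.get(k) == v for k, v in rule.items()) for rule in rules)
        )
```

With tables like these, a message with a destination resolves to a single connection and a signal resolves to the connections whose rules match, without asking dbus-daemon.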
The prototype still uses dbus-daemon for authentication and D-Bus activation. The bus driver org.freedesktop.DBus is still implemented in dbus-daemon.
The first benchmark is dbus-ping-pong. It is a simple tool that calls a D-Bus method and waits for the reply, 10000 times. I tried it both on a KVM/i386 virtual machine and on an N900/arm.
- KVM/i386, without kdbus: 3.887s
- KVM/i386, with kdbus: 2.085s (x1.9)
- N900/arm, without kdbus: 28.00s
- N900/arm, with kdbus: 9.23s (x3)
I also tried Adrien Bustany’s benchmark tool, designed with Tracker in mind. My test is on KVM/i386 with #define CHUNK_SIZE 8 in order to have D-Bus messages of 6905 bytes. (The kdbus prototype has a limitation at the moment: messages bigger than SKB_MAX_ALLOC or 8kB are still delivered to dbus-daemon, so it does not improve performance for them.)
- KVM/i386, without kdbus: Fetched 300000 rows in 32874ms
- KVM/i386, with kdbus: Fetched 300000 rows in 24368ms (26% less time, x1.35)
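The 26% and x1.35 figures above describe the same measurement in two ways, and the arithmetic is easy to check. The helper names below are mine and are not part of any benchmark tool.

```python
def speedup(before_ms, after_ms):
    """How many times faster the new run is (old time / new time)."""
    return before_ms / after_ms


def time_saved_percent(before_ms, after_ms):
    """Share of the original run time that was saved, in percent."""
    return 100.0 * (before_ms - after_ms) / before_ms


# Tracker benchmark figures from the list above.
without_kdbus_ms = 32874
with_kdbus_ms = 24368

ratio = speedup(without_kdbus_ms, with_kdbus_ms)
saved = time_saved_percent(without_kdbus_ms, with_kdbus_ms)
```

Rounded, `ratio` is 1.35 and `saved` is 26: the run takes 26% less time, which is the same as being 1.35 times faster.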
I also tested how long an N900 takes to connect to Jabber and show my contacts’ presence on the desktop widgets. I measured the time manually but ran enough tests to be sure the results are consistent.
- N900/arm, without kdbus: avg 11.87s
- N900/arm, with kdbus: avg 10.56s (x1.12)
Keep in mind this is only a proof of concept. It is not ready for merging and has limitations, including security ones. However, I managed to use kdbus for both the system and session bus on an N900 and, with just a few hacks, everything worked fine.