Thursday, 24 February 2011

D-Bus traffic pattern with Telepathy

Telepathy is a big user of D-Bus, both on the GNOME desktop with Empathy and on the N900. When I have a lot of Telepathy accounts and contacts, the start-up time could be about 10 seconds. How much is the D-Bus communication responsible for the start-up time? When I tried the D-Bus in the kernel prototype on the N900, I win 1.3 seconds. It shows it is not negligible and there is room for improvements. But the slowness may not be due only to D-Bus itself but the way it is used. It is interesting to look at the D-Bus traffic pattern. This topic was already largely covered by Will’s  talk on Profiling and Optimizing D-Bus APIs but I want to give a preview on future D-Bus analysis tools. The existing D-Bus tools dbus-monitor, D-Feet and Bustle are still very useful of course.

A glance at D-Bus messages

The easiest to begin with is to record all the D-Bus traffic with dbus-monitor while I start Empathy, send a few messages, and quit. Let’s see if there is any surprise. dbus-monitor generated a 13MB text file containing 8313 D-Bus messages:

  • 2850 method calls
  • 761 methods returns
  • 4699 signals
  • 3 errors

The number of method calls and method returns are quite different! This is because dbus-monitor doesn’t record method returns from the bus driver (org.freedesktop.DBus, implemented by dbus-daemon). Indeed, my log file contains 2066 method calls to the bus driver:

Method calls for D-Bus match rules (AddMatch/RemoveMatch) generate a lot of traffic. The test was done with the upstream dbus-glib 0.88-2.1, without the patch on DBusGProxy (Bug #23846) to have per-member match rules. If that patch was applied, the number of D-Bus match rules could be even bigger.

If the D-Bus match rules were implemented with socket filters on multicast Unix sockets, AddMatch and RemoveMatch messages would be completely replaced by the system call setsockopt(), reducing the number of context switches from D-Bus peers to dbus-daemon. But the situation could be improved with the current dbus-daemon implementation: 628 AddMatch are match rules on NameOwnerChanged (48%), generated by dbus-glib, but there is actually only 25 unique match rules. The duplicate match rules are caused by Bug #33646 on DBusGProxy.

Performance depends on the size of message

Now, let’s have a look on the size of messages and the impact on performances. I can use dbus-ping-pong to estimate how long it takes to send a D-Bus message with a specific size. The following script starts the ping-pong messages between two D-Bus peers, with several message sizes, and I gathered the results in the next graph:

#!/bin/bash

./dbus-ping-pong server org.Echos &
PID=$!
echo "Server is PID=$PID"
sleep 1

for i in 1 3 10 31 100 316 1000 3162 10000 31622 100000 316227 1000000 3162277 10000000 ; do
  echo $i
  ./dbus-ping-pong client org.Echos 100 $i
done

kill $PID

Performance of dbus-ping-pong with various message sizes

Figure 1

I ran the test twice with different values of max_bytes_read_per_iteration, the size of the buffer used to read on the socket; the first time with 2048 bytes and the second time with 65536. When the read buffer is 32 times bigger, there is less recvmsg() system calls for big incoming D-Bus messages and it makes the test 28% faster.

Usual message sizes in Telepathy

Let’s see what are the usual sizes of messages in the context of Telepathy. I just run bustle-dbus-monitor with a patch to get the size of each message (Bug #34067).



Figure 2

Most messages are between 101 bytes and 316 bytes (in orange on the graph).

With piecewise linear interpolation (spreadsheet source) based on the results of dbus-ping-pong, I can approximate how long takes the transmission of each individual message, and then what size of messages are the bottleneck and would be worth optimizing.

It may not be a good estimation because there is more computation in the real Telepathy scenario than in the synthetic dbus-ping-pong scenario: the number of match rules in dbus-daemon is different, there is more connections, and the Telepathy processes are not only sending D-Bus messages so the CPU caches are not behaving the same way. The real time spent in D-Bus communication should be larger than what is computed here with piecewise linear interpolation. But it can still give an idea.



Figure 3

For messages with a small size (under 3000 bytes), the two histograms (figure 2 and figure 3) have the same shape because the ping-pong time is almost constant for those messages. For the larger messages, they take a bit more time than their distribution because the ping-ping time is getting bigger on those messages.

Messages between 101 bytes and 316 bytes are the ones causing the most delay in Telepathy start-up.

The Stats interface

We can look at those messages in more details with the new Stats interface in dbus-daemon. The script messages_stats.py creates a csv file from the Stats interface containing the peak size of messages, the peak number of recipients and number of each D-Bus method call and signal. With that data, I can check that no messages from Telepathy are received too broadly, and if they are, I could use the GetConnectionMatchRules patch to find the application which subscribes to a too broad match rule.

Most of the D-Bus messages generated by Telepathy have a reasonable number of recipients. Only 2 different D-Bus signals have more than 3 recipients, and it is only a few messages in total:

  • org.freedesktop.Telepathy.Connection.Interface.Aliasing.AliasesChanged (4 recipients)
  • org.freedesktop.Telepathy.Connection.Interface.SimplePresence.PresencesChanged (4 recipients)

Repartition of messages per number of recipients

Figure 4

The most sent D-Bus messages by Telepathy are:

  • org.freedesktop.Telepathy.Connection.Interface.Contacts.GetContactAttributes (17 calls)
  • org.freedesktop.Telepathy.Account.AccountPropertyChanged (11 signals)

I don’t see anything obviously bad here. Telepathy is a good D-Bus citizen!

No comments:

Post a Comment