
Summary and conclusions

To summarize the results of the three test sessions, a number of tables are presented in this section. Each table summarizes one aspect of the statistics. Calculating the average network consumption is the most important part, but several other aspects are also considered. A number of conclusions are also drawn from these summaries.

The average amount of incoming traffic for each of the three sessions is presented in Table 6.6, and the corresponding data for outgoing traffic is shown in Table 6.7.


Table 6.6: Gnutella sessions incoming bytes/minute
Session     Avg     Q0     Q1     Q2     Q3      Q4
First     60609  25947  48526  53304  59316  201019
Second    51928  32266  37748  43527  53797  152548
Third     35666  12910  24907  28687  35428  163427
Total     48754  12910  32473  41384  53931  201019




Table 6.7: Gnutella sessions outgoing bytes/minute
Session     Avg     Q0     Q1     Q2     Q3      Q4
First      4179   1647   2885   3871   5215    7503
Second     6975   1939   4874   6097   7564   28197
Third      4120   1062   3052   3974   5332    9784
Total      5040   1062   3436   4365   6093   28197



It is interesting to note that the bandwidth used by incoming messages decreases with each session conducted. The decrease is due to a large drop in incoming Pong messages. The difference between the first two sessions does not appear to be so large, but that is because many more Query messages were forwarded to us during the second session than during the other two. This makes the difference between sessions two and three even larger.

Another very interesting aspect of the Limewire Gnutella client is that the routing of Pong messages appears to be flawed. During session one, 41937 of 41956 incoming Pong messages were considered erroneous. For sessions two and three these numbers are 33346 of 33375 and 26912 of 27048 respectively. In all cases about 99.5% or more of all incoming Pong messages were discarded. If the same behaviour occurs throughout the Gnutella network, it results in a large waste of bandwidth.
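The discard rates quoted above can be rechecked with a few lines of Python. This is only a minimal sketch: the counts are the ones given in the text, while the names `pong_counts` and `discarded_share` are chosen here for illustration and do not come from the measurement tools used in the sessions.

```python
# (discarded, total) incoming Pong messages per session, as quoted in the text.
pong_counts = {
    "first": (41937, 41956),
    "second": (33346, 33375),
    "third": (26912, 27048),
}


def discarded_share(discarded, total):
    """Return the percentage of incoming Pong messages that were discarded."""
    return 100.0 * discarded / total


# Percentage of discarded Pong messages for each session.
shares = {name: discarded_share(d, t) for name, (d, t) in pong_counts.items()}

for name, share in shares.items():
    print(f"{name} session: {share:.2f}% of incoming Pongs discarded")
```

The third session has the lowest share, which is where the figure of roughly 99.5% comes from.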

The large number of bytes sent during session two is due to the larger number of incoming queries during that session. These queries led to the transmission of about twice as many bytes in QueryHit messages during session two as in the other sessions.

If all nodes of the Gnutella network are considered to have a similar amount of traffic it is easy to make some calculations on how much network bandwidth is actually used by the Gnutella protocol (not including any bandwidth used for downloading files). This assumption should not overestimate the total amount of traffic since ultrapeers use more bandwidth than leaves of the network. Also assume that the third session, which consumed the least bandwidth, is reasonably representative. The total bandwidth used per minute for each Gnutella host can then be calculated from the following:

H = 500 TCP/IP headers/minute
h = 40 bytes/header
i = 35000 incoming bytes/minute
o = 4000 outgoing bytes/minute

According to [LIM03a], which maintains statistics for the Gnutella network, a reasonable estimate of the number of hosts connected to the network at any given time appears to be n = 95000 ± 15000. If this number is reasonably accurate, the total network usage each minute by the Gnutella protocol is approximately:


T = n * (h * H + (i + o))

With real numbers this becomes quite a lot, but to see what it could mean in terms of scalability, let us extend the result to a 24-hour period instead of just one minute. It then becomes:

T = 95000 * (40 * 500 + (35000 + 4000)) * 60 * 24
T ≈ 8 * 10^12 bytes/day

Even if all TCP/IP headers are disregarded, it still accumulates to approximately 5 * 10^12 bytes/day. In reality the number of such headers could be less than 500 if several Gnutella messages are sent in a single packet, but this depends on the frequency of transmissions, the network library implementation and its configuration in the operating system. Nevertheless, the order of magnitude remains at several terabytes/day of network bandwidth being used by the Gnutella protocol.
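The estimate can be reproduced with a short Python sketch, which also makes it easy to vary the inputs. The numeric values are the ones given in the text (H, h, i, o and n = 95000 ± 15000); the function name `daily_bytes` and its parameter names are illustrative assumptions, and the sketch inherits the simplification that every host carries the same traffic.

```python
MINUTES_PER_DAY = 60 * 24


def daily_bytes(n, headers_per_min=500, header_bytes=40,
                incoming=35000, outgoing=4000, include_headers=True):
    """Estimate the bytes/day used by n hosts: T = n * (h * H + (i + o)) * 60 * 24."""
    per_host_per_min = incoming + outgoing
    if include_headers:
        per_host_per_min += headers_per_min * header_bytes
    return n * per_host_per_min * MINUTES_PER_DAY


with_headers = daily_bytes(95000)
without_headers = daily_bytes(95000, include_headers=False)

# Bounds from the stated uncertainty in the host count, n = 95000 +/- 15000.
low = daily_bytes(95000 - 15000)
high = daily_bytes(95000 + 15000)

# Share of the per-host traffic that consists of TCP/IP headers.
header_share = (500 * 40) / (500 * 40 + 35000 + 4000)

print(f"with headers:    {with_headers:.2e} bytes/day")
print(f"without headers: {without_headers:.2e} bytes/day")
print(f"range over n:    {low:.2e} to {high:.2e} bytes/day")
print(f"header share:    {header_share:.1%}")
```

Under these assumptions the TCP/IP headers make up about a third of the estimated traffic, and the total stays within the same order of magnitude across the stated range of n.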

The problem is fairly evident. Even after the introduction of ultrapeers and caching schemes, there are still problems with the Gnutella protocol. It consumes a lot of bandwidth just for maintaining its structure and relaying queries. There is also the possibility that a message crosses a single network link several times as it is relayed between peers, as described in [RIF02]. It is also worth noting that TCP/IP headers consume a lot of bandwidth: since many small messages are transmitted, the overhead is rather large. A good protocol should avoid polling the connection with small messages as much as possible and instead rely on more meaningful operations that, besides serving their normal purpose, also maintain connectivity.


Marcus Bergner 2003-06-10