As part of an investigation into several other things, my Internet research dredged up some work by Jim Gettys on Bufferbloat, which was related closely enough to my intended target that I got stuck into it a little more.
It has previously been believed that adding buffers across a network to help mitigate packet loss is a “good thing™”, especially if we’re going to be dealing with VoIP and other protocols sensitive to packet loss. Bufferbloat is the phenomenon caused by these buffers when the data source(s) can generate enough data to fill these (often deep) buffers, and thereby destroy the linkage between TCP’s congestion control mechanism and the actual data rate. TCP relies on timely packet loss to tell it to slow down, and a deep buffer delays that signal.
This leads to added latency, increased packet loss, lowered network efficiency (all those retransmitted packets take up space that should be used for other data), and lower throughput. These are all situations that we experience on the Internet every day, typically seen as sporadic performance when downloading large files.
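To put a rough number on the added latency: a packet arriving behind a full first-in, first-out buffer has to wait for the whole buffer to drain first. The sketch below is a back-of-the-envelope calculation, assuming a single FIFO buffer in front of one bottleneck link; the buffer sizes and the 1 Mbit/s link rate are illustrative values, not measurements.

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Worst-case wait for a packet that arrives behind a full FIFO buffer."""
    return buffer_bytes * 8 / link_bits_per_second


# Illustrative (assumed) values: three buffer sizes on a 1 Mbit/s uplink.
LINK_RATE_BPS = 1_000_000

for buffer_kib in (16, 64, 256):
    delay = queue_delay_seconds(buffer_kib * 1024, LINK_RATE_BPS)
    print(f"{buffer_kib:>4} KiB buffer -> about {delay * 1000:.0f} ms of added latency")
```

On these assumed figures, a full 256 KiB buffer adds roughly two seconds of delay, which is more than enough to upset VoIP and to mislead TCP’s view of the path.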
There is a nice introduction to bufferbloat on the Bufferbloat project’s page, which explains the problem. Basically, each and every buffer within a network is a potential problem: if data can be sent into the network faster than the buffer can empty, this has a knock-on impact on the performance of the flow (of whatever application or protocol). Whilst this does optimise the use of bandwidth in the network (i.e. it keeps it full), it doesn’t optimise the performance of the flows. Network efficiency drops because the network is carrying many packets that are retransmissions of the original (still buffered) data, which hasn’t yet arrived.
The solution in the past has been to implement technologies such as random early discard (RED, also known as random early detection), which watches how full the buffer is and starts discarding packets before it overflows, so that flows occupying more of the buffer than others lose more packets. But it hasn’t always fixed the problem.
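As a rough illustration of how RED decides when to discard, here is a minimal sketch of its drop-probability curve. It is a simplification of the classic algorithm (it leaves out the adjustment based on packets since the last drop), and the thresholds, `max_p` and `weight` values are assumed for illustration rather than recommended settings.

```python
def update_average(avg_queue: float, current_queue: int, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue: float, min_th: float = 5.0,
                         max_th: float = 15.0, max_p: float = 0.1) -> float:
    """Probability of discarding an arriving packet for a given average queue."""
    if avg_queue < min_th:
        return 0.0   # queue is short: never discard
    if avg_queue >= max_th:
        return 1.0   # queue is too long: discard every arrival
    # In between, the probability rises linearly from 0 up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for avg in (2, 8, 12, 20):
    print(f"average queue {avg:>2} packets -> drop probability {red_drop_probability(avg):.2f}")
```

Because every arriving packet faces the same probability, a flow that sends more packets into the buffer will, on average, lose more of them. The difficulty in practice is choosing thresholds that suit the link, which is one reason RED hasn’t always fixed the problem.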
Whilst the Bufferbloat project is looking at ways to minimise buffer size (and potentially self-tune the buffers within devices), the situation exists in each and every network, because buffers are cheap (both in parts and logic) to implement. It may take a long time until the suppliers of routers and switches (and all the other bits of comms kit) fix their equipment to adjust automatically. The Internet might be a challenge for a while, but it’s certainly possible to fix the enterprise environment.
So how do you manage the performance of a network if the feedback mechanisms are broken? I believe that the best mechanism is the management of the performance of each flow against the available bandwidth at the edge of the network. How can we implement one of these?
We need:
- A system with intelligence at the edge
- A way of understanding the bandwidth available, and how much of it is in use, at each point
- A mechanism to ensure that packet loss is directed at protocols that are less ‘important’
- A mechanism to determine if there is bufferbloat in the network core
It sounds like just the sort of thing that Ipanema Technologies’ Autonomic Networking System delivers.
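On the last requirement in the list, one common way to spot bufferbloat is to compare round-trip times measured on an idle link with those measured while the link is busy. The sketch below is only an illustration of that idea and says nothing about how any particular product does it; the function names, the 100 ms threshold and the sample values are all assumptions made up for the example.

```python
from statistics import median


def added_latency_ms(idle_rtts_ms, loaded_rtts_ms):
    """Extra round-trip time that appears when the link is under load."""
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


def looks_bloated(idle_rtts_ms, loaded_rtts_ms, threshold_ms=100.0):
    """True if load adds more latency than the (assumed) threshold."""
    return added_latency_ms(idle_rtts_ms, loaded_rtts_ms) > threshold_ms


# Made-up sample values, in milliseconds.
idle = [20, 22, 21, 23]
under_load = [450, 600, 520, 580]

print(f"added latency: {added_latency_ms(idle, under_load):.0f} ms")
print("bufferbloat suspected" if looks_bloated(idle, under_load) else "latency looks fine")
```

Using the median rather than the mean keeps a single stray sample from swinging the result.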