I recently attended this:
Dropbox: International Performance
Come hear about several recent and future improvements to Dropbox’s international performance. Dropbox engineer Nipunn Koorapati will talk about the Dropbox server architecture, as well as recent optimizations to the client-server protocol for small and large files, focused on our high-latency international users.
We face international performance challenges at Offroad too, with systems in Switzerland and the US. We have fairly high bandwidth between the sites, but the high round-trip times (latency) lead to problems. Moving individual large files tends to run fast, but moving folders containing many small files takes forever.

The following is a bit of an oversimplification, but it could be because of the way new TCP connections start out. A new connection can’t just begin sending data at full speed, since on an unreliable channel transmission errors could mean nothing gets through. Instead, it starts out with small chunks and increases the amount in flight as the channel proves reliable (TCP slow start). With trans-Atlantic latency of around 100 ms, it takes a while for a connection to ramp up. Our bandwidth-delay product is around 2 MB, and Windows 7 starts out with a 65 KB window. That means it takes about 5 round trips, or roughly half a second, for the window to scale up to full speed (128 KB, 256 KB, 512 KB, 1 MB, 2 MB). That would lead to terrible performance if each small file opens a new connection. That may not be exactly what’s happening to us, but the cause is certainly related to high latency combined with several back-and-forth transmissions per file.
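The ramp-up estimate above can be sketched in a few lines. This is a minimal back-of-the-envelope model, assuming the window roughly doubles each round trip during slow start, and rounding the 65 KB starting window to 64 KB so the doubling comes out even:

```python
# Back-of-the-envelope slow-start ramp-up using the numbers from the text.
# Assumption: the congestion window roughly doubles every round trip until
# it covers the bandwidth-delay product.

RTT_S = 0.1                    # ~100 ms trans-Atlantic round-trip time
INITIAL_WINDOW = 64 * 1024     # ~64 KB starting window (65 KB figure, rounded)
BDP = 2 * 1024 * 1024          # ~2 MB bandwidth-delay product

window = INITIAL_WINDOW
round_trips = 0
while window < BDP:
    window *= 2        # slow start: window doubles each round trip
    round_trips += 1

print(round_trips)             # 5 round trips (128 KB ... 2 MB)
print(round_trips * RTT_S)     # ≈ 0.5 s before the connection is at full speed
```

That half second is pure ramp-up, before any per-connection handshake cost, which is why paying it once per small file adds up so quickly.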
Nipunn revealed that Dropbox’s international users face similar issues, since Dropbox only has servers in the US (I think only in California). Dropbox can’t change the TCP window sizes on client machines, and clients on unreliable connections may automatically tune to small window sizes, requiring an acknowledgement after only a small amount of data is transmitted. So Dropbox is going to solve the problem with relay servers around the world, which communicate with clients at low latency and then forward the data to California over a pool of long-lived TCP connections that stay open and tuned to full speed.
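A toy model shows why the pooled, long-lived connections matter. This is an illustrative sketch with assumed numbers (not measurements from Dropbox or Offroad): a cold connection pays a handshake plus the slow-start ramp per file, while a warm pooled connection costs roughly one request/response round trip per file:

```python
# Toy comparison: new connection per file vs. one warm pooled connection.
# All numbers are illustrative assumptions, not measured values.

RTT = 0.1          # seconds, trans-Atlantic round trip
RAMP_RTTS = 5      # round trips for slow start to reach full speed
N_FILES = 1000     # a folder full of small files

# Cold path: each file pays a handshake (1 RTT) plus the ramp-up (~5 RTTs).
cold = N_FILES * (1 + RAMP_RTTS) * RTT

# Warm path: a pooled, already-ramped connection costs ~1 RTT per file.
warm = N_FILES * 1 * RTT

print(cold, warm)  # ≈ 600 s vs ≈ 100 s
```

A relay near the client shrinks that per-file round trip further, since the short client-to-relay hop replaces the 100 ms trans-Atlantic one for everything except the already-warm relay-to-California pipe.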