The Internet provides more and more bandwidth to transfer files, and files keep getting bigger. One thing does not change: latency, since distances stay the same over time. When a protocol requires an acknowledgment between blocks of a transfer, this latency limits the throughput, as explained in this post.
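To make the effect concrete, here is a minimal Python sketch of the usual bound for an acknowledged protocol: at most one window of unacknowledged data can be in flight per round trip, so throughput cannot exceed window / RTT. The 64 KiB window is an assumption chosen for illustration, not a value measured in these tests, and the function name is hypothetical.

```python
def max_throughput_mb_s(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound in MB/s when at most window_bytes may be unacknowledged."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bytes_per_second = window_bytes / (rtt_ms / 1000.0)
    return bytes_per_second / 1_000_000


if __name__ == "__main__":
    WINDOW = 64 * 1024  # assumed 64 KiB window, for illustration only
    for rtt in (0.5, 10, 50, 100, 200, 300):
        print(f"{rtt:>5} ms -> {max_throughput_mb_s(WINDOW, rtt):8.2f} MB/s")
```

Real TCP stacks scale their window, so measured figures can be well above this bound for a fixed 64 KiB window; the point is only that the ceiling falls in proportion to the round-trip time.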
The throughput varies widely depending on the protocol used to transfer the file. As I did not find much data comparing the existing protocols, I’ll try to get figures myself and detail them here.
As source and destination, I’m using Linux VMs running under VirtualBox on the same machine. The network between the two is virtual, so the bandwidth is large. The source and destination directories are tmpfs mounts, so the data stays in RAM and no hard drive limits the transfer. The operating system is openSUSE 13.1.
UDP did not work correctly in this test environment; frankly, it was hard to get a working environment at all (see the post mortem). The test was therefore run between two physical boxes hosted in the cloud. In that environment the maximum bandwidth was lower: 11MB/s.
The source of the transfer is 10.0.2.5; the source directory is a 256MB tmpfs mounted on /root/tmpfs:
# mkdir ~/tmpfs
# mount -t tmpfs -o size=256M tmpfs ~/tmpfs
The destination of the transfer is 10.0.2.4; the destination directory is a 256MB tmpfs mounted on /root/tmpfs, created the same way.
An 80MB file filled with random values is created; this is the file transferred in all the tests:
10.0.2.5:~/tmpfs # dd if=/dev/urandom of=tfile bs=1M count=80
To configure latency, both source and destination are configured to add half of the RTD (round-trip delay) on their interface, with 10% variance and 0.2% loss. The limit parameter defines the queue size; since the queue grows a lot with latency, a high value is needed. Here is an example for a 100ms RTD:
10.0.2.5:~ # tc qdisc add dev eth0 root netem limit 500000 delay 50ms 5ms loss 0.2%
10.0.2.4:~ # tc qdisc add dev eth0 root netem limit 500000 delay 50ms 5ms loss 0.2%
For later adjustments, the change command has to be used instead of add:
10.0.2.x:~ # tc qdisc change dev eth0 root netem delay 100ms 10ms loss 0.2%
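The rule used here (half of the RTD on each side, 10% variance, 0.2% loss) can be sketched as a small Python helper that builds the tc command line for a given RTD. It also estimates how many packets the netem queue must hold, which is why a large limit is needed. The helper names, the 1500-byte packet size and the 63MB/s rate in the usage note are assumptions for illustration; only the command format comes from the examples above.

```python
def netem_command(rtd_ms: float, action: str = "add", dev: str = "eth0",
                  limit: int = 500000, loss_pct: float = 0.2) -> str:
    """Build the tc command applying half of the RTD on one side."""
    if action not in ("add", "change"):
        raise ValueError("action must be 'add' or 'change'")
    delay = rtd_ms / 2        # each side adds half of the round-trip delay
    jitter = delay * 0.10     # 10% variance
    return (f"tc qdisc {action} dev {dev} root netem limit {limit} "
            f"delay {delay:g}ms {jitter:g}ms loss {loss_pct:g}%")


def queue_packets(rate_bytes_per_s: int, delay_ms: int,
                  packet_bytes: int = 1500) -> int:
    """Packets held by netem while they wait out the delay (rounded up)."""
    in_flight_scaled = rate_bytes_per_s * delay_ms   # bytes x ms
    per_packet_scaled = 1000 * packet_bytes          # same scale
    return -(-in_flight_scaled // per_packet_scaled)  # ceiling division


if __name__ == "__main__":
    print(netem_command(100))
    print(netem_command(200, action="change"))
    print(queue_packets(63_000_000, 150))
```

Under these assumptions, a 63MB/s stream delayed by 150ms per side keeps 6300 packets of 1500 bytes in the queue, well above netem's default limit of 1000 packets.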
The latency can be checked with a ping command:
10.0.2.5:~ # ping 10.0.2.4
These results show that latency has a drastic impact on transfer performance: bandwidth drops sharply as latency grows.
The only UDP protocol tested is barely visible because of the problems encountered during the test (see the post mortem); still, its performance also decreases significantly.
In my view, the best solution for both its performance and its ease of implementation is the multithreaded HTTP transfer: at 8MB/s (64Mbit/s) it covers most of the bandwidth we can access over such a distance.
The second graph shows the decrease on a base of 100:
Here we can see the decrease in performance of each protocol. I decided to start at 10ms because of how much some protocols differ at 0ms.
The UDP protocol shows a better result than the others, which makes sense, even if I had expected something higher.
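For readers who want to rebuild the base-100 view, here is a minimal Python sketch of the normalization: each throughput is divided by the value at the 10ms reference and multiplied by 100. The sample figures are the multithreaded HTTP (Axel) results reported in this post; the function name is hypothetical.

```python
def base_100(results: dict, reference_ms: float = 10) -> dict:
    """Express each throughput as a percentage of the reference latency's."""
    reference = results[reference_ms]
    return {latency: round(100 * mb_s / reference, 1)
            for latency, mb_s in results.items()}


# Multithreaded HTTP (Axel) throughput in MB/s, keyed by latency in ms.
AXEL = {10: 23.0, 50: 20.0, 100: 13.4, 200: 10.0, 300: 8.1}

if __name__ == "__main__":
    for latency, index in base_100(AXEL).items():
        print(f"{latency:>4} ms -> {index}")
```

Starting the index at 10ms rather than 0.5ms keeps the curves comparable, since the near-zero-latency figures differ a lot from one protocol to another.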
I started this test after getting information about commercial solutions like the Aspera/FASP protocol, which advertise much better performance than what is shown here. As an example, they announce about 505Mb/s (63MB/s) over 200ms latency, which is a decrease of 0.8% compared to 10ms latency.
I’m actually a little disappointed by the open-source offering for fast transfer protocols over high latency. We currently face more and more challenges transferring large files over long distances (such as VMs between different cloud operators), while access to large bandwidth is becoming possible. I really think the open-source community should start developing an efficient protocol for this, possibly by making something workable and optimized from the existing protocols (uftp/udt).
By the way, given the difficulties I had running tests in cloud and virtualized environments, I also think UDP on KVM deserves a look to fix it. I also assume that any UDP protocol, even a proprietary one, runs up against all the security equipment installed in the cloud to protect against DDoS and other attacks. From that point of view, an open-source standard should emerge so that it can be taken into account in filtering rules.
10.0.2.5:~/tmpfs # scp tfile root@10.0.2.4:~/tmpfs
Results over latency
- 0.5 ms – 40MB/s – 0’02”
- 10 ms – 26.7MB/s – 0’03”
- 50 ms – 6.7MB/s – 0’12”
- 100 ms – 4.7MB/s – 0’17”
- 200 ms – 4.4MB/s – 0’18”
- 300 ms – 3.5MB/s – 0’23”
Rsync over ssh transfer
10.0.2.5:~/tmpfs # rsync -avz ./tfile -e ssh root@10.0.2.4:~/tmpfs
Results over latency
- 0.5 ms – 11.2MB/s – 0’04”
- 10 ms – 9.8MB/s – 0’05”
- 50 ms – 4.3MB/s – 0’18”
- 100 ms – 4.3MB/s – 0’18”
- 200 ms – 2.5MB/s – 0’30”
- 300 ms – 3.1MB/s – 0’27”
I use the thttpd server for the test, installed on the source, and configure it to serve files from the right directory.
10.0.2.5:~ # vi /etc/thttpd.conf
#www root directory (-d)
dir=/root/tmpfs
10.0.2.5:~ # /etc/rc.d/thttpd restart
The download is started using wget on the target
10.0.2.4:~/tmpfs # wget 10.0.2.5/tfile
Results over latency
- 0.5 ms – 63MB/s – 0’01”
- 10 ms – 29MB/s – 0’03”
- 50 ms – 16.1MB/s – 0’05”
- 100 ms – 5.8MB/s – 0’14”
- 200 ms – 2.4MB/s – 0’33”
- 300 ms – 1.9MB/s – 0’42”
With HTTP, we can easily use parallelism with a project like Axel.
The command to execute is:
10.0.2.4:~/tmpfs # axel -a --num-connections=20 http://10.0.2.5/tfile
Results over latency
- 0.5 ms – 63MB/s – 0’01”
- 10 ms – 23MB/s – 0’03”
- 50 ms – 20MB/s – 0’04”
- 100 ms – 13.4MB/s – 0’05”
- 200 ms – 10MB/s – 0’08”
- 300 ms – 8.1MB/s – 0’09”
Note: the number of threads can be adjusted to the latency; at 300ms, 40 threads achieve 4MB/s and 0’20”.
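The idea behind a multi-connection download can be sketched in Python: the file is split into byte ranges, each range is requested over its own connection with an HTTP Range header, and the parts are joined in order. This is an illustrative sketch, not Axel's actual implementation; it assumes the server honors Range requests and that the file size is known in advance, and all function names are hypothetical.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size: int, connections: int) -> list:
    """Split [0, size) into contiguous inclusive (start, end) byte ranges."""
    if size <= 0 or connections <= 0:
        raise ValueError("size and connections must be positive")
    connections = min(connections, size)
    base, extra = divmod(size, connections)
    ranges, start = [], 0
    for i in range(connections):
        length = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Download one byte range over its own connection."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(request) as response:
        if response.status != 206:  # 206 Partial Content
            raise RuntimeError("server did not honor the Range header")
        return response.read()


def parallel_download(url: str, size: int, connections: int = 20) -> bytes:
    """Fetch all ranges concurrently and join them in order."""
    ranges = split_ranges(size, connections)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        parts = pool.map(lambda r: fetch_range(url, *r), ranges)
        return b"".join(parts)
```

For the 80MB test file, a call such as parallel_download("http://10.0.2.5/tfile", 80 * 1024 * 1024) would open 20 connections, each waiting on its own acknowledgments, so one slow round trip no longer stalls the whole transfer.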
UFTP is a kind of FTP over UDP: a daemon is used to receive files and a client to push them.
The commands are the following:
10.0.2.4:~/tmpfs # uftpd -t -D /root/tmpfs
10.0.2.5:~/tmpfs # uftp -C tfmcc -R 80000 -H 10.0.2.4 tfile
Results over latency
- 0.5 ms – 1.4MB/s – 0’58”
- 10 ms – 0.7MB/s – 1’53”
- 50 ms – 0.6MB/s – 2’02”
- 100 ms – 0.6MB/s – 2’16”
- 200 ms – 0.5MB/s – 2’53”
- 300 ms – 0.3MB/s – 4’05”
UDT is another project to transfer files over UDP. The project is more of a framework for developing applications on top of this protocol. That said, it ships client applications to send and receive files.
The commands are the following:
10.0.2.4:~/udt4/app# export LD_LIBRARY_PATH=../src ;
Result over latency …
- 300 ms – 92KB/s – 15’15”
pssh is a Python parallel ssh tool that runs multiple threads. In practice, it is used to copy files or execute commands on multiple targets in parallel; it does not parallelize a file transfer to a single target. The results are therefore similar to ssh / rsync.
Initially I ran the same test on VirtualBox. The TCP protocols had approximately the same performance, but the UDP one got really bad results, even with no latency. As no drops were seen at the VM level, I assume they occur at the VirtualBox layer.
Next, I tried KVM, where I was able to communicate over TCP between VMs after configuring macvtap as a bridge. I never got UDP working between the machines.
Next, I tried between two EC2 VMs, where I got the same situation.
Next, I tried between two physical machines hosted at OVH, where I got the same situation.
Finally, I ran the test between two physical machines hosted on my own network.
It seems really complex to get a UDP-based solution working in a given environment.
This test does not cover all protocols; for example, I did not test FTP, as I consider it deprecated in favor of scp. I would have tested NFS for fun… However, my time was boxed and the UFTP tests took too much of it.
I’m really disappointed by the UFTP results, so if anyone has better results, or if the UFTP team wants to look at the issues I had, I’ll be happy to take a look and help.
Feel free to add comments, results and protocol proposals to complete this post.
If you need some assistance tuning UFTP, let me know and I’ll take a look at the logs and see what we can do there.
Thank you for the offer. I’m not working on it at the moment; if needed, I won’t hesitate.
rsync has many benefits in syncing files that already exist on both the client and the remote. Rsync can skip whole files if they’re unchanged and, if they have changed, move only the blocks that differ.
But how do you overcome latency issue with rsync when latency is >200ms?