scp tuning

I tweeted yesterday:

laurentsch
copying 1TB over ssh sucks. How do you fastcopy in Unix without installing Software and without root privilege?

I got plenty of expert answers. I did not go as far as recompiling ssh, and I did not try plain ftp.

OK, let’s first try to transfer 10 files of 100M from srv001 to srv002 with scp:

time scp 100M* srv002:
100M1    100%   95MB   4.5MB/s   00:21
100M10   100%   95MB   6.4MB/s   00:15
100M2    100%   95MB   6.0MB/s   00:16
100M3    100%   95MB   4.2MB/s   00:23
100M4    100%   95MB   3.4MB/s   00:28
100M5    100%   95MB   4.2MB/s   00:23
100M6    100%   95MB   6.4MB/s   00:15
100M7    100%   95MB   6.8MB/s   00:14
100M8    100%   95MB   6.8MB/s   00:14
100M9    100%   95MB   6.4MB/s   00:15

real    3m4.50s
user    0m27.07s
sys     0m21.56s

More than 3 minutes for 1G.
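
To put a number on that baseline, the elapsed time can be converted to seconds and divided into the amount of data. A small sketch, assuming a total of 950 MB (10 files reported as 95MB each); the helper names to_seconds and throughput are made up for this illustration:

```shell
#!/bin/sh
# to_seconds: convert a "time" value such as 3m4.50s to seconds.
# (Hypothetical helper, not part of any tool.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# throughput: megabytes per second, given total MB and elapsed seconds.
throughput() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

secs=$(to_seconds 3m4.50s)              # 184.50
echo "$(throughput 950 "$secs") MB/s"   # prints: 5.1 MB/s
```

The result is in line with the per-file rates scp reported above (between 3.4 and 6.8 MB/s).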

I got hints about the buffer size, about SFTP, about the cipher algorithm, and about parallelizing. I did not install new software and I have a pretty old openssh client (3.8). Thanks to all my contributors tmuth, Ik_zelf, TanelPoder, fritshoogland, jcnars, aejes, surachart, syd_oracle and the ones who will answer after the writing of this blog post…

OK, let’s try a faster algorithm, with sftp (instead of scp), a bigger buffer, and in parallel:

$ cat batch.ksh
echo "progress\nput 100M1" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M2" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M3" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M4" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M5" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M6" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M7" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M8" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M9" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M10" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
wait
$ time batch.ksh
real    0m19.07s
user    0m12.08s
sys     0m5.86s

This is roughly 10 times faster (3m04s down to 19s, a factor of about 9.7) :-)
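
The ten near-identical lines of batch.ksh can also be produced by a loop. A minimal sketch, assuming the same host and options as above; the function name put_all and the SFTP and HOST variables are introduced here for illustration only. It uses printf rather than echo because not every shell's echo expands \n (ksh does, bash by default does not). Newer OpenSSH releases have dropped arcfour, so the cipher option may need changing there.

```shell
#!/bin/sh
# Sketch of batch.ksh as a loop. SFTP and HOST are illustrative variables;
# their defaults repeat the command line used in the post.
SFTP=${SFTP:-"sftp -B 260000 -o Ciphers=arcfour -R 512"}
HOST=${HOST:-srv002}

# put_all FILE...
# Starts one sftp session per file in the background, then waits for all.
put_all() {
    for f in "$@"; do
        # $SFTP is left unquoted on purpose so its options are split into words
        printf 'progress\nput %s\n' "$f" | $SFTP "$HOST" &
    done
    wait
}

# Usage: put_all 100M*
```

Every file gets its own connection, so the number of files is also the number of parallel channels.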

17 thoughts on “scp tuning”

  1. Laurent

    Nice post, but in the first example you do the transfers serially, whereas in the sftp method you run multiple processes in the background.

    Did you try running the scp in the background in parallel too (I’d be interested in seeing the timings)?

    Thanks

    John.

  2. Hi Nicolas,

    This may help on a slow network, but not on a fast network with high latency.

    According to man ssh,
    Compression is desirable on modem lines and other slow
    connections, but will only slow down things on fast
    networks

    $ time (tar cvf - 100*|gzip|ssh srv002 "gzip -d -c|tar xvf -")
    real    3m49.61s
    user    0m21.63s
    sys     0m6.67s
    

  3. I thought scp was a wrapper around sftp. So what would cause scp to be so much slower?

  4. In AIX, scp accepts a “-C” option that instructs ssh to use compressed data transmission. I think if you can use that and parallel, you’ll get similar results.

  5. 1. scp is not a wrapper around sftp. They are separate protocols that use SSH as the underlying security method. Specifically, sftp is also a file system protocol that allows remote directory listing, while scp is not. Generally speaking scp is known to be faster.

    2. In this case, I suspect the main trick (other than 10 channels which gave most of the benefit) was the use of -B to give more bandwidth to the transfer. You don’t have this option (at least not easily) in scp.

  6. @chen I started the test this morning with a huge file… but I know you are impatient to get an answer so I tried again with an export dump that is 1.66GB in size. SFTP is clearly faster.

    With no options at all, the simplest possible test, with 2 different dump files, both of 1.66G

    
    time sftp oracle@srv004ax:/u02/oradata/201001110940/aaa1.dmp
    
    real    1m30.56s
    user    0m40.82s
    sys     0m10.76s
    
    time scp oracle@srv004ax:/u02/oradata/201001110940/aaa2.dmp .
    
    real    3m42.98s
    user    0m45.85s
    sys     0m30.66s
    
    $ ls -lrt
    -rw-r-----    1 lsc dba      1781233135 May 19 15:10 aaa1.dmp
    -rw-r-----    1 lsc dba      1780086038 May 19 15:14 aaa2.dmp
    

    The files are different; the second one is about 1MB smaller and was loaded afterwards, and sftp is still faster.

    Generally speaking scp is known to be faster
    Sometimes

    
    $ time scp localhost:/etc/hosts xxx
    hosts                                         100% 2483     2.4KB/s   00:00
    
    real    0m0.32s
    user    0m0.02s
    sys     0m0.01s
    
    $ time sftp localhost
    Connecting to localhost...
    sftp> cd /etc
    sftp> get hosts xxx
    Fetching /etc/hosts to xxx
    /etc/hosts                                    100% 2483     2.4KB/s   00:00
    sftp> bye
    
    real    0m32.18s
    user    0m0.01s
    sys     0m0.00s
    

    But this is because I am a slow typist :)

  7. Hi

    I tried your method but I did not get any performance improvement. Below is the timing to transfer 500MB. I am using HPUX 11.23. Any suggestions and guidance to improve file transfer?

    real 2m22.40s
    user 0m53.66s
    sys 1m11.24s

    Thank you.
    -haris

    You should check what kind of bandwidth your network offers. Whether you have a single 10Mb/s link, a shared 10Gb/s one, or a large amount of dedicated gigabit, the results will differ.

    Obviously if you have a few shared 10Gb/s virtualized in a large number of interfaces and you open 8 connections at full speed, other users may be affected.

    Maybe try to open 2 channels in parallel to start with.

    Did you use compression? It seems your “sys” time is much larger than mine… Try without compression and compare.

  9. Hi Laurent

    Thanks for reply.

    Actually I tested with 1 file but without “-o Ciphers=arcfour” because I am not sure what that is for. I checked the ssh_config file; most of the options are commented out. I did not use the compression option -C in the testing.

    Here is what I did.

    $ time echo "Progress\nput /u07/oradata/EXPDP/DP_FILE1.dmp" | sftp -B 260000 -R 512 srv005:/u01/app/oracle/reorg/.

    Should I run multiple files to see the performance? Or are there any other options I should use to increase the speed of the file transfer?

    I am sorry if my questions are not valid.

    Thank you
    -haris

  10. @Laurent Schneider
    Hi Laurent

    Thanks for reply.

    Actually I tested with 1 file but without “-o Ciphers=arcfour” because I am not sure what that is for. I checked the ssh_config file but most options are commented out. I did not use the -C option for compression.

    Here is what I did:

    $ time echo "Progress\nput /u07/oradata/DP_FILE1.dmp" | sftp -B 260000 -R 512 svr005:/u01/app/oracle/reorg/.

    Should I run multiple files to invoke parallelizing?

    I am sorry if my question is not valid.

    Thank you
    -haris

  11. @Laurent Schneider
    Hi Laurent

    Today I tested with 2 files of 2GB each. I did not use compression.

    Result.

    srv001:oracle/reorg $ cat batch_transfer0.sh
    echo "Progress\nput /u07/oradata/EXPDP/DP_INTERLIVE_OFF_02.dmp" | sftp -B 262100 -R 512 srv005:/u01/app/oracle/reorg/. &
    echo "Progress\nput /u07/oradata/EXPDP/DP_INTERLIVE_OFF_03.dmp" | sftp -B 262100 -R 512 srv005:/u01/app/oracle/reorg/. &
    wait
    srv001:oracle/reorg $

    real 10m39.69s
    user 9m19.72s
    sys 9m0.88s

    Any comments or suggestions ?

    Thanks
    -haris

  12. Hi Haris,

    Your comments landed in my spam box, sorry about this…

    The cipher suite is the way you encrypt the traffic. arcfour is believed to be faster than the default 3des.

    The result will be impacted by your server and network usage… maybe try when there is little activity on the server
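
Two suggestions from the comments, running scp in the background as well (comment 1) and starting with only 2 parallel channels on a shared network (comment 8), can be combined in one small helper. This is only a sketch: run_limited and scp_one are hypothetical names, and the batching is deliberately simple (each batch waits for its slowest file before the next batch starts).

```shell
#!/bin/sh
# run_limited MAX CMD FILE...
# Runs "CMD FILE" for each file, at most MAX at a time, in simple batches.
run_limited() {
    max=$1; cmd=$2; shift 2
    running=0
    for f in "$@"; do
        "$cmd" "$f" &
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait            # let the current batch finish before starting more
            running=0
        fi
    done
    wait                    # catch a final, partial batch
}

# Example per-file command (host name taken from the post).
scp_one() { scp "$1" srv002:; }

# Usage: time run_limited 2 scp_one 100M*
```

Timing the same file set with MAX set to 1, 2 and 10 would show how much of the gain comes from parallelism alone, as opposed to the sftp options.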
