scp tuning

I tweeted yesterday:

laurentsch
copying 1TB over ssh sucks. How do you fastcopy in Unix without installing Software and without root privilege?

I got plenty of expert answers. I did not go as far as recompiling ssh, and I did not try plain ftp.

Ok, let’s first try to transfer 10 files of 100M each from srv001 to srv002 with scp:

time scp 100M* srv002:
100M1 100% 95MB 4.5MB/s 00:21
100M10 100% 95MB 6.4MB/s 00:15
100M2 100% 95MB 6.0MB/s 00:16
100M3 100% 95MB 4.2MB/s 00:23
100M4 100% 95MB 3.4MB/s 00:28
100M5 100% 95MB 4.2MB/s 00:23
100M6 100% 95MB 6.4MB/s 00:15
100M7 100% 95MB 6.8MB/s 00:14
100M8 100% 95MB 6.8MB/s 00:14
100M9 100% 95MB 6.4MB/s 00:15

real 3m4.50s
user 0m27.07s
sys 0m21.56s

More than 3 minutes for about 1 GB (10 × 95 MB).
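To put a number on that, the elapsed time can be turned into an average throughput. This is only a rough sketch: the helper names to_seconds and throughput are made up for illustration, and the 950 MB total is simply the ten 95MB files from the listing above.

```shell
#!/bin/sh
# Convert a ksh-style `time` value such as "3m4.50s" into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Average throughput in MB/s for a size in MB and an elapsed time.
throughput() {
    awk -v mb="$1" -v s="$(to_seconds "$2")" 'BEGIN { printf "%.2f\n", mb / s }'
}

to_seconds 3m4.50s        # prints 184.50
throughput 950 3m4.50s    # prints 5.15
```

So the serial scp run averages a bit over 5 MB/s, which matches the per-file rates scp reported.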

I got hints about the buffer size, about SFTP, about the cipher algorithm, and about parallelizing. I did not install new software and I have a pretty old OpenSSH client (3.8). Thanks to all my contributors tmuth, Ik_zelf, TanelPoder, fritshoogland, jcnars, aejes, surachart, and the ones who will answer after the writing of this blog post…

Ok, let’s try a faster cipher, with sftp (instead of scp), a bigger buffer, and in parallel:
$ cat batch.ksh
echo "progress\nput 100M1" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M2" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M3" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M4" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M5" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M6" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M7" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M8" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M9" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
echo "progress\nput 100M10" | sftp -B 260000 -o Ciphers=arcfour -R 512 srv002&
wait
$ time batch.ksh
real 0m19.07s
user 0m12.08s
sys 0m5.86s

That is nearly ten times faster: from 3m4.50s down to 0m19.07s, a speedup of about 9.7x 🙂
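The ten near-identical lines of batch.ksh could also be generated by a loop. The following is a minimal sketch, not the script that was timed above: parallel_put is a hypothetical function name, and printf replaces echo only because not every shell's echo expands \n. The sftp options are the ones used in batch.ksh.

```shell
#!/bin/sh
# Start one sftp upload per file, all in the background, then wait for them.
# Usage: parallel_put HOST FILE...
parallel_put() {
    host=$1
    shift
    for f in "$@"; do
        # Feed the same two sftp commands as batch.ksh: progress, then put.
        printf 'progress\nput %s\n' "$f" |
            sftp -B 260000 -o Ciphers=arcfour -R 512 "$host" &
    done
    wait    # return only when every background transfer has finished
}
```

With this function loaded, "parallel_put srv002 100M*" would start the same ten transfers. File names containing spaces would need extra quoting inside the put command.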

17 thoughts on “scp tuning”

  1. John Scott

    Laurent

    Nice post, but in the first example you do the transfers serially, whereas in the sftp method you run multiple processes in the background.

    Did you try running the scp in the background in parallel too (I’d be interested in seeing the timings)?

    Thanks

    John.

  2. Laurent Schneider Post author

    Hi Nicolas,

    This may help on a slow network, but not on a fast network with high latency.

    According to man ssh,
    Compression is desirable on modem lines and other slow
    connections, but will only slow down things on fast
    networks

    $ time (tar cvf - 100*|gzip|ssh srv002 "gzip -d -c|tar xvf -")
    real 3m49.61s
    user 0m21.63s
    sys 0m6.67s

  3. jimmyb

    I thought scp was a wrapper around sftp. So what would cause scp to be so much slower?

  4. Noons

    On AIX, scp accepts a “-C” option that instructs ssh to compress the data in transit. I think if you can use that and run in parallel, you’ll get similar results.

  5. Gwen Shapira

    1. scp is not a wrapper around sftp. They are separate protocols that use SSH as the underlying security layer. Specifically, sftp is also a file system protocol that allows remote directory listing, while scp is not. Generally speaking scp is known to be faster.

    2. In this case, I suspect the main trick (other than the 10 channels, which gave most of the benefit) was the use of -B to enlarge the transfer buffer. You don’t have this option (at least not easily) in scp.

  6. Laurent Schneider Post author

    @chen I started the test this morning with a huge file… but I know you are impatient to get an answer, so I tried again with an export dump that is 1.66 GB in size. SFTP is clearly faster.

    With no options at all, the simplest possible test, with 2 different dump files, both of 1.66 GB:


    time sftp oracle@srv004ax:/u02/oradata/201001110940/aaa1.dmp

    real 1m30.56s
    user 0m40.82s
    sys 0m10.76s

    time scp oracle@srv004ax:/u02/oradata/201001110940/aaa2.dmp .

    real 3m42.98s
    user 0m45.85s
    sys 0m30.66s

    $ ls -lrt
    -rw-r----- 1 lsc dba 1781233135 May 19 15:10 aaa1.dmp
    -rw-r----- 1 lsc dba 1780086038 May 19 15:14 aaa2.dmp

    The files are different; the second one is slightly smaller (about 1.1 MB) and was fetched afterwards, and still sftp is faster.
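For comparison, the two timings can be converted into average rates from the byte counts in the ls output. This is just an arithmetic sketch; rate_mb_s is a made-up helper name, and 1 MB is taken as 1,000,000 bytes.

```shell
#!/bin/sh
# Average rate in MB/s (1 MB = 1,000,000 bytes) for BYTES moved in SECONDS.
rate_mb_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes / secs / 1000000 }'
}

rate_mb_s 1781233135 90.56     # sftp, 1m30.56s: prints 19.67
rate_mb_s 1780086038 222.98    # scp,  3m42.98s: prints 7.98
```

In this single run sftp moved the dump roughly 2.5 times faster than scp.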

    “Generally speaking scp is known to be faster”
    Sometimes:


    $ time scp localhost:/etc/hosts xxx
    hosts 100% 2483 2.4KB/s 00:00

    real 0m0.32s
    user 0m0.02s
    sys 0m0.01s

    $ time sftp localhost
    Connecting to localhost...
    sftp> cd /etc
    sftp> get hosts xxx
    Fetching /etc/hosts to xxx
    /etc/hosts 100% 2483 2.4KB/s 00:00
    sftp> bye

    real 0m32.18s
    user 0m0.01s
    sys 0m0.00s

    But this is because I am a slow typist 🙂

  7. haris

    Hi

    I tried your method but I did not get any performance improvement. Below is the timing to transfer 500MB. I am using HP-UX 11.23. Any suggestions or guidance to improve the file transfer?

    real 2m22.40s
    user 0m53.66s
    sys 1m11.24s

    Thank you.
    -haris

  8. Laurent Schneider Post author

    You should check what kind of bandwidth your network offers. Whether you have a single 10 Mb/s link, a shared 10 Gb/s link, or a large amount of dedicated gigabit makes a difference.

    Obviously if you have a few shared 10Gb/s virtualized in a large number of interfaces and you open 8 connections at full speed, other users may be affected.

    Maybe try to open 2 channels in parallel to start with.

    Did you use compression? It seems your “sys” time is much larger than mine… Try without compression and compare.
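Starting with only 2 channels, as suggested above, can be done by launching the transfers in small batches. This is a rough sketch under stated assumptions: run_limited and put_one are hypothetical names, srv002 and the sftp options are borrowed from the post, and each batch waits for its slowest transfer before the next one starts.

```shell
#!/bin/sh
# Run "JOB FILE" for each file, with at most MAX jobs running at a time.
# Usage: run_limited MAX JOB FILE...
run_limited() {
    max=$1
    job=$2
    shift 2
    running=0
    for f in "$@"; do
        "$job" "$f" &
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait          # let the current batch finish before starting more
            running=0
        fi
    done
    wait                  # wait for the last, possibly partial, batch
}

# Hypothetical per-file transfer using the options from the post.
put_one() {
    printf 'progress\nput %s\n' "$1" |
        sftp -B 260000 -o Ciphers=arcfour -R 512 srv002
}
```

For example, "run_limited 2 put_one 100M*" would keep at most two sftp sessions open at once; raising the first argument step by step shows where the network stops scaling.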

  9. haris

    Hi Laurent

    Thanks for reply.

    Actually I tested with 1 file but without “-o Ciphers=arcfour” because I am not sure what that is for. I checked the ssh_config file; most of the options are commented out. I did not use the compression option -C in the testing.

    Here is what I did.

    $ time echo "Progress\nput /u07/oradata/EXPDP/DP_FILE1.dmp" | sftp -B 260000 -R 512 srv005:/u01/app/oracle/reorg/.

    Should I run multiple files to see the performance? Or are there any other options I should use to increase the speed of the file transfer?

    I am sorry if my questions are not valid.

    Thank you
    -haris

  10. haris

    @Laurent Schneider
    Hi Laurent

    Thanks for reply.

    Actually I tested with 1 file but without “-o Ciphers=arcfour” because I am not sure what that is for. I checked the ssh_config file, but most options are commented out. I did not use the -C option for compression.

    Here is what I did:

    $ time echo "Progress\nput /u07/oradata/DP_FILE1.dmp" | sftp -B 260000 -R 512 svr005:/u01/app/oracle/reorg/.

    Should I run multiple files to invoke parallelizing?

    I am sorry if my question is not valid.

    Thank you
    -haris

  11. haris

    @Laurent Schneider
    Hi Laurent

    Today I tested with 2 files of 2 GB each. I did not use compression.

    Result.

    srv001:oracle/reorg $ cat batch_transfer0.sh
    echo "Progress\nput /u07/oradata/EXPDP/DP_INTERLIVE_OFF_02.dmp" | sftp -B 262100 -R 512 srv005:/u01/app/oracle/reorg/. &
    echo "Progress\nput /u07/oradata/EXPDP/DP_INTERLIVE_OFF_03.dmp" | sftp -B 262100 -R 512 srv005:/u01/app/oracle/reorg/. &
    wait
    srv001:oracle/reorg $

    real 10m39.69s
    user 9m19.72s
    sys 9m0.88s

    Any comments or suggestions ?

    Thanks
    -haris

  12. Laurent Schneider Post author

    Hi Haris,

    Your comments landed in my spam box, sorry about this…

    The cipher suite is the way you encrypt the traffic. arcfour is believed to be faster than the default 3des.

    The result will be impacted by your server and network usage… maybe try when there is little activity on the server
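Whether arcfour can be used at all depends on the ssh build. A small sketch of picking the first cipher from a preference list that the client actually supports; pick_cipher is a hypothetical helper, and it assumes the supported names arrive on standard input, one per line.

```shell
#!/bin/sh
# Print the first cipher from the preference list (arguments) that also
# appears in the supported list read from standard input.
pick_cipher() {
    supported=$(cat)
    for c in "$@"; do
        # -F: fixed string, -x: whole line must match, -q: no output
        if printf '%s\n' "$supported" | grep -Fqx "$c"; then
            echo "$c"
            return 0
        fi
    done
    return 1    # none of the preferred ciphers is supported
}
```

On newer OpenSSH clients the supported list can come from "ssh -Q cipher", for example "ssh -Q cipher | pick_cipher arcfour aes128-ctr aes128-cbc". That query option does not exist in a client as old as 3.8, and recent OpenSSH releases no longer include arcfour, so the fallback names matter.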

Comments are closed.