Netcat is installed by default on most Linux and macOS systems. It provides a way of opening TCP or UDP network connections between nodes, acting as an open pipe through which you can send any data as fast as the connection will allow, imposing no additional protocol load on the transfer. Because of its widespread availability and its speed, it can be used to transmit data between two points relatively quickly, especially if the data doesn't need to be encrypted or compressed (or if it already is). However, to use netcat, you have to have login privileges on both ends of the connection, and you need to explicitly set up a listener that waits for a connection request on a specific port. This is less convenient than simply initiating an scp or rsync transfer from one end, but it may be worth the effort if the data transfer is very large. To monitor the transfer, you also have to use something like pv (pipeviewer); netcat itself is quite laconic.
How it works: On one end (the sending end, in this case), you need to set up a listening port:
On the computer that will send the data:
$ pv -pet big.file | nc -q 1 -l 1234 <enter>
This sends big.file through pv -pet, which displays progress, elapsed time, and ETA. The command will hang, listening (-l) for a connection from the other end. The -q 1 option tells the sender to wait 1s after reaching EOF and then quit.
On the receiving end, you connect to the nc listener:
$ nc sender.net.uci.edu 1234 | pv -b > big.file <enter>
You can create big.file for testing with the dd command:
$ dd if=/dev/zero bs=1024 count=10000000 of=big.file
(Note: no -p flag is needed to indicate the port on the receiving side.) The -b option to pv shows only the byte count received.
Once the receive_host command is initiated, the transfer starts, as can be seen from the pv output on the sending side and the byte count on the receiving side. When the file has been sent, the sender closes the connection 1s after the EOF, and the receiver exits when the connection drops.
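One caveat: netcat does no integrity checking of its own, so after a large transfer it is prudent to checksum the file on both ends and compare. A minimal check (file name as in the example above) might be:

```shell
# Run on both the sending and receiving hosts, then compare the
# sums by eye (or diff the output); netcat itself offers no such check.
md5sum big.file
```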
This arrangement is slightly arcane, but it supports the Unix tools philosophy, which allows you to chain various small tools together to perform a task. While the above example shows the case of a single large file, it can be modified only slightly to do recursive transfers using tar, shown below recursively copying a local directory to the remote host.

3.6.1. tar and netcat
The combination of these two crusty relics from the stone age of Unix is remarkably effective for moving data if you don't need encryption. Since they impose very little protocol overhead on the data, the transfer can run at close to wire speed for large files. Compression can be added with the tar options -z (gzip) or -j (bzip2).
The setup is not as trivial as with rsync, scp, or bbcp, since it requires commands to be issued at both ends of the connection, but for large transfers the speed payoff is non-trivial. For example, using a single rsync on a 10Gb/s private connection, we were getting only about 30MB/s, mostly because of many tiny files. Using tar/netcat, the average speed went up to about 100MB/s. And using multiple tar/netcat combinations to move specific subdirs, we were able to get an average of 500GB/hr — still not great (~14% of theoretical max), but about 5x better than rsync alone.
Note that you can set up the listener on either side. In this example, I've set up the listener on the receiving side.
In the following example, the receiver is 10.255.78.10; the sender is 10.255.78.2.
First, start the listener waiting on port 12378:
[receive_host] $ nc -l receiver port_# | tar -xzf -
$ nc -l 10.255.78.10 12378 | tar -xzf -
Below, the argument to nc's -s option (sender) is the local interface, given by IP number or hostname, that you want the sender to use. Often a server will have many interfaces, and you will want to use a specific one.
[send_host]: $ tar -czvf - dir_target | nc -s sender receiver port_#
$ tar -czvf - fmri_classic | nc -s 10.255.78.2 10.255.78.10 12378
In this case, I've added the verbose flag (-v) to the tar command on the sender side, so using pv is redundant. The example also uses tar's built-in compression flag (-z) to compress and decompress as it transmits. Depending on the bandwidth available to you and the CPUs of the hosts, compression may actually slow transmission; it's most effective on bandwidth-limited channels.
You could also bundle the two ends together in a script, using ssh to execute the remote command.