I have two ZFS servers, each with several TB of space. One of the great ZFS features is the ability to send a filesystem to another pool, on the same server or on a different one; I will do this over ssh. One of my servers has a lot of spare space, so I figured I would duplicate my backups there.
The source
This server runs a Bacula Storage Daemon with access to about 27TB, of which about 9TB is free.
[dan@knew:~] $ zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
system    27T  17.8T  9.21T    65%  1.00x  ONLINE  -
Take a snapshot
zfs send operates on snapshots: you take a snapshot of a filesystem, and that snapshot is what gets sent. With this command, I take a snapshot of the filesystem I wish to send:
[dan@knew:~] $ sudo zfs snapshot system/usr/local/bacula@SlocumSendInit
Password:
[dan@knew:~] $
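To confirm the snapshot exists before sending, you can list it. The output below is illustrative, not captured from knew:

[dan@knew:~] $ zfs list -t snapshot -r system/usr/local/bacula
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
system/usr/local/bacula@SlocumSendInit      0      -  4.15T  -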
The destination
This is the server which will receive the data from knew:
[dan@slocum:~] $ zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
system  16.2T  1.94T  14.3T    11%  1.00x  ONLINE  -
[dan@slocum:~] $
I create a new filesystem to receive this data:
[dan@slocum:~] $ sudo zfs create system/backups
[dan@slocum:~] $ sudo zfs create system/backups/knew
[dan@slocum:~] $
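A quick zfs list confirms the new, empty filesystems. This is a sketch of what you would expect to see, not a capture from slocum, and the mountpoints assume the pool's defaults:

[dan@slocum:~] $ zfs list -r system/backups
NAME                  USED  AVAIL  REFER  MOUNTPOINT
system/backups         62K  14.0T    31K  /system/backups
system/backups/knew    31K  14.0T    31K  /system/backups/knew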
I give my login permission to receive a ZFS filesystem:
[dan@slocum:~] $ sudo zfs allow -u dan create,receive,rename,mount,share,send system/backups/knew
[dan@slocum:~] $
I think that set of permissions may be trimmed down and still work, but I didn’t want to experiment.
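Running zfs allow with just the dataset name displays what was delegated; the output shown here is illustrative:

[dan@slocum:~] $ zfs allow system/backups/knew
---- Permissions on system/backups/knew ----------------------
Local+Descendent permissions:
        user dan create,mount,receive,rename,send,share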
Sending
Before running the send, ssh to the receiver from the sender, so you can take care of any messages such as this:
The authenticity of host 'slocum.unixathome.org (2001:470:1f07:9ea:2::1)' can't be established.
ECDSA key fingerprint is d5:ce:76:f8:b9:f6:32:f5:c5:fa:71:0e:8f:44:9f:f5.
Are you sure you want to continue connecting (yes/no)?
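Note that the send below runs as root on knew, and known_hosts is per-user, so it is root's ~/.ssh/known_hosts that needs the entry; do the test connection as root as well. A one-off command that accepts the key and immediately exits is enough:

[root@knew:/usr/home/dan] # ssh dan@slocum exit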
Then, on the sender, issue this command:
[root@knew:/usr/home/dan] # zfs send system/usr/local/bacula@SlocumSendInit | mbuffer -s 128k -m 1G 2>/dev/null | ssh dan@slocum 'mbuffer -s 128k -m 1G | zfs receive system/backups/knew/bacula'
in @ 41.2 MiB/s, out @ 41.2 MiB/s, 4246 GiB total, buffer 0% full
Why use mbuffer? It smooths out the transmission of data, buffering the bursts in which zfs send and zfs receive produce and consume it. The 128k block size is chosen because it matches the default ZFS recordsize, and each mbuffer gets a 1GB buffer; that is why mbuffer runs on both sides of the ssh pipe. I’m also guessing I could choose not to encrypt the ssh session at all, since it is all on my home LAN, or perhaps pick a faster cipher.
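One way to try a faster cipher is to name it on the ssh command line with -c, provided the server allows it. This is a sketch, not a command I ran here; on a modern OpenSSH, substitute a cipher listed by ssh -Q cipher, since arcfour has been removed:

[root@knew:/usr/home/dan] # zfs send system/usr/local/bacula@SlocumSendInit | \
    mbuffer -s 128k -m 1G 2>/dev/null | \
    ssh -c arcfour dan@slocum 'mbuffer -s 128k -m 1G | zfs receive system/backups/knew/bacula'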
Load
Here’s an idea of the load imposed:
last pid:  4924;  load averages:  2.91,  3.31,  3.59    up 3+07:27:04  22:26:28
217 processes: 3 running, 214 sleeping
CPU: 20.2% user,  0.0% nice, 12.7% system,  4.2% interrupt, 62.9% idle
Mem: 1668M Active, 615M Inact, 7767M Wired, 22M Cache, 21G Free
ARC: 5280M Total, 258M MFU, 4785M MRU, 32M Anon, 52M Header, 153M Other
Swap: 8192M Total, 8192M Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
12396 root         1 102    0 38388K  5168K CPU3    3  25.2H 100.00% ssh
73828 10837        1 100    0   108M 69880K CPU2    2  30:51  88.57% postgres
12394 root         1  22    0 37624K  4304K pipewr  7  99:08   5.86% zfs
12395 root         3  21    0  1040M  1028M usem    4  73:21   3.47% mbuffer
 3945 7080         1  21    0   180M 24296K zio->i  0   0:05   2.20% postgres
 3944 dan          2  30   10 56792K  7576K nanslp  1   0:01   0.10% bscan
 4923 dan          1  20    0 16596K  2880K CPU1    1   0:00   0.10% top
It’s been running for just over a day, which is consistent with 4246 GiB transferred at roughly 41 MiB/s. Note that ssh is the busiest process in the output above, pinning a core at 100%: the encryption, not the disks or the network, is the bottleneck here.
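Although this post covers only the initial copy, the @SlocumSendInit name hints at the follow-up: once a full copy exists on slocum, later changes can be sent incrementally with zfs send -i, transferring only the blocks that changed between two snapshots. A sketch, with the second snapshot name @SlocumSend2 being hypothetical; if the destination filesystem has been touched since the first receive, zfs receive may need -F to roll it back first:

[dan@knew:~] $ sudo zfs snapshot system/usr/local/bacula@SlocumSend2
[root@knew:/usr/home/dan] # zfs send -i system/usr/local/bacula@SlocumSendInit system/usr/local/bacula@SlocumSend2 | \
    ssh dan@slocum 'zfs receive system/backups/knew/bacula'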
To decrease the toll imposed by ssh on your CPU, you can use a much more efficient cipher like arcfour. I managed to cut the CPU usage of ssh by a factor of four, removing that bottleneck during zfs send/recv.
That sounds very useful. Thank you. What are the downsides to doing this? I’m guessing it must be done in /etc/ssh/sshd_config
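For what it’s worth: the cipher is requested by the client (ssh -c arcfour, or a Ciphers line in ~/.ssh/config), but sshd will only accept it if its own Ciphers list includes it, so /etc/ssh/sshd_config on the receiver may need a line like the sketch below. The main downside is security: arcfour is RC4, which is cryptographically weak and has since been removed from OpenSSH entirely, so this trade-off only makes sense on a trusted LAN.

# /etc/ssh/sshd_config on slocum (sketch; this cipher list is an assumption)
Ciphers arcfour,aes128-ctr,aes192-ctr,aes256-ctr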