Where’s my backup?

Nagios was telling me:

FILE_AGE WARNING: /usr/home/rsyncer/backups/aws-1/postgresql/freshports.org.dump is 117636 seconds old and 3608113799 bytes 

That means the FreshPorts backup is more than a day old, and it should have been refreshed by now.

OK, let’s go look.

I log into the host known as aws-1 and check the files. They look fresh to me:

[rsyncer@aws-1 ~/backups/database-backup/postgresql]$ ls -lt
total 3525154
-rw-r--r--  1 rsyncer rsyncer       3963 Jul 16 02:14 globals.sql
drwxr-xr-x  2 rsyncer rsyncer          4 Jul 16 02:14 archive
-rw-r--r--  1 rsyncer rsyncer 3610129659 Jul 16 02:14 freshports.org.dump

I don’t see any rsync in progress (I checked using ps auwwx | grep rysnc).

Let’s try this manually

Looking at my backup script, the last thing I see is:

/usr/bin/ssh -i ${HOME}/.ssh/id_ed25519.dump       dbclone.int.unixathome.org rsync-backup-from-aws-1-dump.sh

That’s the rsyncer user telling the dbclone to run the rsync-backup-from-aws-1-dump.sh script. More on that later.

Let’s try this manually:

[rsyncer@aws-1 ~]$ /usr/bin/ssh -i ${HOME}/.ssh/id_ed25519.dump       dbclone.int.unixathome.org rsync-backup-from-aws-1-dump.sh
bash: line 1: /usr/local/sbin/rrsync: cannot execute: required file not found
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: remote command not found (code 127) at io.c(231) [Receiver=3.3.0]

What?

Let’s go to dbclone and check ~/.ssh/authorized_keys:

from="aws-1.startpoint.vpn.unixathome.org,192.0.2.130",command="/home/rsyncer/bin/rsync-backup-from-aws-1-dump.sh"       ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOuHMg9eBDIM3UXRzMW861DEfvXfwnZvRMC6LCz3dCcS rsyncer@aws-1.int.unixathome.org

This says:

  • When the connection is from aws-1.startpoint.vpn.unixathome.org,192.0.2.130…
  • invoke the script /home/rsyncer/bin/rsync-backup-from-aws-1-dump.sh
  • And allow this to happen only if the key they are using is the one with that hash shown

This tightly restricts the incoming ssh connection to one thing, and one thing only.

The script

Let’s look at the script:

[rsyncer@dbclone ~]$ cat  /home/rsyncer/bin/rsync-backup-from-aws-1-dump.sh
#!/bin/sh
#
# This file does a backup of each database on the server.
# It relies upon a file on the server to do the actual backup,
# then uses rsync to copy the files from the server to here.
#

# the ~/.ssh/authorized keys entry on the server must look like this:
#
# from="192.0.2.140",command="/usr/local/sbin/rrsync /usr/home/backups/" [ssh key goes here]
#
# invoking rrsync ensures the incoming rsync process is chrooted to /usr/home/dan/backups/
# Thus, BACKUPDIR is relative to that location.

BACKUPDIR=${HOME}/backups/aws-1

IDENTITY_FILE_RSYNC=${HOME}/.ssh/id_ed25519
SERVER_TO_RSYNC=aws-1.vpn.unixathome.org

cd ${BACKUPDIR}

#
# use rsync to get local copies
#
/usr/local/bin/rsync -e "/usr/bin/ssh -i ${IDENTITY_FILE_RSYNC}" --recursive -av --stats --progress --exclude 'archive' ${SERVER_TO_RSYNC}:/ ${BACKUPDIR}

This script invokes rsync over ssh, effectively pulling it from aws-1 to dbclone.

To recap

So far:

  • The rsyncer user on aws-1 run pg_dump
  • rsyncer invokes ssh to dbclone
  • On dbclone, an rsync is started, pulling the file to dbclone
  • Yes, this is circular, but there is sound logic behind it. More, later.

So far, so good.

Running the command

Let’s try running the script and see what happens here.

Here, I manually invoke each line of the script to see what’s going on:

[rsyncer@dbclone ~]$ /bin/sh
$ BACKUPDIR=${HOME}/backups/aws-1
$ 
$ IDENTITY_FILE_RSYNC=${HOME}/.ssh/id_ed25519
$ SERVER_TO_RSYNC=aws-1.vpn.unixathome.org
$ 
$ cd ${BACKUPDIR}
$ /usr/local/bin/rsync -e "/usr/bin/ssh -i ${IDENTITY_FILE_RSYNC}" --recursive -av --stats --progress --exclude 'archive' ${SERVER_TO_RSYNC}:/ ${BACKUPDIR}
bash: line 1: /usr/local/sbin/rrsync: cannot execute: required file not found
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(231) [Receiver=3.3.0]

Oh. WTF?

Now let’s look at the command being run back on aws-1.

The aws-1 command

Back on aws-1, we have this ~/.ssh/authorized_keys entry:

from="192.0.2,191.0.2.140",command="/usr/local/sbin/rrsync /home/rsyncer/backups/database-backup"  ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILcVvbs7FiGm1rmo5OxroXDlY7cytxh4Su8ZYz2u/TB2 rsyncer@dbclone.int.unixathome.org

Why the IP address twice? This host cannot resolve that IP address to a hostname. Although I see now there is an entry in /etc/hosts for that IP address.

This authorized_keys entry is similar to what we saw earlier.

By now, and I should have suspected this earlier, I’m looking at /usr/local/sbin/rrsync. Let me see:

$ ls -l /usr/local/sbin/rrsync
-r-xr-xr-x  1 root wheel 12865 Apr 28 04:18 /usr/local/sbin/rrsync

Yeah, it’s there. Looking:

[rsyncer@aws-1 ~]$ file /usr/local/sbin/rrsync
/usr/local/sbin/rrsync: Python script, ASCII text executable
[rsyncer@aws-1 ~]$ head /usr/local/sbin/rrsync
#!/usr/local/bin/python3.9

# Restricts rsync to subdirectory declared in .ssh/authorized_keys.  See
# the rrsync man page for details of how to make use of this script.

# NOTE: install python3 braceexpand to support brace expansion in the args!

# Originally a perl script by: Joe Smith  30-Sep-2004
# Python version by: Wayne Davison 

[rsyncer@aws-1 ~]$ 

Yeah, well, it’s python.

And I just updated Python on this host

I recently moved from Python 3.9 to Python 3.11 on this host:

[rsyncer@aws-1 ~]$ ls -l /usr/local/bin/python3.9
ls: /usr/local/bin/python3.9: No such file or directory
[rsyncer@aws-1 ~]$ pkg info -x python
py311-python-dateutil-2.9.0
python-3.11_3,2
python3-3_4
python311-3.11.9_1

Will reinstall bring in the right header?

[12:43 aws-1 dan ~] % sudo pkg install -f rsync
Updating local repository catalogue...
Fetching meta.conf: 100%    178 B   0.2kB/s    00:01    
Fetching data.pkg: 100%  305 KiB 312.7kB/s    00:01    
Processing entries: 100%
local repository update completed. 1123 packages processed.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

Installed packages to be REINSTALLED:
	rsync-3.3.0

Number of packages to be reinstalled: 1

385 KiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching rsync-3.3.0.pkg: 100%  385 KiB 394.2kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Reinstalling rsync-3.3.0...
[1/1] Extracting rsync-3.3.0: 100%
[12:43 aws-1 dan ~] % head 1 head /usr/local/sbin/rrsync
head: 1: No such file or directory
head: head: No such file or directory
==> /usr/local/sbin/rrsync <==
#!/usr/local/bin/python3.11

# Restricts rsync to subdirectory declared in .ssh/authorized_keys.  See
# the rrsync man page for details of how to make use of this script.

# NOTE: install python3 braceexpand to support brace expansion in the args!

# Originally a perl script by: Joe Smith  30-Sep-2004
# Python version by: Wayne Davison 

[12:44 aws-1 dan ~] % 

Success!

Run it!

Here we go!

[rsyncer@aws-1 ~/bin]$ /usr/bin/ssh -i ${HOME}/.ssh/id_ed25519.dump       dbclone.int.unixathome.org rsync-backup-from-aws-1-dump.sh
receiving incremental file list
postgresql/

That’s still in progress.

Talk about a better shebang

So why not use /usr/local/bin/python instead of /usr/local/bin/python3.9

Good question.

That fix is handled by shebangfix and this line:


SHEBANG_FILES= support/rrsync

… in net/rsync/Makefile

I’m not sure why it works the way it does.

Why ssh, then rsync

You might ask:

Why do the rsync from the destination (pull) and not from the source (push).

This is to protect the destination, at home, in my basement. If the source gets compromised, all they can do is run another rsync.

So why not run the rsync from the destination anyway? Why invoke it from the source?

Timing.

I want the rsync run when the backup is done. The simple way to know when it’s done is to let the remote host run it right after the dump.

It’s done

While typing the above, it finished:

[rsyncer@aws-1 ~/bin]$ /usr/bin/ssh -i ${HOME}/.ssh/id_ed25519.dump       dbclone.int.unixathome.org rsync-backup-from-aws-1-dump.sh
receiving incremental file list
postgresql/
postgresql/freshports.org.dump
  3,610,129,659 100%   21.45MB/s    0:02:40 (xfr#1, to-chk=1/4)
postgresql/globals.sql
          3,963 100%    4.88kB/s    0:00:00 (xfr#2, to-chk=0/4)

Number of files: 4 (reg: 2, dir: 2)
Number of created files: 0
Number of deleted files: 0
Number of regular files transferred: 2
Total file size: 3,610,133,622 bytes
Total transferred file size: 3,610,133,622 bytes
Literal data: 349,726,356 bytes
Matched data: 3,260,407,266 bytes
File list size: 141
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 480,719
Total bytes received: 350,029,118

sent 480,719 bytes  received 350,029,118 bytes  2,080,177.07 bytes/sec
total size is 3,610,133,622  speedup is 10.30
[rsyncer@aws-1 ~/bin]$ 

More importantly, Nagios is also happy: ** RECOVERY alert – dbclone/check_backup_files PRIMARY is OK **

Thank you for coming to my TED talk.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

Leave a Comment

Scroll to Top