Is deleting empty snapshots faster?

During the 2025-01-22 OpenZFS Production User Call, ‘atomic operations’ came up with respect to my previous blog post, with the suggestion that what I saw might be expected.

In this post:

FreeBSD 14.1
r730-03

Let’s do a test

There was also speculation about empty snapshots during the call, so I did a test with 3000 snapshots.

First, I create a filesystem for this testing:

[21:29 r730-03 dvl ~] % sudo zfs create data01/snapshots/deleting

Then, use jot to create 3000 snapshots:

[21:32 r730-03 dvl ~] % jot 3000 | xargs -I % -n 1 echo sudo zfs snapshot data01/snapshots/deleting@%  | head        
sudo zfs snapshot data01/snapshots/deleting@1
sudo zfs snapshot data01/snapshots/deleting@2
sudo zfs snapshot data01/snapshots/deleting@3
sudo zfs snapshot data01/snapshots/deleting@4
sudo zfs snapshot data01/snapshots/deleting@5
sudo zfs snapshot data01/snapshots/deleting@6
sudo zfs snapshot data01/snapshots/deleting@7
sudo zfs snapshot data01/snapshots/deleting@8
sudo zfs snapshot data01/snapshots/deleting@9
sudo zfs snapshot data01/snapshots/deleting@10
xargs: echo: terminated with signal 13; aborting
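
The "terminated with signal 13" line is harmless. Signal 13 is SIGPIPE: head exited after ten lines and closed the pipe while xargs was still writing to it. A minimal sketch of a quieter preview follows. It assumes seq(1) as a stand-in for jot, and preview_snapshot_commands is a hypothetical helper name, not part of ZFS. It generates only as many numbers as you want to preview, so nothing has to be cut off with head:

```shell
#!/bin/sh
# preview_snapshot_commands is a hypothetical helper, not a ZFS command.
# It only echoes the commands that would be run; no snapshot is created.
preview_snapshot_commands() {
    dataset=$1   # e.g. data01/snapshots/deleting
    count=$2     # how many commands to preview
    # Generate just the numbers we want, then let xargs build one command per number.
    seq 1 "$count" | xargs -I % echo sudo zfs snapshot "$dataset@%"
}

preview_snapshot_commands data01/snapshots/deleting 3
```

Dropping the echo turns the preview into the real run, which is what the next command does.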

[21:32 r730-03 dvl ~] % jot 3000 | xargs -I % -n 1 sudo zfs snapshot data01/snapshots/deleting@%
[21:38 r730-03 dvl ~] % zfs list -r -t snapshot data01/snapshots/deleting | wc -l
    3001

That creation took 6 minutes.
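
The count of 3001 is the 3000 snapshots plus the header line that zfs list prints. A small sketch of a count without the header, assuming a hypothetical helper named count_snapshots: -H suppresses the header and -o name prints only the name column.

```shell
#!/bin/sh
# count_snapshots is a hypothetical helper name.
# -H          : scripted mode, no header line
# -o name     : print only the snapshot name
# -t snapshot : list snapshots only
# -r          : include descendants of the given dataset
count_snapshots() {
    zfs list -H -o name -t snapshot -r "$1" | awk 'END { print NR }'
}

# Usage (needs a real pool, so it is left as a comment):
# count_snapshots data01/snapshots/deleting
```

awk is used instead of wc -l only to avoid the leading spaces that wc prints on FreeBSD.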

Let’s delete.

[21:41 r730-03 dvl ~] % time sudo zfs destroy data01/snapshots/deleting@1%3000
sudo zfs destroy data01/snapshots/deleting@1%3000  0.01s user 0.01s system 0% cpu 39.270 total
[21:43 r730-03 dvl ~] %

40 seconds to destroy. That’s impressive.
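
The @1%3000 argument is a snapshot range: the percent sign separates the first and last snapshot, and everything created between them (inclusive) goes in a single destroy. A hedged sketch of building such a range and previewing it first follows. snapshot_range is a hypothetical helper name; -n (dry run) and -v (verbose) are existing zfs destroy flags.

```shell
#!/bin/sh
# snapshot_range is a hypothetical helper: it prints dataset@first%last.
snapshot_range() {
    # %% in the printf format emits a literal percent sign.
    printf '%s@%s%%%s\n' "$1" "$2" "$3"
}

range=$(snapshot_range data01/snapshots/deleting 1 3000)
echo "$range"

# Dry run: report what would be destroyed without destroying anything.
# Left as a comment because it needs a real pool.
# sudo zfs destroy -nv "$range"
```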

Next, more.

Let’s try 60,000 empty snapshots

For my next trick, let’s create 60,000 snapshots:

[21:43 r730-03 dvl ~] % zfs list -r -t snapshot data01/snapshots/deleting | wc -l
no datasets available
       0
[21:45 r730-03 dvl ~] % jot 60000 | xargs -I % -n 1 sudo zfs snapshot data01/snapshots/deleting@%           
[4:56 r730-03 dvl ~] %

So that took about 7 hours to create. Wow. It ran overnight. It is now the 23rd.
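
Going by the prompt timestamps, creation did not scale linearly. The sketch below does the arithmetic. The elapsed times are approximations read from the minute-resolution prompts (21:32 to 21:38 for the first run, 21:45 to 4:56 for the second), and per_snapshot_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# per_snapshot_seconds TOTAL_SECONDS COUNT
# Prints the average number of seconds spent per snapshot.
per_snapshot_seconds() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.3f\n", s / n }'
}

per_snapshot_seconds 360 3000      # about 6 minutes for 3,000 snapshots
per_snapshot_seconds 25860 60000   # about 7 h 11 min for 60,000 snapshots
```

That prints 0.120 and 0.431, so each snapshot took roughly three and a half times longer in the larger run. Treat the ratio as rough, given how coarse the timestamps are.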

How long does it take to list them?

[12:43 r730-03 dvl ~] % time zfs list -r -t snapshot data01/snapshots/deleting > ~/tmp/deleting
zfs list -r -t snapshot data01/snapshots/deleting > ~/tmp/deleting  2.54s user 48.47s system 99% cpu 51.042 total

50 seconds. That’s OK.

60,000 deletes starting on the 23rd

I started the delete. Actually, it’s not 60,000 deletes. It’s one destroy, of 60,000 snapshots.

[12:52 r730-03 dvl ~] % time sudo zfs destroy data01/snapshots/deleting@1%60000

After starting the above command, I started btop and ran zfs list several times. Eventually, the zfs list command hung (it did not come back to the command line).

I stopped btop and tried running it again. It did not start, and it did not come back to the command line either.

4 hours later

It’s been running about 4 hours now.

At present, I cannot ssh to the host:

[11:44 pro02 dan ~] % r730-03
kex_exchange_identification: read: Connection reset by peer
Connection reset by 10.55.0.143 port 22

I have tried to ssh into various jails on that host: same result.

There are plenty of Nagios notifications:

swap issues

Connecting to the console, I see lots of swap related messages.

The console is scrolling, so the system is still alive. I’m going to leave it for a bit longer.

NOTE: the zfs destroy command is not responding to CTRL-T.

23:39

The zfs destroy started at about 12:52. It’s now almost 11 hours later…

The host is responding to pings:

[18:38 pro04 dvl ~] % ping r730-03                                              
PING r730-03.int.unixathome.org (10.55.0.143): 56 data bytes
64 bytes from 10.55.0.143: icmp_seq=0 ttl=63 time=5.167 ms
64 bytes from 10.55.0.143: icmp_seq=1 ttl=63 time=7.554 ms
^C
--- r730-03.int.unixathome.org ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 5.167/6.361/7.554/1.193 ms
[18:41 pro04 dvl ~] %

Still no ssh response.

I’m headed out for dinner, so we’ll check back later.

13:30 – the next day

This morning, all the existing ssh sessions have been disconnected. The host is no longer responding to pings. Attempts to ssh time out. Samba mounts have disconnected.

[12:52 r730-03 dvl ~] % time sudo zfs destroy data01/snapshots/deleting@1%60000
client_loop: send disconnect: Broken pipe
[22:59 pro02 dan ~] % ping r730-03  
PING r730-03.int.unixathome.org (10.55.0.143): 56 data bytes
Request timeout for icmp_seq 0
^C
--- r730-03.int.unixathome.org ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
[8:30 pro02 dan ~] % 



[12:21 pro02 dan ~] % r730-03

ssh: connect to host r730-03.int.unixathome.org port 22: Operation timed out
[8:31 pro02 dan ~] %

The console is still scrolling swap_pager messages, for example: swap_pager: indefinite wait buffer: bufobj 0: blkno: 7301: size: 12288

None of the overnight backups succeeded (this host is the destination).