Aug 03 2020
 

I noticed a problem with a newly-created freshports daemon script: starting it would sometimes freeze the terminal session.

The rc.d script

The rc.d script was fairly straightforward:

#!/bin/sh
# $FreeBSD$

# PROVIDE: freshports
# REQUIRE: LOGIN cleanvar
# KEYWORD: shutdown

#
# Add the following lines to /etc/rc.conf to enable freshports:
# freshports_enable (bool):	Set to "NO" by default.
#				Set it to "YES" to enable freshports
#

. /etc/rc.subr

name="freshports"
rcvar=${name}_enable

pidfile="/var/run/${name}/${name}.pid"
freshports_user="freshports"
freshports_command="/usr/local/libexec/freshports-service/freshports.sh"
command="/usr/sbin/daemon"

load_rc_config $name

: ${freshports_enable:=NO}
: ${freshports_user:=freshports}
: ${freshports_group:=freshports}
: ${freshports_syslog_facility:=local3}

required_files="/usr/local/etc/freshports/freshports.sh"

start_precmd=freshports_prestart

freshports_prestart()
{
	# create the pid file, or fix its ownership if it already exists
	if [ ! -e ${pidfile} ]; then
		install -o ${freshports_user} -g ${freshports_group} /dev/null ${pidfile};
	else
		chown ${freshports_user}:${freshports_group} ${pidfile};
	fi
}

command_args="-P ${pidfile} -t ${name} -T ${name} -l ${freshports_syslog_facility} ${freshports_command}"

run_rc_command "$1"

Nothing odd there, right?

This first happened a week or so ago. I moved on and decided to look into it later.

But wait, there’s more

This issue came up again today. The system would be processing incoming commits and it would hang. It would get stuck on deleting a file. I’d see this in ps auwwx output:

freshports  8452  0.0  0.0 10980  2336  -  IJ   18:54   0:00.00 /bin/rm /var/db/ingress/message-queues/incoming/2020.08.03.11.07.37.000002.ea0b4a2a7ed172b4618e09b74c3182e035cd6de2.xml

I would check for the file, and it would no longer be on disk. So why is rm hanging?

I checked the code in question, and it looked like this:

${RM} ${file}

Hmmm, that’s the only use of ${RM} in the whole script.

Let’s try rm instead.

Nope. That did not help. And my ssh session is frozen. What’s up with that?

Wait, what about rm -f?

That worked!

Ahh, it’s a permissions issue.

The file is in a directory which is chown ingress:freshports as shown here:

$ ls -ld /var/db/ingress/message-queues/incoming
drwxrwxr-x  2 ingress  freshports  51 Aug  3 19:48 /var/db/ingress/message-queues/incoming

But the file is:

-rw-rw-r--  1 ingress  ingress  720 Aug  3 18:54 /var/db/ingress/message-queues/incoming/2020.08.02.16.45.14.000000.fa3e16b820913309f1078dcefb69084a3ee5564b.xml

The facts:

  1. the script runs as the freshports user
  2. that user does not have write access on the file
  3. that user has write access on the directory
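
Those three facts can be checked from the shell. This is a minimal sketch, assuming a POSIX sh; the function name removal_outlook is hypothetical and not part of the FreshPorts code. It ignores special cases such as sticky directories and symlinks. Run it as the user the daemon runs as, because root passes every write test.

```shell
#!/bin/sh
# Hypothetical helper: predict how a plain `rm FILE` is likely to behave
# for the current user, from write access to the file and its directory.
removal_outlook()
{
	file=$1
	dir=$(dirname "$file")

	if [ ! -w "$dir" ]; then
		# No write access on the directory: the entry cannot be unlinked.
		echo "denied"
	elif [ ! -w "$file" ]; then
		# Directory is writable but the file is not: rm asks for
		# confirmation when stdin is a terminal and -f is not given.
		echo "prompt"
	else
		# Both are writable: rm removes the file without asking.
		echo "clean"
	fi
}
```

Something like removal_outlook /var/db/ingress/message-queues/incoming/some-file.xml (a placeholder name), run as the freshports user, should report "prompt" for a file in the state shown above.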

I know how this came about: it is because of an unusual workflow.

But first, a test, to demonstrate

Create a directory where I have read/write permissions.

[dan@empty:~] $ mkdir testing
[dan@empty:~] $ sudo chown root:dan testing
[dan@empty:~] $ ls -ld testing
drwxr-xr-x  2 root  dan  2 Aug  3 20:48 testing
[dan@empty:~] $ sudo chmod g+w testing
[dan@empty:~] $ ls -ld testing
drwxrwxr-x  2 root  dan  2 Aug  3 20:48 testing

Create a file where I have no write permissions:

[dan@empty:~] $ sudo touch testing/file
[dan@empty:~] $ ls -l testing/file 
-rw-r--r--  1 root  dan  0 Aug  3 20:49 testing/file

Deleting it gets me a prompt, because rm asks for confirmation when the file is not writable and standard input is a terminal:

[dan@empty:~] $ rm testing/file
override rw-r--r-- root/dan uarch for testing/file? n

The same override prompt appears in my logs (see below).

Using -f suppresses the prompt:

[dan@empty:~] $ rm -f testing/file
[dan@empty:~] $ 
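
For a script that runs unattended, that suggests wrapping the removal so it can never stop to ask. A minimal sketch, assuming a POSIX sh; remove_file is a hypothetical name, not something from the FreshPorts scripts:

```shell
#!/bin/sh
# Hypothetical helper for a script that runs with nobody at the keyboard.
# -f: never prompt, and do not treat an already-missing file as an error.
# --: end of options, so a name starting with "-" is taken as a file.
remove_file()
{
	rm -f -- "$1" || {
		echo "could not remove $1" >&2
		return 1
	}
}
```

Note that -f only silences the prompt. The removal still fails, with an error on stderr, if the user has no write access on the directory.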

The logs

In the logs, I found the following. I have removed Aug 3 19:39:22 devgit-ingress01 freshports[51876]: from the start of each line:

'-rw-rw-r--  1 ingress  ingress  720 Aug  3 18:54 /var/db/ingress/message-queues/incoming/2020.08.02.16.45.14.000000.fa3e16b820913309f1078dcefb69084a3ee5564b.xml'
'drwxr-xr-x  2 freshports  freshports  2 Jul 17 17:46 /var/db/freshports/message-queues/incoming/'
'drwxr-xr-x  2 freshports  freshports  566 Aug  3 19:39 /var/db/freshports/message-queues/recent/'
removing /var/db/ingress/message-queues/incoming/2020.08.02.16.45.14.000000.fa3e16b820913309f1078dcefb69084a3ee5564b.xml
override rw-rw-r-- ingress/ingress uarch for /var/db/ingress/message-queues/incoming/2020.08.02.16.45.14.000000.fa3e16b820913309f1078dcefb69084a3ee5564b.xml? removal completed

The unusual workflow

Usually, these messages originate in this directory:

$ ls -ld /var/db/ingress/message-queues/spooling/ 
drwxr-xr-x  2 ingress  freshports  2 Aug  3 18:55 /var/db/ingress/message-queues/spooling/

And are then mv’d to this directory:

$ ls -ld /var/db/ingress/message-queues/incoming/
drwxrwxr-x  2 ingress  freshports  2 Aug  3 20:03 /var/db/ingress/message-queues/incoming/

Over the past week or so I’ve been running tests which were dumping messages into the testing and testing-new directories for comparison purposes. I would process the same git commit into XML using two different versions of the same script.

Those directories were chown ingress:ingress.

The files created in those directories were also chown ingress:ingress.

The directories are now chown ingress:freshports.
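
Changing the group on a directory does not touch the files already in it (unless chown is run recursively), so anything created earlier keeps its old group. Here is a sketch for spotting such leftovers, assuming a POSIX find; list_foreign_group is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: list regular files under DIR whose group is not GROUP.
# GROUP may be a group name or a numeric gid.
list_foreign_group()
{
	dir=$1
	group=$2
	find "$dir" -type f ! -group "$group" -print
}
```

For example, list_foreign_group /var/db/ingress/message-queues/incoming freshports should print nothing once every queued file has the expected group.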

And that’s it

I should have paid closer attention to the file ownership; that would have clued me in earlier to the cause of the problem.

For now, the script will do an rm -f and the directories will be chown ingress:freshports.
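
A broader guard, which I have not applied here, would be to take the terminal away from the commands the script runs, so that any tool which only prompts on a terminal has nothing to wait for. A minimal sketch, assuming a POSIX sh; run_detached is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with stdin attached to /dev/null.
# rm only asks its "override" question when stdin is a terminal, so a
# command started this way proceeds without waiting for an answer.
run_detached()
{
	"$@" </dev/null
}
```

With this, run_detached rm "$file" goes ahead without asking even when the file is not writable, as long as the directory is.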

Thanks for coming to my TED talk.
