Apr 17 2009
 

I was looking at RackMonkey and RackTables today. As part of the latter, I was installing PostgreSQL on my FreeBSD workstation. I failed. This had worked many times before, on many other servers. This was the first time I’d seen this particular situation.

# /usr/local/etc/rc.d/postgresql initdb
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.
 
The database cluster will be initialized with locales
  COLLATE:  C
  CTYPE:    en_US.ISO8859-1
  MESSAGES: en_US.ISO8859-1
  MONETARY: en_US.ISO8859-1
  NUMERIC:  en_US.ISO8859-1
  TIME:     en_US.ISO8859-1
initdb: encoding mismatch
The encoding you selected (UTF8) and the encoding that the
selected locale uses (LATIN1) do not match.  This would lead to
misbehavior in various character string processing functions.
Rerun initdb and either do not specify an encoding explicitly,
or choose a matching combination.
 
 
Using the command line, as the pgsql user:
 
[pgsql@subie ~]$ /usr/local/bin/initdb --encoding=utf-8 --lc-collate=C -D /usr/local/pgsql/data
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.
 
The database cluster will be initialized with locales
  COLLATE:  C
  CTYPE:    en_US.ISO8859-1
  MESSAGES: en_US.ISO8859-1
  MONETARY: en_US.ISO8859-1
  NUMERIC:  en_US.ISO8859-1
  TIME:     en_US.ISO8859-1
initdb: encoding mismatch
The encoding you selected (UTF8) and the encoding that the
selected locale uses (LATIN1) do not match.  This would lead to
misbehavior in various character string processing functions.
Rerun initdb and either do not specify an encoding explicitly,
or choose a matching combination.

Say what? Googling did not help me. To the novice (that’s me), LOCALE and ENCODING are very odd things to read about. Nothing I read helped.

What did help was someone else running the same thing. It ran fine. Both of us were on 7.2-PRERELEASE.

After much looking, we discovered the difference was LANG. On my system:

$ echo $LANG
en_US.ISO8859-1
$

On his system, it had no value. My solution? I commented out the LANG line in /etc/login.conf, rebuilt the login.conf db (via: cap_mkdb /etc/login.conf), and reran the command. All fine. :)

# /usr/local/etc/rc.d/postgresql forceinitdb  
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.                               

The database cluster will be initialized with locale C.
The default text search configuration will be set to "english".

creating directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 40
selecting default shared_buffers/max_fsm_pages ... 24MB/153600
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

Success. You can now start the database server using:

    /usr/local/bin/postgres -D /usr/local/pgsql/data
or
    /usr/local/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
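
For anyone retracing the diagnosis, the comparison between the two machines can be sketched as a small shell function. This is only an illustration: show_locale_env is a made-up name, and the variable list is the standard POSIX set of locale variables, not something taken from the PostgreSQL port.

```shell
# Print the locale-related variables that a command such as initdb would inherit.
# Variables that are unset or empty are reported explicitly, so the output of
# two machines can be compared line by line.
# NOTE: show_locale_env is a hypothetical helper name, not part of any tool.
show_locale_env() {
    for name in LANG LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME; do
        # Indirect lookup: copy the value of the variable named by $name.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is unset or empty\n' "$name"
        fi
    done
}

show_locale_env
```

Running it on both systems and comparing the output should show the same kind of difference that echo $LANG revealed here.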

Prior to reaching the above conclusion, we found another solution which also worked. As the pgsql user, I issued this command.

/usr/local/bin/initdb --encoding=utf-8 --locale=C -D /usr/local/pgsql/data

At this point, I’m not sure what solution is best for the FreeBSD port.
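
A third possibility, which I have not tested against the port, is to clear the locale variables for just the one initdb run instead of editing login.conf. A minimal sketch, assuming a POSIX shell; without_locale is a made-up helper name, and clearing the LC_* variables as well as LANG is my assumption (only LANG was set in my case).

```shell
# Run a single command with the locale variables removed from its environment.
# The unset happens inside a subshell, so the calling shell keeps its settings.
# NOTE: without_locale is a hypothetical helper name, not part of the port.
without_locale() {
    (
        unset LANG LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
        "$@"
    )
}

# Example usage, as the pgsql user (paths as used earlier in this post):
# without_locale /usr/local/bin/initdb --encoding=utf-8 -D /usr/local/pgsql/data
```

The appeal of this approach is that nothing global changes: other logins keep whatever LANG they had.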


  2 Responses to “LANG prevents PostgreSQL initdb”

  1. The relevant documentation section isn’t really obvious. How the server uses LANG is covered in http://www.postgresql.org/docs/8.3/static/locale.html if you didn’t find that one yet, and the appropriate background to make sense of that is at http://www.opengroup.org/onlinepubs/007908775/xbd/locale.html. Other relevant environment variables here include LC_ALL and LC_COLLATE; you should compare those between the server that worked and the one that didn’t.

    Rather than fiddling with the login info to change LANG, an alternate workaround would be to use the appropriate shell convention to get rid of LANG, like “unset LANG”, then run initdb. You could do that as part of a script that runs initdb if you think LANG might be set to something odd.

    There’s nothing wrong with manually specifying the encoding and locale at initdb time, as long as you recognize you won’t be honoring how the OS is set up that way. I think you’re stuck with digging into this a bit more before you can make the right decision for what you should do.

  2. Yes, ~pgsql/login.conf (on a FreeBSD system) would be more practical than the global file. That change has since been undone.

    I think we do need to look more into it, and I’m not sure what is at the bottom.