


Archive for February, 2007

Hetzner down – a success story

February 22nd, 2007 by Peter

For the hosting services of my company, I have the most effective monitoring solution – a human. Some of our customers check their emails every minute, so I get immediate notification if something fails.

At 12:15 today, I was told that none of our machines hosted at Hetzner were reachable. I checked, and IP connectivity from both the WIN and the 1&1 subnets was indeed broken. The Hetzner support hotline could not be reached right away, and the Hetzner homepage was down as well. Obviously, this was not a problem specific to our machines.

15 minutes (!) later, the dedicated Hetzner status page published first information about router problems in their data center. Another 15 minutes later, everything was up again.

Bugs happen, and even the biggest hosters have to deal with them. What I really liked here:

  • A status page hosted on a completely different network.
  • Precise technical information.
  • A more or less fast reaction time.

Hetzner would be even a little bit better if users like me could alert them through some online mechanism.

WS-* specifications overview

February 22nd, 2007 by Peter

You might have noticed that I added some new information on troeger.eu. The most interesting part for you could be the WS-* overview page, which is an additional feature for my Web service specifications tutorial. It provides an automatically generated list of WS-* specifications from W3C. The idea is to have a better overview of the many different specifications available today.

The generator script is written in Python and relies on the common document formatting used by W3C. In theory, the W3C document template allows automated parsing of all specifications. In practice, there are different template versions, and the authors misuse the tags. Therefore, the list is far from perfect, so I would like to know if there is any interest in further improvements to this overview …

Debian + Postfix + Mailman + virtual domains + own “lists” domain

February 19th, 2007 by Peter

Even though there is plenty of information about Mailman with Postfix, it takes some time to find the correct configuration for your environment. In our case, we have a Postfix setup with virtual alias domains and a separate domain for mailing lists (“lists.example.org”). For this special case, the available documentation offers the following choices:

  • Add some magic reg-exp stuff to the different config files, in order to detect emails to the ‘lists’ domain (see here).
  • Use postfix_to_mailman.py as a transport table mechanism, as described in the Postfix Debian README. This is not the best solution, since you then have no virtual alias table entries for your list email addresses. If your mail server rejects unknown recipient addresses, all mail to the lists is rejected.
  • Let Mailman generate the virtual and alias map files, and include them in your Postfix configuration. If you declared Postfix to be your Mailman MTA, then even the Postfix reloading is handled. You need to extend your maps with the corresponding entries:


    alias_maps = hash:/etc/aliases hash:/var/lib/mailman/data/aliases

    virtual_alias_maps = hash:/etc/postfix/virtual hash:/var/lib/mailman/data/virtual-mailman

This solution works great for us. Every time you add or remove a list, Mailman updates its alias files (and the corresponding hash files). Postfix detects this after a short period, without a manual reload. In order to generate the correct virtual-mailman file, and not only the aliases file, you need to tell Mailman your lists domain in /etc/mailman/mm_cfg.py:


MTA='Postfix'
POSTFIX_STYLE_VIRTUAL_DOMAINS = ['lists.example.org']

To generate a first version of the files without changing any of your lists, run /var/lib/mailman/bin/genaliases.
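To give an idea of what ends up in the two generated files, here is a small Python sketch that builds entries in the style Mailman 2.1 writes for a single list. The list name "demo", the suffix list, and the wrapper path are my assumptions for illustration; the files Mailman generates on your own system are the reference, not this sketch.

```python
# Illustrative sketch only: approximates the per-list entries that
# Mailman 2.1 writes to data/aliases and data/virtual-mailman.
# SUFFIXES and WRAPPER are assumptions; compare with your generated files.
SUFFIXES = ["", "-admin", "-bounces", "-confirm", "-join", "-leave",
            "-owner", "-request", "-subscribe", "-unsubscribe"]
WRAPPER = "/var/lib/mailman/mail/mailman"  # assumed Debian wrapper path


def alias_entries(listname):
    """Return alias-file style lines that pipe each address to Mailman."""
    lines = []
    for suffix in SUFFIXES:
        # The plain list address is handled by the "post" command.
        command = suffix.lstrip("-") or "post"
        lines.append(f'{listname}{suffix}: "|{WRAPPER} {command} {listname}"')
    return lines


def virtual_entries(listname, domain):
    """Return virtual-map style lines mapping list addresses to local aliases."""
    lines = []
    for suffix in SUFFIXES:
        local = listname + suffix
        lines.append(f"{local}@{domain} {local}")
    return lines


def render(listname, domain):
    """Return both blocks as one printable string."""
    return "\n".join(alias_entries(listname) + [""] + virtual_entries(listname, domain))
```

Calling print(render("demo", "lists.example.org")) shows why both maps are needed: the virtual map makes the addresses in the lists domain known to Postfix, and the alias map hands the resulting local addresses over to Mailman.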

The future of computer science education

February 7th, 2007 by Peter

While surfing the web, I found some great thoughts about the current status of computer science education in general:

http://www.bcs.org/server.php?show=ConWebDoc.9662

It more or less correlates with the idea of the institute I am working at, meaning that you need interdisciplinary IT professionals with an open mind and a close relation to the users, instead of algorithm experts with perfect programming skills and no overall system understanding.

Monitoring your daemons

February 6th, 2007 by Peter

One of the common problems in Unix administration is the undetected crash of some system daemon in your mail or web processing chain. On our system, ClamAV is usually a good candidate for startup problems after an automated virus database refresh.

Lately, after a crash of the clamav-milter daemon, Postfix rejected all mails with a temporary failure, since this is the default behavior for non-working milter plugins.

We looked for a monitoring solution, and besides Mon with its tricky configuration syntax, I found monit in Debian. There are several good configuration examples, and I was able to copy and paste a config file for the most important daemons in just a minute:

[source:C]
set daemon 60
set logfile syslog facility log_daemon
set mailserver some.other.mailhost,
    localhost

set eventqueue
    basedir /var/monit
    slots 100

set mail-format { from: root@monitored.host }

set alert peter@some.other.mailhost

check system monitored.host
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert

check process postfix with pidfile /var/spool/postfix/pid/master.pid
    group mail
    start program = "/etc/init.d/postfix start"
    stop program = "/etc/init.d/postfix stop"
    if failed port 25 protocol smtp then restart
    if 5 restarts within 5 cycles then timeout

check process courier-imap with pidfile /var/run/courier/imapd.pid
    start program = "/etc/init.d/courier-imap start"
    stop program = "/etc/init.d/courier-imap stop"
    if 5 restarts within 5 cycles then timeout
    if failed port 143 type TCP protocol IMAP then restart

check process courier-imap-ssl with pidfile /var/run/courier/imapd-ssl.pid
    start program = "/etc/init.d/courier-imap-ssl start"
    stop program = "/etc/init.d/courier-imap-ssl stop"
    if 5 restarts within 5 cycles then timeout
    if failed port 993 type TCPSSL protocol IMAP then restart

check process courier-pop with pidfile /var/run/courier/pop3d.pid
    start program = "/etc/init.d/courier-pop start"
    stop program = "/etc/init.d/courier-pop stop"
    if 5 restarts within 5 cycles then timeout
    if failed port 110 type TCP protocol POP then restart

check process courier-pop-ssl with pidfile /var/run/courier/pop3d-ssl.pid
    start program = "/etc/init.d/courier-pop-ssl start"
    stop program = "/etc/init.d/courier-pop-ssl stop"
    if 5 restarts within 5 cycles then timeout
    if failed port 995 type TCPSSL protocol POP then restart

check process spamd with pidfile /var/run/spamd.pid
    start program = "/etc/init.d/spamassassin start"
    stop program = "/etc/init.d/spamassassin stop"
[/source]
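For readers who have not used monit: a line like "if failed port 25 protocol smtp then restart" boils down to connecting to the port and looking at the server greeting. The following Python sketch is only my rough approximation of that idea, not monit's actual implementation; the function names, the default host, and the timeout are made up for illustration.

```python
import socket


def banner_ok(banner: bytes) -> bool:
    """An SMTP server that is ready greets with reply code 220."""
    return banner.startswith(b"220")


def smtp_alive(host: str = "localhost", port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if an SMTP greeting arrives on host:port within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
            conn.sendall(b"QUIT\r\n")  # leave politely
            return banner_ok(banner)
    except OSError:
        # Connection refused, timeout, unreachable host: treat all as "failed".
        return False
```

monit runs this kind of test once per cycle (every 60 seconds with the "set daemon 60" line) for each check process statement in the config file, and restarts the daemon when the test fails.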

But for the crashed clamav-milter, it was not possible to add such a check process statement ;-( The reason?

Monit needs a valid PID file in any case. Both freshclam and clamav-milter on my system create such a file, but their init scripts use /lib/lsb/init-functions for the daemon start procedure (“start_daemon”) instead of the usual start-stop-daemon tool in Debian. The LSB library leads to a PID file with a dash before the PID number (“-4711”), which Monit does not understand. So far, I am not sure whether the init-functions library has a bug, or whether Monit should be able to handle such a PID file …
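Just to illustrate what a more tolerant reader of such a PID file could look like, here is a minimal Python sketch. It is not part of Monit or of the LSB scripts; the function names are hypothetical, and dropping the leading dash is my own assumption about what the file is meant to say.

```python
import os


def read_pid(text: str) -> int:
    """Parse PID-file content, tolerating a leading dash such as '-4711'."""
    # Assumption: the dash is an artifact and the digits are the real PID.
    pid = int(text.strip().lstrip("-"))  # raises ValueError on garbage
    if pid <= 0:
        raise ValueError("not a valid PID: %r" % text)
    return pid


def is_running(pid: int) -> bool:
    """Check whether a process with this PID exists, without disturbing it."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

With this, read_pid("-4711") and read_pid("4711") give the same number, so a wrapper script around the clamav-milter PID file could rewrite it into a form that Monit accepts.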
