Working with qmail's log files

General
- ‣Select stylesheet...
- Home
- Mailing Lists
- ‣Other qmail sites
- ‣Program home pages
  - daemontools
  - ucspi-tcp
  - djbdns
  - cdb
  - qmail
  - vpopmail
  - ClamAV
  - SpamAssassin
  - maildrop
  - simscan
  - ucspi-ssl
  - dovecot
  - courier-authlib
  - courier-imap
  - autorespond
  - ezmlm-idx
  - qmailadmin
  - SquirrelMail
- List of qmail Consultants
- Emailing me directly
- Privacy policy
- Copyright
Information
- ‣qmailrocks
  - Upgrading from...
  - My thoughts about...
- ‣HOWTO
- ‣Understand...
- ‣ClamAV
- ‣Courier-IMAP and authlib
- ‣daemontools
- ‣djbdns
- dovecot
- DSPAM
- simscan
- ‣Squirrelmail
  - Don't use SMTP
- ucspi-tcp
- ‣vpopmail
- PDF: Anatomy of a qmail system
Patches
- ‣Combined Patch
  - Details
  - Download list
- Domainkeys add-on
- Individual Patches
- validrcptto.cdb patch
- AUTH_SET patch
- AUTH_CDB patch
Scripts

http:// qmail.jms1.net / logfiles.shtml

A lot of people ask questions about how to "track" emails as they travel through a qmail system, or they want a list of every message sent through the server (who it was from and who it was sent to), or just want counts of how many messages were processed. All of this information is sitting there in the log files, if you know how to read them.

Where are the log files?

This is actually the biggest problem for most people. There is no single standard location for log files, because there is no standard way to run qmail. I will try to describe how to find the log files on your own system, but most of the work is up to you.

If you are running qmail under daemontools (i.e. you have a directory like /service/qmail/ or /service/qmail-send/ and you control qmail by editing a "run" script in this directory) then your service directory (again, your /service/whatever directory) will contain a directory called "log", which will also contain a "run" script. By reading this "log/run" script you can tell where the log files are being sent.

If you are running qmail under some other scheme (such as the "sysvinit" scripts used by Redhat, Solaris, and many other distributions) you should find the script which is responsible for starting qmail-send. (This will probably be in the /etc/init.d directory if you are running "sysvinit" scripts.) This script may have its own commands, or it may just run one of the scripts in /var/qmail/boot.

In any event, once you find the script which actually starts qmail, examine the command line. You will see something like this:

exec env - PATH="/var/qmail/bin:$PATH" \
qmail-start ./Maildir/ splogger qmail

The first term after "qmail-start" is the default delivery location- usually a Maildir within the user's home directory. The next term (which is optional) tells how to handle the logs. qmail-send sends its log messages to its "standard out" channel. The daemontools "multilog" program catches this output and writes it to a "current" file in a given directory, while the "splogger" program catches this output and sends it to the syslog service.

If there is no term which tells where the logs should be sent, the logs will simply "fall through" to whatever is handling the standard out channel for the service. If there is no log service for a given service, any log entries will "fall through" to a program called readproctitle, which adds the messages to a string which is visible on its command line with the ps command. In some cases if a service directory has been damaged to the point where a log service does not exist or cannot start, this is another place to look for log output.

When using the splogger program, you can specify a "tag" to be prepended to each log line. In the example above, this tag is "qmail". You could also specify a numeric "facility" code- the default is 2, which means "mail".

To figure out where the log lines are going, you should check your /etc/syslog.conf file to find out where "mail" messages are being sent. In many cases, they will be going to a file such as "/var/log/mail", but don't blindly trust that- your /etc/syslog.conf will tell you exactly where to look.

If you are in a position to be able to choose how to handle the logs for a new or existing server, you should never use syslog. The syslog mechanism is not reliable- under heavy loads it will drop log entries. It's also very slow. The multilog program does not suffer from either of these problems.

What log files are available?

A qmail system is made up of several different programs, all running at the same time and doing their own little part of the overall "mail server" job. Each of these programs generally has its own log file, although if you are using syslog, the logs may be combined together. By understanding what each program does, you can easily tell which log file to look at when you need to check something.

The qmail-send program (which normally runs as a daemontools service called "qmail-send", or maybe just "qmail") manages the queue, and starts all delivery processes. The delivery programs run as children of qmail-queue, and therefore their output is contained with the qmail-queue output. If you have a problem with messages being "stuck" in the queue, this log file should be the place to look.

The qmail-smtpd program handles incoming SMTP traffic. If you have a problem with messages not being properly accepted or rejected from other machines, this log file should be the first place to look. Note that if you have multiple SMTP services on the machine (for example, a standard SMTP service on port 25, an SSL-enabled service on port 465, and an AUTH-only service on port 587) each service will have its own log file (again, unless they are being combined by the syslog mechanism.)

If you are using qmail-scanner as a way to have your incoming mail scanned for viruses and/or spam content, the qmail-scanner program also generates a log file of its own, which is totally separate from the qmail-smtpd log. You will find this file in your qmail-scanner directory, with the name qmail-queue.log. This will have a line-by-line description of everything the qmail-scanner program does.

If you are using simscan as a way to have your incoming mail scanned for viruses and/or spam content, you will normally find simscan's logs in the qmail-smtpd service log.

If you are using clamav, the "clamd" program keeps its own log file. The /etc/clamd.conf file should have a "LogFile" line which tells you where the log is being written.

If you are using spamassassin, its "spamd" program also generates logs. By default it sends the logs to the syslog with the "mail" facility code, but by adding a "-s" option to spamd's command line it is possible to send the log output to a file, or to the "standard error" channel (which makes it easy to run spamd under daemontools.) You should examine the command line for spamd in order to figure out where its log file will be found.

Any POP3 or IMAP servers will also be generating their own log entries. If you or your users are having a problem with POP3 or IMAP, those would be the place to look.

Timestamps

When dealing with log files generated by multilog, you will encounter strange looking timestamps which look like an "@" character followed by a bunch of hex digits. This format is called TAI64N. The first sixteen hex digits represent a 64-bit number which counts how many seconds have passed since the beginning of 1970, while the remaining eight hex digits are a 32-bit number which counts nanoseconds since the beginning of the second.

At first glance it sounds like a more precise version of the traditional unix time() function, and for most purposes it can be used as such (subject to the resolution of your system clock.) However, instead of interpreting the count of seconds using the UTC notation, it uses TAI notation. The difference is that TAI does not have "leap seconds", which means that when doing arithmetic on TAI timestamps, these "leap seconds" do not have to be taken into account. This web page on djb's web site explains the difference in more detail. The idea is this- if you try to compare TAI and UTC timestamps, there will be a difference of several seconds (because of the leap seconds added in the UTC system but not in the TAI system.)

While these timestamps are very useful for a computer, they're not very friendly to work with from a human standpoint. The daemontools package includes the tai64nlocal program, which reads lines starting with TAI64N timestamps (such as log files) and write the same data with the timestamps converted from TAI to human-readable UTC. Here's an example:

$ cat zzz
@40000000433225833b6e1a8c tcpserver: status: 5/30
@40000000433225833b6e2644 tcpserver: pid 26162 from 194.206.24.40
@40000000433225840c85ba04 tcpserver: ok 26162 a.mx.jms1.net:209.114.200.128:25 :194.206.24.40::1521
@40000000433225840c8f0cbc rblsmtpd: 194.206.24.40 pid 26162: 451 We do not accept mail from IP addresses without reverse DNS.
@40000000433225852a9ada4c tcpserver: status: 6/30
@40000000433225852a9ae604 tcpserver: pid 26163 from 193.123.2.227
@40000000433225852aa997bc tcpserver: ok 26163 a.mx.jms1.net:209.114.200.128:25 pixelwww.pixelpower.com:193.123.2.227::4232
@40000000433225852aa9a374 rblsmtpd: 193.123.2.227 pid 26163: 553 Sent mail to honeypot qmmycemglyiuq@delete.net on 2005-01-04
@400000004332258538eae27c tcpserver: end 26163 status 0
@400000004332258538eaea4c tcpserver: status: 5/30
$ cat zzz | tai64nlocal
2005-09-21 23:31:05.997071500 tcpserver: status: 5/30
2005-09-21 23:31:05.997074500 tcpserver: pid 26162 from 194.206.24.40
2005-09-21 23:31:06.210090500 tcpserver: ok 26162 a.mx.jms1.net:209.114.200.128:25 :194.206.24.40::1521
2005-09-21 23:31:06.210701500 rblsmtpd: 194.206.24.40 pid 26162: 451 We do not accept mail from IP addresses without reverse DNS.
2005-09-21 23:31:07.714791500 tcpserver: status: 6/30
2005-09-21 23:31:07.714794500 tcpserver: pid 26163 from 193.123.2.227
2005-09-21 23:31:07.715757500 tcpserver: ok 26163 a.mx.jms1.net:209.114.200.128:25 pixelwww.pixelpower.com:193.123.2.227::4232
2005-09-21 23:31:07.715760500 rblsmtpd: 193.123.2.227 pid 26163: 553 Sent mail to honeypot qmmycemglyiuq@delete.net on 2005-01-04
2005-09-21 23:31:07.954917500 tcpserver: end 26163 status 0
2005-09-21 23:31:07.954919500 tcpserver: status: 5/30

These are actual log entries taken from my own server at the time I was writing this page. My server does not accept incoming mail from IP addresses without reverse DNS names, and I have my own blacklist which the spammers are kind enough to update for me, by sending mail to addresses in (at that time) the "@delete.net" domain, although I now use the "@dont-spam.us" domain for this purpose. The same blacklist also includes the IP addresses of machines which try to send viruses to my machine.

The multilog program

The multilog program works by reading the output from another program (usually a service running under daemontools) and writing the output to one or more log files. The program is capable of filtering the data, adding tai64n timestamps to each line, and maintaining a directory where log files are automatically created, cut off when they reach a certain size, and old files deleted when too many exist.

The official multilog web page gives a brief explanation of each feature of the program. Below is information about how most people end up using multilog- as part of an automatically created log service in conjunction with qmail or djbdns.

The "t" command (which, if present, must be the first option) tells multilog to add a tai64n timestamp to the beginning of each line. If this option is not there, the lines sent to the log file(s) will be exactly as sent by the service being logged.

The "./main" portion specifies a self-rotating directory. multilog will append each line of output to a file called "current" within this directory. When the file approaches a certain maximum size (which defaults to 100,000 bytes) multilog will rename "current" to a file whose name is a tai64n timestamp of the time the rename was done, and create a new "current" file. At this time it also counts how many old log files are in the directory, and if more than a certain number exist (default 10) it will delete the oldest files.

You can also add the "s" command to specify the maximum size of each file before it gets renamed, and the "n"> command to specify the maximum number of old log files which may exist before they are deleted. For example...

This allows each file to grow up to 1MB (1048576 bytes) and allows up to 16 files to exist before the oldest files are deleted.

Interpreting the logs

Once you've found the logs you're looking for, and understand how the timestamps work, the only remaining step is to actually read the messages being sent to the logs. The biggest problem that most people have is that they don't understand what they're reading, because they don't understand how the programs which are generating the log entries actually work.

Below I will show samples of what typical log entries look like and what they actually mean. Note that the examples will say TIMESTAMP at the beginning of each line- this is because different log files may have different types of timestamps (i.e. TAI64N, the output from tai64nlocal, or the *nix-format timestamps added by most syslog mechanisms) and I don't want to have to keep explaining over and over that "your timestamps may be different."

qmail-send logs

The qmail-send service logs the activities of the qmail-send process, along with any messages generated by the qmail-local and qmail-remote processes it will generate in order to actually process the individual deliveries. These examples will track the lifetime of a particular message on its journey through the queue.

The mtrack script reads the log file, uses the information in each line to correlate which lines correspond to the same messages, and prints those lines back out, grouped by the message they're talking about. Trust me, this script will make your life MUCH easier.

This message means that qmail-send has just become aware that the indicated message number exists in the queue. This usually happens shortly after qmail-queue is finished adding the message to the queue.

The number (12345 in this example) is the "message number". Technically, this is the inode number of the file within the /var/qmail/queue/mess directory which contains the text of the message, and the filename of the various control files within other parts of the /var/qmail/queue directory.

Because inode numbers are guaranteed to be unique within any given filesystem, only one message may have a given "message number" at any given time. However, you will often see cases where the same "message number" is used for multiple messages. This happens when qmail-send is finished with a given message and deletes it from the queue, and another incoming message happens to use the same inode.

TIMESTAMP info msg 12345: bytes 3189 from <sender@domain.xyz> qp 27763 uid 46

Remember that qmail-send has two jobs- classifying each recipient of an incoming message as "local" or "remote", and scheduling deliveries. This message means that the classifier portion of qmail-send (or if you're using the EXT_TODO patch, the qmail-todo process) has finished classifying the recipients and the message is ready to be delivered to each recipient.

The first number (12345 in this example) is the "message number", and will match the "message number" from the "new msg" message above. This number can be used to tie log entries together and follow the progress of any given message in the queue.

The email address is the envelope sender of the message- usually this is the argument of the MAIL FROM command which was given to qmail-smtpd, although if a message originated using some other interface (i.e. qmail-inject) this value can be specified using some other mechanism. Regardless, this is the address to which any bounce messages will be sent if any of the recipients are unable or unwilling to accept the message.

The qp number (27763 in this example) is the process ID of the qmail-queue process which added the message to the queue. This can be used to link the qmail-queue and qmail-send logs together if needed.

The uid number (46 in this exmple) is the numeric uid under which the qmail-queue process was running when the message was added to the queue. In the case of an SMTP server using qmail-scanner, this will be the userid as which qmail-scanner-queue.pl runs.

TIMESTAMP starting delivery 37051: msg 12345 to local recip@domain
TIMESTAMP starting delivery 37051: msg 12345 to remote recip@domain

These messages tell you that a delivery attempt has been started. Each delivery attempt is assigned a number (37051 in this example) which is simply a sequence number which starts with 1 when qmail-send starts. The message tells you whether this is a local or remote delivery, and the recipient's email address.

TIMESTAMP delivery 37051: success: did_1+0+0/
TIMESTAMP delivery 37051: deferral: Sorry,_I_wasn't_able_to_establish_an_SMTP_connection._(#4.4.1)/
TIMESTAMP delivery 37051: failure: 1.2.3.4_does_not_like_recipient./Remote_host_said:_550_recip@domain_User_unknown/Giving_up_on_1.2.3.4./

These messages tell you when a delivery attempt is finished, whether the delivery attempt was successful or not, and any additional information about the delivery. The delivery number (37051 in this example) tells you which delivery is finished, and allows you to correlate this line with the appropriate starting delivery line (above.)

Note that the additional information (the red portion at the end of each line in the examples above) is "cleaned" before being sent to the log, so that it's safe to log without allowing a malicious remote host to falsify log entries. Any spaces in the message are replaced with "_", and any newlines are replaced with "/". For example, the third message above actually started out as three lines of extra information.

This message tells you that the indicated "message number" (12345 in this example) is being deleted from the queue. If you are "following" a message through the queue, this message is the end- any messages after this one which may refer to the same "message number" are in fact referring to a new message which happens to be using the same inode number as the old one.

TIMESTAMP status: local 1/30 remote 7/50
TIMESTAMP status: qmail-todo stop processing asap
TIMESTAMP status: exiting

Any time a delivery process (i.e. qmail-local or qmail-remote starts or ends, this message is sent to the log. In this example, there are 30 concurrent local delivery slots available (i.e. the concurrencylocal control file) and 1 of those slots is being used, and there are 50 concurrent remote delivery slots available (the concurrencyremote control file) and 7 of those slots are being used.

When you tell qmail-send to shut itself down, it will add exitasap to the end of each status message (as shown here.) When all of the deliveries are done, you will see one or two last messages- you will only see the one referring to qmail-todo if you are using the EXT_TODO patch, but the exiting message is the last thing qmail-send does before stopping.

Note that there is no "starting" message when qmail-send starts- the first thing it logs when it starts up is a "status: local 0/x remote 0/x" line.

These status: messages do not have no direct correlation to any individual message, although if you want to track the timestamps they can give you an idea of how busy the server was at the time.

qmail-smtpd logs

The qmail-smtpd logs are more complicated, because many programs are sending log entries into the file. I will cover the most common programs here.

The strack script reads the log file, uses the information in each line to correlate which lines correspond to the same connections, and prints those lines back out, grouped by the connection they're talking about. Trust me, this script will make your life MUCH easier.

Useful Scripts

mtrack is a script which reads the log output from qmail-send and groups the lines together by the message they correspond with, making it easy to "follow" each message into and out of the queue. It assumes that the log lines will be presented in chronological order, and can deal with either tai64n or human-readable timestamps- it simply copies them to the output.

strack is the same kind of script, but it works on the qmail-smtpd log entries.

convert-multilog is a script which searches "/service/*/log/main" for any "@4*" files (the automatic cut-off files generated by multilog), converts their timestamps from tai64n to human-readable format, and writes them to /var/log/{service}.{date}. Once the lines from a given "@4*" file have been converted, the file is deleted.

I run this as a cron job on all of the servers I own or manage- some once an hour and some once every ten minutes, depending on how busy the server is.