http://qmail.jms1.net/scripts/rules/

qmail Rules Interface

I have written a system which allows the users and domain owners on my server to control the filtering done by qmail-smtpd for mail which is addressed to them.

This filtering takes place during the SMTP envelope exchange, before the body of the message has been sent, so this system cannot be used to do content-based filtering. It can, and does, use the envelope sender and recipient, along with the IP address of the sending machine, to make a decision about whether to accept or reject a given message.

The system runs a processing script after each RCPT command in the SMTP conversation. This means that, if a server is sending a single message to multiple recipients on the same machine, and combines the transfer into a single SMTP conversation, it is quite possible for some of the recipients to allow the message, but others to reject it. In this case, the message will only be delivered to those recipients who allowed it.
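To make the per-recipient behaviour concrete, here is a small simulation in Python (the real scripts are Perl). The `run_envelope` and `sample_policy` names are hypothetical stand-ins, not part of the system; the sketch only shows that each RCPT is decided on its own, so one conversation can yield both accepted and rejected recipients.

```python
from typing import Callable, List, Tuple

# Hypothetical decision callback: (client_ip, sender, recipient) -> True to accept.
Decision = Callable[[str, str, str], bool]


def run_envelope(client_ip: str, sender: str, recipients: List[str],
                 check_recipient: Decision) -> Tuple[List[str], List[str]]:
    """Simulate one SMTP conversation: the check runs once per RCPT command."""
    accepted: List[str] = []
    rejected: List[str] = []
    for rcpt in recipients:
        if check_recipient(client_ip, sender, rcpt):
            accepted.append(rcpt)
        else:
            rejected.append(rcpt)
    return accepted, rejected


def sample_policy(client_ip: str, sender: str, recipient: str) -> bool:
    # Illustrative policy only: one mailbox blocks one sender domain.
    if recipient == "user@domain.xyz" and sender.endswith("@spammer.com"):
        return False
    return True


accepted, rejected = run_envelope(
    "192.0.2.10", "bob@spammer.com",
    ["user@domain.xyz", "other@domain.xyz"], sample_policy)
```

In this run the message would be delivered only to other@domain.xyz, because user@domain.xyz rejected it.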


System Overview

The system consists of three parts:

The system requires that you be running qmail with Jay Soffian's RCPTCHECK patch applied. This patch is part of my combined patch, version 7.07 and later. It may also be part of some of the other combined patches out there (I haven't looked at them in a while, so I don't know for sure which ones do and don't include it.)

The system also requires a running vpopmaild service, in order to validate logins and find each user's access level. This page explains how to set up a vpopmaild service under daemontools.

The web service's CGI scripts and the email processing script are all written in Perl. I tried to make the code as clear as I could, and have included enough comments that anybody who is at least halfway familiar with Perl should be able to read and understand them. There is also one simple shell script, used for setting the permissions on the files.


Other Pages about this system


The Web Interface

When you first log into the system, you will see the main screen.

Qmail Rules Interface - Main Menu
user@domain.xyz (Logged in) [LOGOUT]

Rules affecting email sent to user@domain.xyz

Seq | Disposition | Type         | Value                    | Hits  | Commands                   | Description
----+-------------+--------------+--------------------------+-------+----------------------------+-------------------------------
[1] System-wide rules, before all others
  1 | ACCEPT      | From IP      | 192.168.5.0/24           | 1,546 |                            | Internal network
[2] Domain-specific rules, before mailbox rules
  1 | ACCEPT      | From IP      | 204.27.210.5             |   178 |                            | Critical vendor's mail server
[3] Mailbox-specific rules [ADD]
  1 | REJECT      | From Sender  | @spammer.com             |   315 | [ADD][DEL][EDIT][Down]     | Dude needs to stop already!
  2 | ACCEPT      | From Sender  | mom@aol.com              |    29 | [ADD][DEL][EDIT][Up][Down] | Mom
  3 | ACCEPT      | To Recipient | ^user-alias@             |     0 | [ADD][DEL][EDIT][Up]       | Address with extension
[4] Domain-specific rules, after mailbox rules
  1 | ACCEPT      | All Messages | (all)                    |   517 |                            | Bypass RBLs, greylisting, etc.
[5] System-wide rules, after all others
  1 | ACCEPT      | AUTH client  |                          |    86 |                            |
  2 | REJECT      | RBL match    | 4.3.2.1.zen.spamhaus.org |   130 |                            |
  3 | REJECT      | RBL match    | 4.3.2.1.dnsbl.njabl.org  |   109 |                            |
  4 | REJECT      | RBL match    | 4.3.2.1.dnsbl.sorbs.net  |     1 |                            |
  5 | REJECT      | RBL match    | 4.3.2.1.bl.spamcop.net   |     1 |                            |
  6 | DELAY       | Greylist     | 305 (5:05)               |   234 |                            |
Note: the rules in smaller italic type at the end of the list will not be executed by the engine for this recipient, because they come after a "final" rule (either an all-messages rule, or a greylisting rule.)

This is a list of the rules which affect the incoming mail of whatever userid you logged in as, with command buttons allowing you to edit some of those rules. The page will also show you any domain-wide and system-wide rules which will affect this user's incoming mail.

The "Hits" column tells you how many times each rule has matched a message within the current "focus". For example, the first rule in phase 1 might have matched 38,000 messages since being created, but only 1,546 of those were sent to user@domain.xyz. If you are logged in as a system administrator and set your focus to the system-wide rules, you will see the system-wide hit count for that rule. (The rules in phase 5 were added when the system was originally installed, before the domain owner added the all-messages rule in phase 4. Those counts show how this user's email was affected during that first week, before the all-messages rule was added.)

Phases

The rules are organized into "phases", which are essentially groups of rules. Within each phase, the rules have sequence numbers. When mail arrives, the processing script will evaluate each rule in order by phase and sequence. The first rule which matches the message will be followed, and the message will be accepted or rejected without processing any of the rules which follow it.
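The first-match behaviour can be sketched as follows. This is an illustration in Python, not the actual Perl processing script; the `Rule` class, its `matches` callback, and the choice to return `None` when nothing matches are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    phase: int
    seq: int
    accept: bool
    matches: Callable[[Dict[str, str]], bool]  # hypothetical per-rule test


def evaluate(rules: List[Rule], message: Dict[str, str]) -> Optional[Rule]:
    """Return the first matching rule in (phase, seq) order, or None."""
    for rule in sorted(rules, key=lambda r: (r.phase, r.seq)):
        if rule.matches(message):
            return rule  # rules after the first match are never consulted
    return None


# Deliberately listed out of order; evaluate() sorts by phase, then sequence.
rules = [
    Rule(5, 1, False, lambda m: True),  # catch-all in the last phase
    Rule(3, 2, True, lambda m: m["sender"] == "mom@aol.com"),
    Rule(3, 1, False, lambda m: m["sender"].endswith("@spammer.com")),
]

hit = evaluate(rules, {"sender": "mom@aol.com"})
```

Here `hit` is the phase 3, sequence 2 rule: the phase 5 catch-all would also match, but it is never reached.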

Each phase has a security level associated with it, which controls which types of users (i.e. mailbox owners, domain owners, or system owners) are able to edit the rules in that phase. The default phases, the ones I use on my own systems, are shown in the example above.

In the example, you can see that the machine owner is using several RBLs to filter their incoming mail, but any clients who send a valid AUTH command are still allowed to send mail without being subjected to the blacklist checks or greylisting. In addition, the domain administrator for this domain has created a rule which says "accept all" before the system-wide rules. This causes his domain's mail to not be filtered by RBLs, and not have any greylisting done to it, whether the client is AUTH'd or not.

Types of Rules

There are several different types of rules available. Each rule type specifies a type of test which is done against the sender's email, the recipient's email, or the IP address which is sending the message. If the message matches the rule, the message will be accepted or rejected without any further rules being processed.

The rule types are:

Working with Rules

The "Commands" column will contain a set of buttons you can click to do things with the rules. The following commands are available, although not every command will be available for every rule (for example, you can't move a rule "up" if it's the first rule in the phase.)

Button Description
[ADD] Add a new rule below this one. When this button appears on a phase header, it will add a new rule to the phase, before any other rules. You will be shown a form prompting you for the information needed to create the rule. When you complete this form and click the "Add Rule" button, the rule will be created and you will be returned to the main screen.
[DEL] Delete this rule. You will be shown a form with the rule's information. If you click the "Delete Rule" button, the rule will be deleted and you will be returned to the main screen.
[EDIT] Edit this rule. This will allow you to change the extra information for the rule (i.e. the IP address, email address, RBL zone name, or greylist timeout value), whether to accept or reject messages which match the rule, and the description of the rule.

Note that if you change anything other than the description of a rule, the rule's hit counter will be reset to zero.
[Up] Move this rule "up" in the list. This button will not be shown for a rule which is the first rule within a phase.
[Down] Move this rule "down" in the list. This button will not be shown for a rule which is the last rule within a phase.

Note that any changes you make will take effect immediately.
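The button availability described above can be expressed as a small function. This is a sketch in Python under the assumption that the rule sits in a phase you are allowed to edit; the name `commands_for` is made up for the example.

```python
from typing import List


def commands_for(position: int, count: int) -> List[str]:
    """Buttons for the rule at 0-based `position` in an editable phase of `count` rules."""
    buttons = ["ADD", "DEL", "EDIT"]
    if position > 0:           # not the first rule, so it can move up
        buttons.append("Up")
    if position < count - 1:   # not the last rule, so it can move down
        buttons.append("Down")
    return buttons
```

For a phase with three rules this gives the same button sets as the mailbox-specific phase in the example screen: no [Up] on the first rule and no [Down] on the last.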

Focus

When using the system, you will only be editing the rules for one "entity" (one mailbox, a domain, or the system at large) at any given time. I refer to this as your "focus". When you first log into the system, your "focus" will be on your own per-user rules. You will be able to see, but not edit, the per-domain and system-wide rules.

If you are logged in as a user with the appropriate access, you will be able to change your focus and edit the rules pertaining to a different security level, using the "Change Focus" section which will be below the list of rules.

Note that when your focus is set to edit domain-level rules, you will not see any per-user rules. This is because if you're working with the rules for an entire domain, there may be multiple users within the domain who have rules, and showing all of them would be confusing (at least I found it to be so while writing the program.)

The same applies for editing system-level rules: you will not see any per-domain or per-user rules.

In the .htaccess file is a line which sets the SHOW_FOCUS_LIST environment variable. If you set this to "1", you will see a list of the entities which currently have at least one rule in the database. The list entries are clickable links which will set the focus directly to that entity.

At first I wasn't sure how I wanted to handle the focus selection interface, so I tried both ways at the same time. I kinda like the list idea, but I can see that if the server has more than a few dozen entities with rules, the list itself will overwhelm the page. So I added a configuration option: the machine owner can choose whether or not to see the list by setting the SHOW_FOCUS_LIST environment variable to a non-zero value. (For what it's worth, on my own server the list is turned on. Not all of my clients are using the interface yet, and it only shows the domains and mailboxes which have at least one rule in the database.)


History

The idea for this script started after I wrote a three-tuple greylisting system called jgreylist. I ran into a few problems on my own server, where I needed the greylisting to interact with other checks (e.g. certain IPs were trusted and should not invoke greylisting, one client liked the greylisting idea but didn't want to use any blacklists, one client didn't want any kind of filtering on their mail at all, one client wanted ONLY the spamhaus list and no others, one client wanted filtering for everything except his company's domain and his wife's hotmail account, etc.) I found myself having to set up multiple IP addresses on the machine, so I could run separate qmail-smtpd services for each client, each with different rules... It became a royal pain to manage.

When I thought about it, I realized if I did the RBL checks as part of a RCPTCHECK handler instead of using rblsmtpd, I would be able to selectively enable and disable certain checks, and run those checks before or after checking the sender and/or recipient addresses. I ended up writing a Perl script with functions to do the IP, RBL, and sender email address checks, along with a greylisting function, and a main() function which combined calls to these functions in order to implement the policies that each of my clients wanted.
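Two of those checks are easy to illustrate. The sketch below is Python rather than the Perl the script was written in, and the function names are hypothetical. It shows a CIDR-aware IP test and the conventional DNSBL lookup name for an IPv4 address (octets reversed, then the zone). It does not perform the DNS query itself.

```python
import ipaddress


def ip_matches(client_ip: str, rule_value: str) -> bool:
    """True if client_ip falls inside rule_value (a single address or a CIDR block)."""
    network = ipaddress.ip_network(rule_value, strict=False)
    return ipaddress.ip_address(client_ip) in network


def rbl_query_name(client_ip: str, zone: str) -> str:
    """Build the name looked up for an IPv4 address in a DNSBL zone."""
    octets = client_ip.split(".")
    return ".".join(reversed(octets)) + "." + zone


name = rbl_query_name("1.2.3.4", "zen.spamhaus.org")
```

For the client 1.2.3.4, `name` is 4.3.2.1.zen.spamhaus.org, which has the same shape as the RBL entries shown on the main screen.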

That early script did work, and it did allow me to return to having just a single qmail-smtpd service, but it was still rather tedious to maintain, especially when two of the clients started wanting changes almost every day (e.g. whitelisting new email addresses and domains, enabling and disabling specific RBLs, bypassing all of the filtering for the president's email, etc.)

The next idea was to write a web interface, so that my clients could edit the rules on their own without my having to spend several hours every week satisfying their every whim, especially when I knew they were probably going to change their minds a few days later. This meant finding a way to store the rules in a database, in such a way that each domain and/or mailbox had their own set of rules, some of which were shared (i.e. the rules for a mailbox should include the rules for the domain, and I wanted to be able to throw a few system-wide rules in there as well.)

It also meant re-writing the processing script to pull the correct rules out of the database, depending on the recipient address it was processing. My first instinct was to use regular expressions, however PostgreSQL's support for regular expression searching was rather convoluted at the time, and even at this early stage I realized that this would be something awesome to release to the world as open source software, and I wanted it to be able to work with other database engines (because not everybody uses PostgreSQL as their database of choice.)

A little bit of thinking about it made me realize that there were only three types of recipients for whom rules would need to be created: a single mailbox, a domain, and the entire machine. My first thought was to have separate tables for the mailbox, domain, and system rules, but when I started writing the schema, the fact that the tables' structures were identical bothered me - part of a good database design is combining similar data into the same tables, so I added a "rule type" field, which eventually became the "phase" field.

At first the script was doing this really complicated query, which was selecting all system rules, all domain rules where the "owner" was whatever was in the RECIPIENT variable, and then all mailbox rules where the "owner" was an exact match of the RECIPIENT variable. Then, while trying to figure out how to simplify this query, it also occurred to me that if the "owner" field had a regular expression in it, I could have PostgreSQL do a single search and find all three rule types, for any RECIPIENT value.

However, not every database engine can do regular expression searches in a WHERE clause (or if they can, they don't do it using the same syntax that PostgreSQL uses, and I didn't want to have to write different code for different database engines unless I absolutely had to.)

So then I figured out that if I stored the system-wide rules with owner='%' and the domain-wide rules with owner='%@domain', I could use SQL's "LIKE" operator "in reverse" and get all three rule types in a single query. The LIKE operator is a standard part of the SQL language, and it's fairly safe to assume that any other commonly used SQL engine is going to support it.
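As a small illustration of that storage convention, the following Python sketch maps each kind of rule owner to the pattern that would be stored. The function name and the "system", "domain" and "mailbox" scope labels are invented for the example; only the three pattern shapes come from the description above.

```python
def stored_pattern(scope: str, name: str = "") -> str:
    """LIKE pattern stored with a rule, depending on who the rule belongs to."""
    if scope == "system":
        return "%"            # matches every recipient
    if scope == "domain":
        return "%@" + name    # matches any mailbox in the domain
    if scope == "mailbox":
        return name           # matches exactly one address
    raise ValueError(f"unknown scope: {scope!r}")
```

For example, `stored_pattern("domain", "domain.xyz")` gives `%@domain.xyz`.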

Back then the field was called "owner"; I have since renamed it to "recipient".

What do I mean by using the LIKE operator "in reverse"? Most people are used to writing queries like this:

SELECT * FROM tablename WHERE fieldname LIKE 'abc%'

However, I found that there was nothing in the SQL specifications which required that the argument (the pattern) had to be a literal string. A quick test showed that it works "the other way around" just as well, and that if I stored patterns in a field, I could search based on those patterns:

SELECT * FROM tablename WHERE 'abcdefg' LIKE fieldname

The result is that I can store a "LIKE" pattern as the recipient for each rule, and the processing script and web interface can both use the "LIKE" operator "in reverse" to locate the correct rules for that recipient. For example:

$ psql rules g4web
Password for user g4web:
Welcome to psql 8.1.23, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

rules=> SELECT phase, seq, recipient, type, sender, ip, rbl, target, delay, accept
rules-> FROM rules WHERE 'user@domain.xyz' LIKE recipient ORDER BY phase, seq;
 phase | seq |    recipient    | type |    sender    |         ip         |       rbl        |    target    | delay | accept
-------+-----+-----------------+------+--------------+--------------------+------------------+--------------+-------+--------
     1 |   1 | %               | I    |              | 192.168.5.0/24     |                  |              |       | t
     2 |   1 | %@domain.xyz    | I    |              | 204.27.210.5       |                  |              |       | t
     3 |   1 | user@domain.xyz | E    | @spammer.com |                    |                  |              |       | f
     3 |   2 | user@domain.xyz | E    | mom@aol.com  |                    |                  |              |       | t
     3 |   3 | user@domain.xyz | T    |              |                    |                  | ^user-alias@ |       | t
     4 |   1 | %@domain.xyz    | A    |              |                    |                  |              |       | t
     5 |   1 | %               | R    |              |                    | zen.spamhaus.org |              |       | f
     5 |   2 | %               | R    |              |                    | dnsbl.njabl.org  |              |       | f
     5 |   3 | %               | R    |              |                    | dnsbl.sorbs.net  |              |       | f
     5 |   4 | %               | R    |              |                    | bl.spamcop.net   |              |       | f
     5 |   5 | %               | G    |              |                    |                  |              |   305 | t
(11 rows)

rules=> SELECT phase, seq, recipient, type, sender, ip, rbl, target, delay, accept
rules-> FROM rules WHERE '@domain.xyz' LIKE recipient ORDER BY phase, seq;
 phase | seq |  recipient   | type | sender |         ip         |       rbl        | target | delay | accept
-------+-----+--------------+------+--------+--------------------+------------------+--------+-------+--------
     1 |   1 | %            | I    |        | 192.168.5.0/24     |                  |        |       | t
     2 |   1 | %@domain.xyz | I    |        | 204.27.210.5       |                  |        |       | t
     4 |   1 | %@domain.xyz | A    |        |                    |                  |        |       | t
     5 |   1 | %            | R    |        |                    | zen.spamhaus.org |        |       | f
     5 |   2 | %            | R    |        |                    | dnsbl.njabl.org  |        |       | f
     5 |   3 | %            | R    |        |                    | dnsbl.sorbs.net  |        |       | f
     5 |   4 | %            | R    |        |                    | bl.spamcop.net   |        |       | f
     5 |   5 | %            | G    |        |                    |                  |        |   305 | t
(8 rows)

rules=> SELECT phase, seq, recipient, type, sender, ip, rbl, target, delay, accept
rules-> FROM rules WHERE '@' LIKE recipient ORDER BY phase, seq;
 phase | seq | recipient  | type | sender |         ip         |       rbl        | target | delay | accept
-------+-----+------------+------+--------+--------------------+------------------+--------+-------+--------
     1 |   1 | %          | I    |        | 192.168.5.0/24     |                  |        |       | t
     5 |   1 | %          | R    |        |                    | zen.spamhaus.org |        |       | f
     5 |   2 | %          | R    |        |                    | dnsbl.njabl.org  |        |       | f
     5 |   3 | %          | R    |        |                    | dnsbl.sorbs.net  |        |       | f
     5 |   4 | %          | R    |        |                    | bl.spamcop.net   |        |       | f
     5 |   5 | %          | G    |        |                    |                  |        |   305 | t
(6 rows)

rules=> \q

As you can see, the first query picked up all of the rules for the mailbox, the domain within which the mailbox exists, and the system-wide rules, all in a single query. (The other two queries are shown as examples of how to look at just the "domain and system" rules, or just the "system" rules. This is how the index.cgi script builds the list when the focus is set to a domain or to the system.)
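The same "reverse LIKE" lookup can be tried without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module with a cut-down, made-up version of the rules table (only four columns, and an extra mailbox added to show that other users' rules are not selected). It is an illustration of the technique, not the schema or code used by the real scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rules (phase INTEGER, seq INTEGER, recipient TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO rules VALUES (?, ?, ?, ?)",
    [
        (1, 1, "%", "I"),                 # system-wide
        (2, 1, "%@domain.xyz", "I"),      # domain-wide
        (3, 1, "user@domain.xyz", "E"),   # one mailbox
        (3, 1, "other@domain.xyz", "E"),  # a different mailbox
        (4, 1, "%@domain.xyz", "A"),
        (5, 1, "%", "R"),
    ],
)


def rules_for(lookup: str):
    """Rows whose stored pattern matches `lookup` (the value is on the LEFT of LIKE)."""
    cur = conn.execute(
        "SELECT phase, seq, recipient FROM rules "
        "WHERE ? LIKE recipient ORDER BY phase, seq",
        (lookup,),
    )
    return cur.fetchall()


mailbox_rules = rules_for("user@domain.xyz")  # mailbox + domain + system
domain_rules = rules_for("@domain.xyz")       # domain + system
system_rules = rules_for("@")                 # system only
```

One caveat about LIKE in general: "_" is also a wildcard (it matches any single character), so a stored address containing an underscore will match slightly more than the literal address unless the pattern is escaped.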

Once I figured out how to store the data in such a way that the database engine would select the rules for each message, I was able to re-use some of the functions from the original script, and write a new "main" function which queried the rules for the recipient, and then processed the rules from the database results.

Once the processing script was written and working, it ended up being very easy to convert the rules which had been encoded into that original Perl script into the equivalent database records. Once I did this, I switched over to the new script. The clients never knew the difference, and when they called up wanting rule changes, I was able to do them in a few minutes rather than an hour and a half.

After a few weeks of letting it run and watching the logs to make sure it didn't break anything, I started working on the beginning of the web interface, but then I suddenly found a full-time job which took up all of my time (and ended up being so stressful that it contributed to a heart attack. True story.)

The web interface idea sat "on hold" for a few years, and then last month (2012-06) the year-long contracting job I had been on ended, and I had enough free time (and mental clarity) to finish the web interface. Plus, while working on the web interface and documentation, I added some things which I hadn't originally thought of, such as:

The system described here is the one I'm actually using on my own server. The tarballs I offer for download on this site are produced from the actual scripts in place on my live web server.


2012-08-20 Niamh Holding pointed out an interesting issue: if a recipient address has a "-" character in it, and qmail is also using "-" as the separator character for extension addresses, it can cause problems. I need to look into this.

In addition, I just noticed that when I wrote rcptcheck.pl, I hard-coded the "-" character into the script. That needs to be fixed, but it won't happen today (I'm in the middle of packing up for a move.)