This page documents the structure of the database used by the Qmail Rules interface.
I went through several ideas for how to store the rules in a database, and finally settled on the schema (the set of database tables) described below. After I got it working on my own server (using PostgreSQL), I ended up having to change a few field types in order to make the schema work across all three of the database engines I'm testing with (i.e. PostgreSQL, MySQL, and SQLite.)
The phases table contains information defining the "phases", which are essentially groups of rules which are processed together, and which are administered by the same type of user (i.e. mailbox owner, domain administrator, or system administrator.)
Within each phase, the rules have specific sequence numbers (as described in the "rules" table, below.) When mail arrives, the processing script will evaluate each rule in order by phase and sequence. The first rule which matches the message will be followed, and the message will be accepted or rejected without processing any of the rules which follow it.
The phases table's structure looks like this:
|phase||SMALLINT||Phase number. Used to specify the order in which the phases are processed by the rcptcheck.pl script.|
|level||SMALLINT||Access level needed in order to edit rules within this phase.
Values are: 1=user, 2=domain administrator, 3=system administrator
|description||VARCHAR(80)||Human-readable description for this phase. Used as the phase name in the web interface.|
The default phases, the ones I use on my own systems, are contained in the lib/insert-standard-phases.sql file. They are:
|1||3 (System Admin)||System-wide rules, before all others|
|2||2 (Domain Admin)||Domain-specific rules, before mailbox rules|
|3||1 (User)||Mailbox-specific rules|
|4||2 (Domain Admin)||Domain-specific rules, after mailbox rules|
|5||3 (System Admin)||System-wide rules, after all others|
This set of phases allows a machine owner to set up system-wide rules which either CAN or CANNOT be overridden, by putting the rules into phase 1 (i.e. before anybody else's rules execute) or phase 5 (i.e. after everybody else's rules have executed, if none of those domains' or users' rules matched the message.) They also allow a domain owner to set up domain-wide rules which eithe CAN or CANNOT be overridden by their users, by putting the rules into phase 2 (before the individual mailbox owners' rules execute) or phase 4 (after the mailbox owners' rules execute, if none of their rules matched the message.)
I have found that these phases provide enough flexibility for my needs and the needs of my clients and users. However, if your needs are different, you can change the list of phases. The web interface does not offer any way to edit the phases, however you can make changes by either editing the insert-standard-phases.sql file before using it, or by runninging UPDATE, INSERT, and DELETE queries against the phases table in your database.
If you do edit the list of phases, you should make sure that the phase numbers start with 1, and that they are sequential (i.e. 1, 2, 3, etc.) without skipping any numbers. Changing the phase names after you have started using the system won't cause any problems, however changing the phase numbers or the access levels might cause confusion with any rules which already exist in the database. For this reason, I recommend that if you do plan to change the phase numbers or access levels, you should do it when you first install the system, before any rules are created... or you should be prepared to manually adjust the "phase" values for any rules which may already exist in the database.
The rules table contains the actual mail processing rules. Each rule contains information about whose email is affected by the rule, the order in which the rule should be processed, what kind of test is being done by the rule, and whether to accept or reject the message if the test matches the message.
The rules table's structure looks like this:
|id||INTEGER||Internal ID number. The web interface uses this value when updating the table (i.e. editing or deleting rules) but does not show it to the user at all. You may also see this value when looking at the logs.|
|phase||SMALLINT||The phase within which the rule appears. For database engines which enforce cross-table references, this field has a reference pointing to the "phase" field in the "phases" table.|
|seq||INTEGER||This rule's sequence within the phase.|
|recipient||VARCHAR(80)||The recipient whose email is affected by this rule. (See the text below for a more detailed description of what this field contains and how it is used.)|
|type||CHAR(1)||A code telling what kind of rule this is. (See the text below for a detailed list of the rule types.)|
|sender||VARCHAR(250)||For "Match sender email" rules, this field contains a regular expression against which the value of the SENDER envirionment variable (i.e. the argument of the SMTP "MAIL FROM" command) will be matched.|
|ip||CHAR(18)||For "Match IP address" rules, this field contains either a dotted quad IPv4 address (i.e. "22.214.171.124") or a CIDR block (i.e. "126.96.36.199/24".)|
|rbl||VARCHAR(250)||For "Match RBL" rules, this field contains the "zone" (the DNS suffix) to be added to the reversed IP in order to do the check.|
|delay||INTEGER||For "Greylist" rules, this field contains the number of seconds for which each new combination of (IP/Sender/Recipient) must wait before messages will be accepted.|
|target||VARCHAR(250)||For "Match recipient email" rules, this field contains a regular expression against which the value of the RECIPIENT environment variable (i.e. the argument of the SMTP "RCPT TO" command) will be matched.|
|accept||BOOLEAN||If the message matches the test specified in this rule, this field tells whether the message should be accepted (if TRUE) or rejected (if FALSE.) Note that in SQLite, this field is actually an INTEGER, and the values 1 (for TRUE) and 0 (for FALSE) are used instead.|
|description||VARCHAR(80)||Optional human-readable description for this rule. The web interface shows it on the list, but otherwise this field is not used at all.|
|created||TIMESTAMP||When this rule was added to the database. When the web interface changes a rule's target or disposition (i.e. accept or reject), this field is reset to the current time.|
This table's primary index is a combination of the phase, recipient, and seq fields.
The recipient value in each rule is an expression which can be matched using the SQL "LIKE" operator. The values look like this:
|Recipient Type||Example Value|
The type values and the extra data they require are:
|A||All messages||This rule matches every message. It can be used
to allow users to bypass rules which exist later in the processing
Rules which come after this rule will never be executed.
|D||Debug||This rule sets a flag which causes every rule affecting this recipient's mail to be individually logged. The order of this rule does not matter, as the script checks for the existence of a debug rule separately from querying the message-processing rules.|
|E||Match sender's email||sender||The SENDER value (the email address of the
sender) is compared with the regular expression in this field.
Some examples of regular expressions:
|G||Greylisting||delay||The first time a particular sender (identified by the
SENDER and TCPREMOTEIP values) tries to send mail to
this RECIPIENT, the server will temporarily refuse to accept
the message, but will accept the message (and future messages) after a
specific length of time, specified in this field.
Rules which come after this rule will never be executed.
|I||Match IP address||ip||The IP address of the machine which is sending the message to your server is compared to the IP address or CIDR block (i.e. "188.8.131.52/24") contained in the rule.|
|R||Match RBL||rbl||The IP address of the machine which is sending the message to your server is checked for membership on a DNS-based RBL (Realtime BlackList).|
|T||Match recipient's email||target||The RECIPIENT value (the email address of the recipient) is compared with the regular expression in this field. (Examples are above.)|
|U||AUTH client||The rule matches if the SMTP_AUTH_USER environment variable is non-empty. When using this system on a machine whose qmail has been patched with my combined patch version 7 or higher, qmail-smtpd will set this variable whenever the SMTP client sends a successful AUTH command (i.e. if the SMTP client is an authorized user of the machine.)|
The "greylist" table contains data used to implement greylisting. This data consists of the sender, recipient, and sending IP address, as well as timestamps showing the first and last times the entry was used, and counts of how many times messages were attempted before and after the greylisting period exipred.
The rcptcheck.pl processing script uses and updates this table as needed for greylisting. The web interface doesn't use it, other than allowing System Administrators to clean out greylist records which haven't been used in a while.
The greylist table's structure looks like this:
|id||INTEGER||Internal ID number. This is used when updating a record.|
|sender||VARCHAR(250)||The sender's email address.|
|recipient||VARCHAR(250)||The recipient's email address.|
|ip||CHAR(18)||The IP address which is sending the message to this server.|
|first||INTEGER||UNIX timestamp showing when the record was added, i.e. the first time this combination of (sender/recipient/ip) was seen.|
|last||INTEGER||UNIX timestamp showing the last time the record was updated, i.e. the most recent time this combination of (sender/recipient/ip) was seen.|
|count_pre||INTEGER||How many times this combination of (sender/recipient/ip) was told to wait before messages would be accepted.|
|count_msg||INTEGER||How many times this combination of (sender/recipient/ip) was allowed to send a message after the greylist timer expired.|
This table's primary key is a combination of the sender, recipient, and ip fields.
Web users with System Administrator access can use the cleanup.cgi script (they will see a link at the bottom of the list, below the focus form/list) to clean up any greylist records which haven't been used in a certain number of days. (This function keys off of the last field.)
The "log" table contains a log of every message which was processed by the rcptcheck.pl script, which rule (if any) ended up matching the message, and the disposition (i.e. whether the script told qmail-smtpd to accept, reject, or delay the message.)
The rcptcheck.pl processing script adds records to this table. The web interface reads from it, to count how many times each rule has been matched by incoming messages.
The log table's structure looks like this:
|id||INTEGER||Internal ID number. This is a sequence number which is automatically added to each record, but isn't really used for anything (yet?)|
|ruleid||INTEGER||The "id" value of the rule which matched the message. If the message did not match any rule (i.e. there were no matching rules in the database, or the message was a bounce) this value will be 0.|
|dt||TIMESTAMP||The date and time that the message was processed.|
|sender||VARCHAR(250)||The message sender.|
|recipient||VARCHAR(250)||The message recipient.|
|ip||CHAR(15)||The IP address which sent the message to this server.|
|disp||VARCHAR(20)||A quick description of whether the message was accepted or rejected, and what type of rule accepted or rejected the message.|
The "sessions" table contains a list of the web sessions. Each session is identified by a cookie (a collection of random-looking characters - I'm actually using the MD5 hash of a collection of data which should be unique to each request), and contains the browser's IP, the userid and access level, the first time the session was accessed, when the session expires, a count of how many web requests happened in the session, and a flag disabling the session (which is set if the user manually logs out.) The web interface uses this table, the mail processing script does not have any connection to it.
If the browser doesn't access the system for a certain length of time (30 minutes by default), the session will "time out" and the user will be forced to re-enter their userid and password. The time limit can be changed by the machine owner (by changing the SESSION_TIMEOUT environment variable in the .htaccess file.)
The sessions table's structure looks like this:
|id||CHAR(22)||Session ID. This is generated for each new session, and is the Base64-encoded MD5 checksum of a collection of session-specific data (i.e. a timestamp, the remote IP, the PID of the login.cgi process, and a sequence number to prevent duplicates.)|
|ip||CHAR(15)||The remote IP address (i.e. the browser's IP.) Requests which arrive with this session ID, but from the wrong IP address, will cause the session to be immediately invalidated.|
|userid||VARCHAR(80)||The user ID for this session.|
|level||SMALLINT||The access level granted to this session. 1=User, 2=Domain administrator, 3=System administrator.|
|first||INTEGER||UNIX time when the session started (i.e. when the user successfully logged into the system.)|
|expires||INTEGER||UNIX time when the session expires. Any attempt to use the session key after this time will be ignored. This field is updated to (NOW + SESSION_TIMEOUT) for every request.|
|hits||INTEGER||Count of how many hits (executions of a "*.cgi" script) occurred in this session. This field is incremented for every request.|
|ended||BOOLEAN||This flag defaults to FALSE, and is set to TRUE if the user manually logs out, or if the session is invalidated because a request came from the wrong IP address. (Under SQLite, this field is an INTEGER, and the values 0 and 1 are used instead of FALSE and TRUE.)|
The web interface adds a record to this table whenever a user logs in, and updates the session's record for every request which arrives within that session. Users with System administrator access can use the cleanup.cgi script (a link will be at the very bottom of the list, below the focus form/list) to remove any old sessions from the database.