One of the most common spammer tricks is to use an "email blaster" program, which sends as many messages as possible in as short a time as possible. This is commonly seen with "zombies", machines which have been infected with a virus that allows a spammer to remotely control them and use them to send out spam.
These "blaster" programs will usually try to deliver their message once, and if it doesn't work, they give up. This is one difference between a spammer and a legitimate mail server- a legitimate server will try to deliver the message again after a short period of time, while a "blaster" program usually doesn't try again.
The idea of "greylisting" is that the server will give the client a "soft error" message for a short period of time, and after that time it will accept mail from that client normally. The trick is to find the right time limit, long enough to discourage the more aggressive spammers, but short enough to not inconvenience your users.
When a server receives an incoming connection from a client, it checks the client's IP address against a list. Depending on what it finds...
If the IP address has never been seen before, a record is created for the IP address and the client is given the "soft error" message, which tells it that the message will not be accepted right now, but the client should try again later.
If the IP address was first seen very recently (usually within the past three to five minutes), the client will be given the same "soft error" message and no mail will be accepted.
Otherwise, the message will be accepted normally.
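As a rough illustration only (this is not the actual jgreylist code), the three cases above could be sketched in C like this. The names grey_decide and hold are made up for the example, and a first_seen value of 0 is assumed to mean "no record exists yet".

```c
#include <time.h>

/* Possible outcomes for an incoming connection. */
enum grey_result { GREY_FIRST_TIME, GREY_TOO_SOON, GREY_OK };

/*
 * first_seen: when the address was first recorded, or 0 if there is
 *             no record yet (an assumption of this sketch)
 * now:        the current time
 * hold:       how many seconds a new client has to wait (hypothetical
 *             parameter name)
 */
enum grey_result grey_decide(time_t first_seen, time_t now, time_t hold)
{
    if (first_seen == 0)
        return GREY_FIRST_TIME;   /* never seen: create record, soft error */
    if (now - first_seen < hold)
        return GREY_TOO_SOON;     /* seen too recently: soft error again */
    return GREY_OK;               /* waited long enough: accept normally */
}
```

A caller would look up the first-seen time for the client's address, call grey_decide() with the current time, and only hand the connection to the real SMTP server when the result is GREY_OK.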
The other consideration is that the database of when each IP address was first seen can eventually grow large enough to fill up the storage space available on the system. In order to prevent this from happening, a second timer is kept- one which is updated every time the client connects. Every so often the server will "clean" the database by deleting all records of any IP which has not been seen in a long time (usually 30 days or more).
When I decided to try greylisting on my own server, I looked at several other greylisting implementations, but I wasn't happy with any of them, primarily because of how they stored the information about which IP address had been seen and when. I saw everything from flat text files to a SQL database.
The best one I found was called qgreylist. It stores the information as empty files, with the various times (access time, modification time, etc.) used to hold the first and most recent time an IP was seen.
It also has an option to store greylist entries for class-C networks, rather than for individual IP addresses. This keeps the number of files down to a reasonable limit, although it means that once an IP address has connected and waited out the 3-5 minute limit, every IP address in that class-C block will be allowed to connect as well. This is actually not as much of a problem as it sounds like.
I decided to write my own script because there are two problems with qgreylist:
It stores all of the IP tracking files in a single directory, which can really slow down the process of finding any one individual file if there are several thousand files there. When you're handling a large volume of mail, anything which slows the process down without a reason is a bad thing... and I'm used to designing servers for ISPs, so it's second nature for me to try and make things run faster.
It does the "cleanup" function as part of a random SMTP connection every so often. Given my own experience with over 60,000 greylist entries, I've seen the cleanup job take anywhere from five seconds to a minute and a half. There is no reason to make an SMTP connection wait that long, especially if it's one you're not going to block.
My greylist implementation also uses the timers of empty files to store the timestamps for when an IP address was first and most recently seen, and it stores them by class-C block (although it can be made to store individual files for each IP address, if you would rather do it that way, and if you have a filesystem with enough inodes to support it.) However, it stores the blocks in directories based on the octets of the IP address.
For example, the information for IP address 127.10.20.30 is stored in the file 127/010/020 on the disk. This is a directory called 127, which contains a directory called 010, which contains a file called 020, whose "mtime" field holds the first time the IP address (or block) was seen, and whose "atime" field holds the most recent time the IP address (or block) was seen.
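To make the layout concrete, here is a minimal sketch in C of how an address might be turned into that relative path. It is not taken from jgreylist itself; the function name grey_path is invented for the example, and zero-padding every octet to three digits is an assumption based on the 127/010/020 example above.

```c
#include <stdio.h>

/*
 * Build the relative path of the class-C tracking file for a dotted-quad
 * IPv4 address, e.g. "127.10.20.30" -> "127/010/020".
 * Returns 0 on success, -1 if the address is malformed or buf is too small.
 */
int grey_path(const char *ip, char *buf, size_t buflen)
{
    unsigned a, b, c, d;
    char extra;

    /* exactly four numbers, with nothing left over after the last one */
    if (sscanf(ip, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;

    /* the fourth octet is ignored: the file tracks the whole block */
    int n = snprintf(buf, buflen, "%03u/%03u/%03u", a, b, c);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The resulting string would be appended to the greylist directory, and the file's mtime and atime would then be read with stat() to get the two timestamps.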
2006-08-28 Thanks to "Jagular", I found a bug in the first version of the program- if a client connected, was greylisted or denied, and didn't issue a proper QUIT command, the log showed "GREY first time", "GREY too soon", and "OK known", one after the other, even though the client was not OK yet. This has been corrected.
I've also added the ability to reject connections from IP addresses which have no reverse DNS. This check is enabled by default; if you want to disable it, edit the script and change the value of the $block_norev variable.
Note that if you wish to use this option, you must NOT run tcpserver with the -H option. Normally, tcpserver does a reverse-DNS check and stores the name in an environment variable called TCPREMOTEHOST. The code in jgreylist uses this variable to check the name. If you run tcpserver with the -H option, and you turn on this option in jgreylist, your jgreylist will think that EVERY IP IN THE WORLD has no reverse DNS, and will therefore reject every incoming connection. Don't do this.
The download links are at the bottom of the page.
The phrase "greylist user" means the user that your SMTP service runs as. This user will need to own the directory containing the greylist files, so that it will have permission to create files and directories as needed.
You will need to know the userid, as well as their primary login group (the group assigned to them in their /etc/passwd entry).
If you are using my "run" script for SMTP services, you will find the userid in the QUSER variable.
# userid that qmail-smtpd should run as
QUSER=qmaild
Once you have the user, you can find the group using the groups command:
Assuming the default QUSER=qmaild
# groups qmaild
qmaild : nofiles
If you are not using my "run" script, examine the tcpserver command line in your SMTP service "run" script. You will see -u and -g parameters; this userid and group are what you need.
The jgreylist script should be installed in a directory listed in the PATH of your SMTP service's "run" script. Normally the /var/qmail/bin directory is part of the PATH, so this is probably the easiest thing. It should be owned by root, have the same group ID as the greylist user's group ID, and have permissions 0750.
The jgreylist-clean script can be installed anywhere on the system. On my system it's installed in /usr/local/sbin. It should be owned by root and have permissions 0755.
The next decision is where you want the greylist directory to be. Wherever you create the directory, it needs to be somewhere that the greylist user has access. The default location, where I have it on my server, is /var/qmail/jgreylist. If you choose some other location, you will need to edit the my $greydir= line at the top of both scripts to point to your chosen location.
You need to create the directory you have chosen. This example assumes that your greylist user is "qmaild", with group "nofiles", and we will use the default location.
Assuming the default user, group, and directory
# cd /var/qmail
# mkdir -m 0700 jgreylist
# chown qmaild:nofiles jgreylist
The next step is to insert the jgreylist script into the command line which runs qmail-smtpd. It needs to be after tcpserver and before qmail-smtpd.
If you are using my "run" script for SMTP services, the script will already know how to put the greylisting program into the right spot in the command line. Simply put the full pathname of the jgreylist script (or just the filename, if the script is installed in a directory listed in the PATH) into the GREYLIST variable, make sure the line is not commented (i.e. does not start with "#") and restart the service using svc -t.
Original script
# greylisting program
#GREYLIST="jgreylist"
Modified script
# greylisting program
GREYLIST="/var/qmail/bin/jgreylist"
If you are not using my "run" script, you will need to manually add the script to the command line. This is a quick example, using the "run" script that comes with qmailrocks v2.2.0:
Original script
...
exec /usr/local/bin/softlimit -m 30000000 \
/usr/local/bin/tcpserver -v -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
-u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
/var/qmail/bin/qmail-smtpd mail.example.com \
/home/vpopmail/bin/vchkpw /usr/bin/true 2>&1
Modified script
...
exec /usr/local/bin/softlimit -m 30000000 \
/usr/local/bin/tcpserver -v -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
-u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
/var/qmail/bin/jgreylist \
/var/qmail/bin/qmail-smtpd mail.example.com \
/home/vpopmail/bin/vchkpw /usr/bin/true 2>&1
Note that if you are using jgreylist in conjunction with rblsmtpd, they can be entered in either order on the command line. However, it makes more sense to me to run rblsmtpd first. If a client's IP address is listed on a blacklist, you know you're going to hang up on them anyway- why waste the extra 30 seconds on a spammer?
Once you have added the greylisting script to the command line, you should restart your SMTP service so that the changes take effect.
# svc -t /service/qmail-smtpd
You can set certain IP addresses as "whitelisted", meaning that they will never be delayed on their way to qmail-smtpd, or "blacklisted", meaning that they will always get a fake SMTP conversation and never be allowed to send mail, using the JGREYLIST environment variable.
It works in the same manner that the RBLSMTPD variable works for the rblsmtpd program:
If the variable exists and is empty, the connection is passed to the next program in line without any further checking.
If the variable exists and is NOT empty, the connection will always be diverted into the fake SMTP session, with the contents of the variable used as the error message sent to the client whenever they try to send a message.
Otherwise, the greylisting mechanism works normally.
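The three cases above come down to a very small test. The following C sketch shows one way to express it; it is only an illustration, and the enum and function names are invented for the example rather than taken from the jgreylist source.

```c
#include <stdlib.h>

/* What the JGREYLIST variable asks for. */
enum grey_override { GREY_PASS, GREY_TRAP, GREY_NORMAL };

/*
 * Classify the value of the JGREYLIST variable.
 * value is what getenv() returned, so NULL means "not set at all".
 */
enum grey_override grey_classify(const char *value)
{
    if (value == NULL)
        return GREY_NORMAL;   /* not set: normal greylisting */
    if (*value == '\0')
        return GREY_PASS;     /* set but empty: pass straight through */
    return GREY_TRAP;         /* set and non-empty: fake SMTP session,
                                 with the value as the error message */
}

/* Convenience wrapper which reads the real environment. */
enum grey_override grey_from_env(void)
{
    return grey_classify(getenv("JGREYLIST"));
}
```

The important detail is that "set but empty" and "not set" are different cases, which is why the check has to compare against NULL before looking at the contents.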
Here's an example of what a tcpserver control file might look like with this variable in it:
x.x.x.x:allow
Normal entry - client will be allowed to connect, but will be
greylisted the first time they connect.
x.x.x.x:allow,JGREYLIST=""
Client will be allowed to connect, and jgreylist will pass the
connection along without intercepting it.
x.x.x.x:allow,JGREYLIST="We don't want your spam."
Client will be allowed to connect, but jgreylist will always "trap"
the client in the phony SMTP session.
The early versions of this script supported whitelisting and blacklisting using cdb files. This is no longer supported- if you are using an old version of the script, you should upgrade to the current version.
The jgreylist-clean script should be run as the greylist user, or as root. Either way, it can be run at any time. My own server runs it once a day, in the early morning. The only advantage of running it more often is that it will drop old entries a little bit sooner.
The only restriction is that if you run two copies of it at once, they will confuse each other- it won't cause damage, but it will cause one or both to stop before finishing its job.
At the top of the script is a configuration section. It contains the following variables:
my $greydir = "/var/qmail/jgreylist" ;
This variable contains the location of the greylist directory. It must
be identical to the $greydir variable in your jgreylist
script.
my $max_age = 30 * 24 * 60 * 60 ;
This variable controls how long an IP can go without connecting before
it is deleted, in seconds. The default, as you can see, is 30 days.
my $one_age = 24 * 60 * 60;
If a given IP address only connects one time (i.e. it's been told to try
again later but never does), it can be deleted sooner if you like. The
default for this limit is 24 hours. If you don't want to do this check (i.e.
if an IP which only connects once should be treated just like any other IP)
then set this to 0. (Thanks to Ron Miller for the suggestion.)
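For readers who want to see how the two limits interact, here is a hedged sketch in C of the decision the cleanup has to make for each file. It is not the code from jgreylist-clean; the function name is made up, and it assumes that "only connected once" shows up as the file's atime being equal to its mtime.

```c
#include <time.h>

/*
 * mtime:   first time the address (or block) was seen
 * atime:   most recent time it was seen
 * max_age: seconds an entry may go unseen before it is deleted
 * one_age: shorter limit for entries seen only once, 0 to disable
 * Returns 1 if the tracking file should be deleted, 0 otherwise.
 */
int grey_expired(time_t mtime, time_t atime, time_t now,
                 time_t max_age, time_t one_age)
{
    if (now - atime > max_age)
        return 1;             /* not seen for too long */

    /* assumption: atime == mtime means the client never came back */
    if (one_age > 0 && atime == mtime && now - atime > one_age)
        return 1;

    return 0;
}
```

With the defaults shown above, max_age would be 30 * 24 * 60 * 60 and one_age would be 24 * 60 * 60.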
Everybody seems to be asking variations of this question over and over... if my users are traveling and may be sending mail from an IP address which the server hasn't seen before, how can I allow my users to use the AUTH command to bypass the greylisting delay?
The answer is very simple... YOU CAN'T.
If an SMTP service is using greylisting, the decision of whether to allow the connection immediately or hold it until later must be made before qmail-smtpd is ever executed... and since qmail-smtpd is the program which would handle any AUTH command, there is no way for the jgreylist program to know whether or not a given client will send an AUTH command.
The proper solution to the problem is to create a second SMTP service for your users- one which not only allows AUTH, but requires AUTH before it will accept any incoming mail. If you're using qmail with my combined patch, version 6 or later, you can use the REQUIRE_AUTH environment variable to make this happen. Of course, if you're accepting AUTH, you should also NOT accept the AUTH command over a non-encrypted connection.
Other pages on this site explain how to set up an SMTP service and what the options are for specific combinations of encryption, AUTH, and accepting mail.
The next question is whether or not jgreylist could somehow handle AUTH commands, and if it sees one, somehow pass the connection mid-stream to qmail-smtpd. The answer to this question is also no.
The only possible way to do this would be to write the greylisting logic into qmail-smtpd itself... and even if I did think it was a good idea, it's not a project that I have time to play with.
Unless you know of some magical way to make this happen, please stop emailing me and asking about it. (And no, that doesn't mean to email me with guesses. If you have an idea and want to see if it works or not, build a test server and try it yourself. Please... only email me about this if you HAVE A WORKING SOLUTION.)
2007-07-29 One of the problems with this script is that it's written in Perl, and it has to be "compiled" every time it runs. For most servers this is not an issue, but if your server is handling tens of thousands of incoming connections every hour, the overhead of having to run a Perl script can slow the server down noticeably.
I have written, and am using on my own server, a C version of the program. This needs to be compiled, and the resulting binary should be installed in the same place that the Perl script would be installed.
Instead of configuring the C version by editing the source code, the configuration is done using environment variables which can, and in one case must, be specified before the program runs- which means either in the "run" script for the SMTP service, or as an add-on variable in your tcpserver access control file.
This is a list of the environment variables that the program looks for, and how they are used:
JGREYLIST
If this variable is present and empty, the connection will be immediately passed along to the next program in line. If it's present and not empty, the connection will be denied, and the value of this variable will be used as the error message for that client. If it's not present, the normal checks will be done. This is exactly the same way the Perl version works, and it's very similar to how the rblsmtpd program uses the RBLSMTPD variable.
JGREYLIST_DIR
REQUIRED. This variable must contain the full path to the "greylist directory", where the files are stored. This corresponds to the $greydir variable in the Perl version.
JGREYLIST_NOREV
If this variable is set to a non-empty value, connections from IP addresses without reverse DNS names will be rejected. This can be done here, but it usually makes more sense (and saves CPU cycles) to have tcpserver do it. This corresponds to the $block_norev variable in the Perl version.
Normally, the program tracks client addresses by the class-C block from which the client originated (i.e. the "first three numbers" of the IP address.) If you set this variable to a non-zero value, it will track the clients by their full IP address. I don't recommend it, as it can make the filesystem run out of inodes, but the option is there if you need it. This corresponds to the $list_c variable in the Perl version.
This is the number of seconds a client must wait after their first connection, before they can connect again and actually send mail. If this variable is not set, the default is 120 (two minutes.) This corresponds to the $time_grey variable in the Perl version.
If this variable contains a non-zero number, the disposition of each connection (i.e. "allowed", "denied", or "greylisted") will be logged. This option is turned on by default; if you want to turn it off, you will need to explicitly set the value to 0. (There is no corresponding variable in the Perl version. The only way to disable these log entries there is to find and comment out the appropriate lines in the code.)
If this variable contains a non-zero number, every log entry generated by the program will include the PID (process ID). This is generally a good idea. This option is turned on by default; if you want to turn it off, you will need to explicitly set the value to 0. This corresponds to the $log_pid variable in the Perl version.
If this variable contains a non-zero number, the actual commands and responses from the fake SMTP conversation will be sent to the log. This option is not turned on by default, because it can make your log files grow very large, very quickly. This corresponds to the $show_log variable in the Perl version.
JGREYLIST_TIMEOUT
This variable contains the maximum number of seconds each connection is allowed to waste in the fake SMTP conversation before the program forcibly hangs up on them. The program imposes a range of 5 to 300 seconds, with a default of 60. This corresponds to the $fake_max variable in the Perl version.
This variable contains the maximum number of RCPT commands that a client can send before the program will forcibly hang up on them. If this value is missing or zero, the RCPT commands will not be counted and there will be no forced hang-ups. The default is 0. This corresponds to the $max_rcpt variable in the Perl version.
In addition, the program makes use of the TCPREMOTEIP and TCPREMOTEHOST variables. These variables are normally set by tcpserver, however if you're testing the program you will need to put values into them. TCPREMOTEIP contains the IP address of the client, and (unless you're using tcpserver's "-H" option) TCPREMOTEHOST will contain the reverse-DNS name which corresponds to that IP address.
Note that the reverse DNS check triggered by the JGREYLIST_NOREV variable relies on the TCPREMOTEHOST variable having been set by tcpserver- it does not do any DNS queries by itself.
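As an example of how a numeric setting such as JGREYLIST_TIMEOUT might be read, here is a small C sketch. It is not the program's actual code: the function name is hypothetical, and clamping out-of-range values to the nearest limit (rather than, say, falling back to the default) is an assumption of the sketch.

```c
#include <stdlib.h>

/*
 * Turn the text of JGREYLIST_TIMEOUT into a number of seconds.
 * A missing or unparsable value gives the default of 60. Values
 * outside 5..300 are clamped (clamping is an assumption here).
 */
int grey_timeout(const char *value)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return 60;            /* not set, or set to nothing */

    n = strtol(value, &end, 10);
    if (end == value)
        return 60;            /* no digits at all */

    if (n < 5)
        return 5;
    if (n > 300)
        return 300;
    return (int)n;
}
```

It would be called as grey_timeout(getenv("JGREYLIST_TIMEOUT")), and the result passed to alarm() at the start of the fake SMTP conversation.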
2007-07-30 Found and fixed a few bugs... the original version DID work, but it didn't work exactly like the Perl version- it built the filenames differently, and under some conditions (sometimes but not always, no idea what caused it) the output sent to the client wasn't flushing correctly.
The updated version now uses the same filename scheme as the Perl version. If you tried a C version prior to v6, you will probably want to clean up your JGREYLIST_DIR with these commands (as root), after replacing the compiled binary with the new version (v6 or newer):
# cd /var/qmail/jgreylist
or whatever your JGREYLIST_DIR is set to
# find . -name "*.*" -exec rm -v {} \;
# find . -name "?" -exec rm -rv {} \;
# find . -name "??" -exec rm -rv {} \;
Thanks to Patrick "marlowe" McDonald and Egor Fisher for testing new versions, and for being patient while I get it nailed down.
Each successive version of the program identifies itself in the banner which is sent to clients who get a fake SMTP session. The first three versions identified themselves as just "jgreylist", as does the Perl version.
2007-08-19 I have received reports that the C version appears to be using a lot more CPU than the Perl version. The issue (and I say "issue", not "problem") has to do with how I wrote the program, and the fact that many systems' accounting routines consider the usleep() function to be "busy" time, rather than "sleeping" time. And it may be, I don't know- I didn't dig into the Linux kernel source code to find out.
I'm about 98% sure the problem has been resolved with version 7. If you're not interested in the technical explanation, skip down to the Download links below.
The problem has to do with how the read() function reacts when it receives signals. Normally, if no data has been received since read() started, it starts over after being interrupted by a signal. However, in this program that means that the SIGALRM signal, which the kernel sends JGREYLIST_TIMEOUT seconds after the beginning of a fake SMTP conversation, will only cause read() to return if the client happens to have sent a byte of data at the same time- otherwise it doesn't return until the client sends their next byte... and if the client is an attacker holding the connection open, this allows them to hold it open indefinitely.
It turns out that there is a way to force read() to return an error code whenever it's interrupted by a signal. I was not aware of this until earlier today, but after writing a few little test-bed programs, I have added the necessary function calls to set this up.
What this means is that instead of setting the socket to non-blocking mode, checking it ten times a second, and calling usleep() over and over, it can now just call read(), let it block normally (which the kernel does not consider "busy" time), and know that when the SIGALRM signal arrives, the fake SMTP conversation will end immediately.
For those who are REALLY interested, I added the following lines to make read() return when interrupted by a signal:
At the top of the file, you need:
#include <signal.h>
At the top of the function, add this declaration:
struct sigaction sa ;
AFTER setting up the signal handler, add the following:
signal ( SIGALRM , handle_alarm ) ;
alarm ( JGREYLIST_TIMEOUT ) ;
sigaction ( SIGALRM , 0 , &sa ) ;
sa.sa_flags &= ~SA_RESTART ;
sigaction ( SIGALRM , &sa , 0 ) ;
2007-08-21 I had a suggestion from Egor Fisher, that jgreylist could forcibly hang up on clients (such as spammers) who send too many RCPT commands, like what the VALIDRCPTTO_LIMIT variable does for the validrcptto.cdb patch. Version 8 includes this functionality.