All the cool web sites are posting privacy policies, and somebody asked me if I had one, so I figured I should write one.
Privacy is a very important issue to me. I know that I'm very hesitant about providing any kind of personally identifiable information when I visit web sites, and by the same token I'm being very careful to NOT collect any more information than I need to when people visit my sites.
To sum it all up in one sentence, I don't WANT your private information beyond what I might need for troubleshooting my own server, and I am doing everything I can to prevent my server from collecting any more than it needs to.
When you visit web pages on my server, the web server automatically keeps log files which contain the following:
The time and date of your visit.
What pages, graphics, and other entities you request from the server.
Your IP address, in this case "18.104.22.168".
The "agent" string, which usually identifies the browser you are using. In this case, "CCBot/2.0 (http://commoncrawl.org/faq/)".
If you are visiting a password-protected part of a site, the userid with which you access (or try to access) the page(s).
If your visit triggers any error condition, one or more error messages detailing what went wrong.
As you can see, it's fairly standard stuff. These are the same data which are collected by pretty much every web server on the internet.
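As a rough illustration of what one of these log entries looks like, here is a minimal Python sketch that splits a single line into the fields listed above. It assumes the widely used Apache "combined" log format; the actual format used on this server is not stated, and the sample line (including the 192.0.2.10 address and the /index.shtml path) is made up for the example. The combined format also carries a response size and a referer, which are not in the list above.

```python
import re

# Apache "combined" log format. This is an assumption: the server's real
# log format is not given in the policy.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical sample line, built from a documentation-only IP address.
sample = (
    '192.0.2.10 - - [08/Apr/2009:12:34:56 -0400] '
    '"GET /index.shtml HTTP/1.1" 200 5120 '
    '"-" "CCBot/2.0 (http://commoncrawl.org/faq/)"'
)

entry = parse_line(sample)
for field in ("time", "request", "ip", "agent", "user"):
    print(field, "=", entry[field])
```

The "user" field is "-" unless the request was for a password-protected area, which matches the userid item in the list.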
When qmail handles a message on my server, the following information is logged:
For INCOMING messages (received from a local process, or from the outside world via SMTP)
For OUTGOING messages (delivered to a local mailbox, or to another SMTP server)
If a message is delivered to a local mailbox (i.e. if my server is hosting the mailbox) then the message contents are physically stored in that mailbox until the mailbox owner deletes it.
This machine is an authoritative DNS server for several domains, and also serves an RBL (realtime blacklist) containing entries for IPs from which I do not wish to receive mail. As such, it receives and answers requests from the outside world regarding those domains. The following information is logged for each query:
This machine is running an XMPP, or "Jabber", server. This is a form of IM (instant messaging) which does not rely on any one central server, such as "AOL Messenger", "MSN Messenger", or "Yahoo Chat". The XMPP protocol is used as the underlying protocol of "Google Chat", which means that Gmail users are able to chat with other Jabber users.
At a former job, I watched somebody (a sheriff's deputy in uniform, standing in my office) call the operator of one of the largest IM networks in the world, and ask for a list of everybody who a particular user had been chatting with for the previous three months. Twenty minutes later, this company emailed him SIX months' worth of FULL TRANSCRIPTS of the conversations. There was no court order, subpoena, or even a FAX on department letterhead... just a voice on the phone who claimed to be "Deputy so-and-so from the XYZ Sheriff's department."
The jabber services on this machine only log the following:
When the services start, stop, connect, and disconnect with each other internally. (The "jabber server" is actually several different processes working together.)
When any "s2s" (server to server) connections start or end, and whether or not the connections are encrypted. These connections are used to pass messages from my jabber server to other jabber servers.
When any "c2s" (client to server) connections start or end, and whether or not the connections are encrypted. These connections are made from the clients on the users' machines to my server.
To say it very plainly, no message contents are ever logged. Neither is the message routing information (i.e. "John sent a message to Frank"). However, by correlating the times of the c2s and s2s connections, it may be possible to figure out that one or more messages were passed between two or more users (i.e. traffic analysis).
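To show what that kind of traffic analysis might look like, here is a small Python sketch that pairs c2s connection times with s2s connection times falling within a few seconds of each other. Everything in it is hypothetical: the user names, the remote server name, the timestamps, and the five-second window are invented for the example, and the real log format is not shown here.

```python
from datetime import datetime, timedelta

def correlate(c2s_events, s2s_events, window=timedelta(seconds=5)):
    """Pair each c2s event with every s2s event within `window` of it.

    Each event is a (name, timestamp) tuple. A match only suggests that
    traffic may have flowed; it proves nothing about message contents.
    """
    pairs = []
    for user, user_time in c2s_events:
        for server, server_time in s2s_events:
            if abs(server_time - user_time) <= window:
                pairs.append((user, server))
    return pairs

# Hypothetical connection times taken from imaginary logs.
c2s = [
    ("john", datetime(2009, 4, 8, 12, 0, 0)),
    ("alice", datetime(2009, 4, 8, 15, 30, 0)),
]
s2s = [("example.org", datetime(2009, 4, 8, 12, 0, 2))]

matches = correlate(c2s, s2s)
print(matches)
```

Here only "john" lines up with the connection to the other server, which is the most this sort of analysis can say.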
The closest thing to logging messages is the PostgreSQL database, which contains each user's "buddy list" and authentication information, as well as any undelivered messages. If a user sends a message to somebody who isn't connected, or receives a message while they are not connected, the messages are held in the database until they can be delivered to the recipient. Once the message is delivered, it is deleted from the database.
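The hold-then-delete behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of PostgreSQL so that it runs on its own, and the table name "offline_queue", its columns, and the two function names are all hypothetical rather than the jabber server's real schema.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL database (an assumption
# made so the example is self-contained).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE offline_queue ("
    "id INTEGER PRIMARY KEY, recipient TEXT, body TEXT)"
)

def store_offline(recipient, body):
    """Hold a message for a recipient who is not currently connected."""
    db.execute(
        "INSERT INTO offline_queue (recipient, body) VALUES (?, ?)",
        (recipient, body),
    )
    db.commit()

def deliver_pending(recipient):
    """Return the held messages for a recipient and delete them."""
    rows = db.execute(
        "SELECT id, body FROM offline_queue WHERE recipient = ? ORDER BY id",
        (recipient,),
    ).fetchall()
    db.executemany(
        "DELETE FROM offline_queue WHERE id = ?",
        [(row_id,) for row_id, _ in rows],
    )
    db.commit()
    return [body for _, body in rows]

store_offline("frank@example.net", "Are you there?")
first = deliver_pending("frank@example.net")   # message handed over
second = deliver_pending("frank@example.net")  # nothing left in the table
print(first, second)
```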
The web server's access logs are kept for a little over a month, in order to produce monthly statistics on how much traffic the site is getting, and deleted automatically. I don't want to keep them, both for privacy reasons, and because they take up space.
The web server's error logs are not automatically deleted; however, I manually trim them when they get too big.
Other log files, including the logs from my DNS and mail servers, are rotated on a daily basis, and any log files older than seven days are automatically deleted, again for both privacy and space reasons.
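A seven-day clean-up like the one described above could be written along these lines. This is a sketch, not the script actually used on this server: the function name and the idea of judging age by each file's modification time are assumptions made for the example.

```python
import time
from pathlib import Path

SEVEN_DAYS = 7 * 24 * 60 * 60  # retention period in seconds

def prune_old_logs(log_dir, max_age_seconds=SEVEN_DAYS, now=None):
    """Delete regular files in log_dir older than max_age_seconds.

    Age is judged by modification time (an assumption). Returns the
    names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(log_dir).iterdir()):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

Run once a day against the directory holding the rotated logs, something like this would keep roughly a week of history and nothing more.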
Cookies are small pieces of text which a web server hands to a browser, and which a browser sends back to that web server whenever requesting a new page. Cookies are used to overcome a limitation of the HTTP protocol, the lack of support for persistent sessions. Web servers handle requests from multiple clients at the same time, and each request is logically separate from any other request. By using cookies, a server-side application can keep different clients separate from each other, and can maintain a "state" for each client. This makes things like shopping carts possible.
The cookie mechanism can be abused, however. If a page includes content (such as banner ads) from a third party, and that third party includes cookies with their responses (as all banner-ad companies do), your browser will normally keep their cookies as well. If you then visit a page on a different site which happens to include content from the same third party, that first cookie goes back with the new request, which means this third party is able to track your visits to BOTH web sites under the same identifier. This behaviour is exactly why DoubleClick was sued back in January 2000, a case which they ultimately settled after two and a half years.
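The tracking described above can be modelled in a few lines of Python. This is a toy simulation, not how any real ad network is implemented: the AdServer and Browser classes, the "ads.example" domain, and the site names are all hypothetical.

```python
import uuid

class AdServer:
    """A hypothetical third party that hands out an ID cookie with each ad."""

    def __init__(self):
        self.profiles = {}  # cookie value -> sites where an ad was shown

    def serve_ad(self, cookie, embedding_site):
        if cookie is None:
            cookie = uuid.uuid4().hex  # first contact: issue a new identifier
        self.profiles.setdefault(cookie, []).append(embedding_site)
        return cookie

class Browser:
    def __init__(self, accept_third_party=True):
        self.accept_third_party = accept_third_party
        self.cookie_jar = {}  # domain -> cookie value

    def visit(self, site, ad_server, ad_domain="ads.example"):
        # Any cookie already stored for the ad domain is sent back.
        sent = self.cookie_jar.get(ad_domain)
        received = ad_server.serve_ad(sent, site)
        if self.accept_third_party:
            self.cookie_jar[ad_domain] = received

# A browser with default settings: both visits land in ONE profile.
tracker = AdServer()
browser = Browser()
browser.visit("site-a.example", tracker)
browser.visit("site-b.example", tracker)
print(list(tracker.profiles.values()))

# A browser refusing third-party cookies: two unrelated one-visit profiles.
careful_tracker = AdServer()
careful = Browser(accept_third_party=False)
careful.visit("site-a.example", careful_tracker)
careful.visit("site-b.example", careful_tracker)
print(len(careful_tracker.profiles))
```

In the first case the ad server can tie both sites to the same identifier; in the second it sees two visitors it cannot connect.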
The qmail site, and parts of the main www.jms1.net site, allow you to choose a colour scheme for the site. At the moment the choices are the default "black background" or a boring "white background", however the mechanism can be extended if I ever decide to write more stylesheets. When you choose a colour scheme, the server sends a cookie to your browser which looks like "stylesheet=black" or "stylesheet=white", and the entire mechanism is explained on this web page. As you can see, there is no personally identifiable information in the cookie- just your preference for the background colour.
The cookie itself is shared with other sites ending with ".jms1.net", so if you like the white background on one site, all of my sites (which are written to use the selectable background mechanism) will respect your preference. The cookie itself is sent with a one-year expiration date.
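For the curious, a header carrying such a cookie could be built like this with Python's standard http.cookies module. It is a sketch only: the real site's code is not shown here, the function name is hypothetical, and using Max-Age to express the one-year lifetime is an assumption (the server may well send an Expires date instead).

```python
from http.cookies import SimpleCookie

ONE_YEAR = 365 * 24 * 60 * 60  # cookie lifetime in seconds

def stylesheet_cookie(choice):
    """Build a Set-Cookie header holding only the colour-scheme preference."""
    if choice not in ("black", "white"):
        raise ValueError("unknown colour scheme: %r" % (choice,))
    cookie = SimpleCookie()
    cookie["stylesheet"] = choice
    # The leading dot shares the cookie with every host under jms1.net.
    cookie["stylesheet"]["domain"] = ".jms1.net"
    # Max-Age is an assumption; an Expires date would work as well.
    cookie["stylesheet"]["max-age"] = ONE_YEAR
    return cookie.output()

header = stylesheet_cookie("white")
print(header)
```

Nothing in the header identifies the visitor; the only value stored is "black" or "white".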
The jabberd2 index page on the www.jms1.net site uses AJAX to have the server calculate the values for the SRV records. The relevant portion of the page makes it very clear how to view the script (i.e. view the page source) and offers a link to the code running on the server.
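To give an idea of the kind of calculation involved, here is a short Python sketch that produces the two standard XMPP SRV records for a domain. It is not the script running on the server: the function name, the zone-file output style, the TTL, and the example host names are assumptions. The service labels and ports (5222 for clients, 5269 for servers) are the standard XMPP ones.

```python
def srv_records(domain, host, ttl=86400):
    """Return zone-file style SRV lines pointing a domain at a jabber host."""
    services = [
        ("_xmpp-client._tcp", 5222),  # c2s connections
        ("_xmpp-server._tcp", 5269),  # s2s connections
    ]
    return [
        f"{service}.{domain}. {ttl} IN SRV 0 0 {port} {host}."
        for service, port in services
    ]

# Hypothetical domain and host names.
for line in srv_records("example.com", "jabber.example.com"):
    print(line)
```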
My recommendation, and what I do myself, is to use Firefox with the NoScript and Adblock or Adblock Plus plug-ins, and configure Firefox to either not accept third-party cookies (i.e. cookies from Google when you're actually visiting my web site) or to delete all cookies when Firefox exits (which is what I do.)
I make no secret of the fact that I use Google ads on my site. I do it because it's relatively non-intrusive (to the page) and because when people click on the ads (not just look at them), Google sends me money. However, Google's ad system can be considered somewhat intrusive from a privacy standpoint, so you may wish to either block the ads, or at least block the cookies involved with the ads. Please see below for a better explanation.
I have been participating in Google's AdSense program for the past few years, allowing Google to place banner ads at the bottom of most of the pages on the site.
Google has notified me that on 2009-04-08, they will start using what they call "interest-based advertising." I'm supposed to refer people to Google's Advertising and Privacy page for information about Google's policies, so I've provided the link.
With that out of the way, here's what I think is actually happening. You may remember that Google bought out DoubleClick a while back. It looks like Google is going to start using DoubleClick's system to select the ads it shows, rather than using their current system, which chooses ads based on the content of the page in which the ad appears. This also means that any Google ads served will be adding to that database of what people are interested in- basically, anybody who sees a Google ad served by my web site will be adding "qmail" to the list of things DoubleClick knows they're interested in.
I'm not comfortable helping Google, or anybody, build a database like this- especially since its primary use is to show advertising, which is something I don't particularly like to begin with.
So I'm going to make a few recommendations about how you can protect yourself from what I personally see as an invasive system.
Visit http://optout.doubleclick.net/cgi-bin/optoutgoogle.pl to reset your DART cookie's value to "OPT_OUT". This supposedly keeps DoubleClick (and now Google) from collecting information about your visit, assigning you a different DART cookie, or otherwise tracking your online activities. It will not, however, prevent you from seeing the ads themselves- it just means that the ads you see will not be targeted to an interest profile.
Use Firefox, and configure it to not accept third-party cookies (which means if you're looking at a page on "qmail.jms1.net", Firefox will ignore cookies sent by other servers.) If you already have such a cookie, Firefox will still send it with requests to the domain contained in the cookie, so when you turn on this setting, make sure to delete all of your existing cookies (or at least the ones you're not comfortable with.) This will not prevent you from seeing the ads, but it will prevent Firefox from storing the cookies which the ad vendors try to send you, which means they won't be able to track you as easily.
You may also want to configure Firefox to remove all cookies when it exits. Again, this will not prevent you from seeing the ads, but it will prevent DoubleClick from "following" you from one web page to another.
Use the Adblock or Adblock Plus plug-ins for Firefox. Adblock allows you to create a list of URL fragments which, if found in a link, will not be retrieved. Adblock Plus adds the ability to use a pre-defined list of blocked items from one of several servers around the world. If you use the normal Adblock, make sure to block the "doubleclick.net" and "googlesyndication.com" domains, which are used by the old DoubleClick and the current Google ad-serving systems.
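The "URL fragment" idea is simple enough to sketch in Python. This is only a model of the matching rule as described above, not Adblock's actual code; the function name and the sample URLs are hypothetical.

```python
# The two domains named above, used as plain substrings.
BLOCKED_FRAGMENTS = ["doubleclick.net", "googlesyndication.com"]

def is_blocked(url, fragments=BLOCKED_FRAGMENTS):
    """Return True if any blocked fragment appears anywhere in the URL."""
    return any(fragment in url for fragment in fragments)

# Hypothetical URLs for illustration.
print(is_blocked("http://ad.doubleclick.net/adj/example"))
print(is_blocked("http://qmail.jms1.net/"))
```

A request whose URL contains one of the fragments is never made, so neither the ad nor its cookie reaches the browser.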
Privoxy is a non-caching web proxy which removes most banner ads and tracking cookies. It's meant for use with browsers which don't have the same level of customization that Firefox does. This is what I used to use before Firefox added the option to delete cookies when Firefox exits.
I'm not entirely comfortable with the idea of continuing to host AdSense ads, if they're going to tie into the DoubleClick cookies. I've been blocking both the ads and the cookies from my own browsers for several years; however, others may not mind being tracked like this. If you don't mind being tracked (and occasionally clicking an ad or two) then I don't mind if Google wants to pay me for your clicks. The point is that it's an INFORMED decision on your part.
If an email is sent to one of my honeypot addresses, a full copy of the message is retained as evidence in case I need to research a message which may not be spam. These evidence files are kept for a minimum of one year.
Email messages sent to a legitimate email address on this server may be retained within that mailbox indefinitely, at the discretion of the owner of that mailbox. Personally speaking, I tend to be a bit of a pack-rat, mostly because I'm not sure if I'll need to refer back to something in the future, and because I don't really have the time to go through all of my stored messages to find and delete the things I don't really need. However, I'm getting a little bit better about deleting current things which I don't really need to keep...
If you're going to send anything private or sensitive, or even if not, I highly recommend you use encryption, such as the free GnuPG (or GPG4Win, for Windows users) or the commercial PGP products. My public key is available here.
If you have any questions about this policy, please contact me at the email address listed below.