All the cool web sites are posting privacy policies, and somebody asked me if I had one, so I figured I should write one.
Privacy is a very important issue to me. I know that I'm very hesitant about providing any kind of personally identifiable information when I visit web sites, and by the same token I'm being very careful to NOT collect any more information than I need to when people visit my sites.
To sum it all up in one sentence, I don't WANT your private information beyond what I might need for troubleshooting my own server, and I am doing everything I can to prevent my server from collecting any more than it needs to.
When you visit web pages on my server, the web server automatically keeps log files which contain the following:
The time and date of your visit.
What pages, graphics, and other entities you request from the server.
Your IP address, in this case "35.173.48.18".
The "agent" string, which usually identifies the browser you are using. In this case, "CCBot/2.0 (https://commoncrawl.org/faq/)".
If you are visiting a password-protected part of a site, the userid with which you access (or try to access) the page(s).
If your visit triggers any error condition, one or more error messages detailing what went wrong.
As you can see, it's fairly standard stuff. These are the same data which are collected by pretty much every web server on the internet.
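To make the list above a little more concrete, here is a minimal sketch of how those fields could be pulled out of one access-log line. It assumes the widely used "combined" log layout, which this policy does not actually confirm for my server; the sample line, its timestamp, and the requested path are made up for illustration.

```python
import re

# Pattern for the common "combined" access-log layout. This layout is an
# assumption; the policy text does not say which log format is in use.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the logged fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical sample line; only the IP and agent string come from the text above.
sample = (
    '35.173.48.18 - - [01/Jan/2020:12:00:00 +0000] '
    '"GET /example.html HTTP/1.1" 200 1234 "-" '
    '"CCBot/2.0 (https://commoncrawl.org/faq/)"'
)
fields = parse_line(sample)
for name in ("time", "request", "ip", "agent", "user"):
    print(name, "=", fields[name])
```

The "user" field is a dash unless the request was for a password-protected page, which matches the userid item in the list.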
When qmail handles a message on my server, the following information is logged:
For INCOMING messages (received from a local process, or from the outside world via SMTP)
For OUTGOING messages (delivered to a local mailbox, or to another SMTP server)
If a message is delivered to a local mailbox (i.e. if my server is hosting the mailbox) then the message contents are physically stored in that mailbox until the mailbox owner deletes it.
This machine is an authoritative DNS server for several domains, and also serves an RBL (realtime blacklist) containing entries for IPs from which I do not wish to receive mail. As such, it receives and answers requests from the outside world regarding those domains. The following information is logged for each query:
This machine is running an XMPP, or "Jabber", server. This is a form of IM (instant messaging) which, unlike "AOL Messenger", "MSN Messenger", or "Yahoo Chat", does not rely on any one central server. XMPP is also the underlying protocol of "Google Chat", which means that Gmail users are able to chat with other Jabber users.
At a former job, I watched somebody (a sheriff's deputy in uniform, standing in my office) call the operator of one of the largest IM networks in the world, and ask for a list of everybody who a particular user had been chatting with for the previous three months. Twenty minutes later, this company emailed him SIX months' worth of FULL TRANSCRIPTS of the conversations. There was no court order, subpoena, or even a FAX on department letterhead... just a voice on the phone who claimed to be "Deputy so-and-so from the XYZ Sheriff's department."
The jabber services on this machine only log the following:
When the services start, stop, connect, and disconnect with each other internally. (The "jabber server" is actually several different processes working together.)
When any "s2s" (server to server) connections start or end, and whether or not the connections are encrypted. These connections are used to pass messages from my jabber server to other jabber servers.
When any "c2s" (client to server) connections start or end, and whether or not the connections are encrypted. These connections are made from the clients on the users' machines to my server.
To say it very plainly, no message contents are ever logged. Neither is the message routing information (i.e. "John sent a message to Frank"). However, by correlating the times of the c2s and s2s connections, it may be possible to figure out that one or more messages were passed between two or more users (i.e. traffic analysis.)
The closest thing to logging messages is the PostgreSQL database, which contains each user's "buddy list" and authentication information, as well as any undelivered messages. If a user sends a message to somebody who isn't connected, or receives a message while they are not connected, the messages are held in the database until they can be delivered to the recipient. Once the message is delivered, it is deleted from the database.
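The hold-then-delete behaviour described above can be sketched roughly as follows. This is not the jabber server's real schema or code: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table name, column names, and function names are all made up for illustration.

```python
import sqlite3

def open_store():
    """Create a throwaway store (SQLite stand-in for the real PostgreSQL database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE offline_queue ("
        "id INTEGER PRIMARY KEY, recipient TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn

def hold(conn, recipient, body):
    """Store a message for a recipient who is not currently connected."""
    with conn:
        conn.execute(
            "INSERT INTO offline_queue (recipient, body) VALUES (?, ?)",
            (recipient, body),
        )

def deliver_pending(conn, recipient):
    """Return the held messages for a recipient and delete them from the store."""
    with conn:  # one transaction: read the rows, then remove exactly those rows
        rows = conn.execute(
            "SELECT id, body FROM offline_queue WHERE recipient = ? ORDER BY id",
            (recipient,),
        ).fetchall()
        conn.executemany(
            "DELETE FROM offline_queue WHERE id = ?",
            [(row_id,) for row_id, _ in rows],
        )
    return [body for _, body in rows]

def pending_count(conn):
    """Number of messages still waiting for delivery."""
    return conn.execute("SELECT COUNT(*) FROM offline_queue").fetchone()[0]
```

The point of the sketch is the second function: once a message has been handed to the recipient, nothing about it remains in the table.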
The web server's access logs are kept for a little over a month, in order to produce monthly statistics on how much traffic the site is getting, and deleted automatically. I don't want to keep them, both for privacy reasons, and because they take up space.
The web server's error logs are not automatically deleted; however, I manually trim them when they get too big.
Other log files, including the logs from my DNS and mail servers, are rotated on a daily basis, and any log files older than seven days are automatically deleted, again for both privacy and space reasons.
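For anybody curious what a seven-day cleanup like this might look like, here is a minimal sketch. It is not the script my server actually runs; the function names are made up, and it assumes that a log file's age can be judged from its modification time.

```python
import time
from pathlib import Path

MAX_AGE_DAYS = 7  # the retention period stated above

def expired_logs(log_dir, now=None, max_age_days=MAX_AGE_DAYS):
    """Return the files in log_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(log_dir).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )

def purge(log_dir, now=None):
    """Delete the expired files and return the names that were removed."""
    removed = []
    for path in expired_logs(log_dir, now):
        path.unlink()
        removed.append(path.name)
    return removed
```

Running something like purge() once a day, right after the daily rotation, would give the behaviour described: nothing older than a week survives.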
Cookies are small pieces of text which a web server hands to a browser, and which a browser sends back to that web server whenever requesting a new page. Cookies are used to overcome a limitation of the HTTP protocol, the lack of support for persistent sessions. Web servers handle requests from multiple clients at the same time, and each request is logically separate from any other request. By using cookies, a server-side application can keep different clients separate from each other, and can maintain a "state" for each client. This makes things like shopping carts possible.
The cookie mechanism can be abused, however. If a page includes content (such as banner ads) from a third party, and that third party includes cookies with their responses (as all banner-ad companies do), your browser will normally keep their cookies as well- and if you visit some other page on a different site, which happens to include content from the same third party, that first cookie goes back with the new request- which means this third party is able to track your visits to BOTH web sites under the same identifier. This behaviour is exactly why DoubleClick was sued back in January 2000- a case which they ultimately settled after two and a half years.
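The tracking described above is easier to see in a toy model. The sketch below is a simulation only, with no real network traffic; the class names, the site names, and the "ads.example" domain are all made up. It shows one third party handing out an identifier and then seeing that same identifier come back from pages on two unrelated sites.

```python
import itertools

class AdServer:
    """A third party whose content is embedded in pages on many sites."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.seen = {}  # visitor id -> list of sites where the ad was shown

    def serve(self, cookie, embedding_site):
        """Record the visit and return the cookie value to send back."""
        visitor = cookie if cookie is not None else f"id{next(self._ids)}"
        self.seen.setdefault(visitor, []).append(embedding_site)
        return visitor

class Browser:
    """A browser with a cookie jar keyed by the domain that set each cookie."""

    def __init__(self, accept_third_party=True):
        self.accept_third_party = accept_third_party
        self.jar = {}

    def visit(self, site, ad_domain, ad_server):
        """Load a page on `site` which embeds content from `ad_domain`."""
        sent = self.jar.get(ad_domain)  # existing cookie, if any, goes back
        received = ad_server.serve(sent, site)
        if self.accept_third_party:
            self.jar[ad_domain] = received

ads = AdServer()
browser = Browser()
browser.visit("site-a.example", "ads.example", ads)
browser.visit("site-b.example", "ads.example", ads)
print(ads.seen)  # one identifier, linked to both sites
```

If the browser is created with accept_third_party=False, the ad server never gets its cookie back, so each visit looks like a new visitor and the two sites are not linked.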
The only times I use cookies on my web sites are:
The qmail site, and parts of the main www.jms1.net site, allow you to choose a colour scheme for the site. At the moment the choices are the default "black background" or a boring "white background", however the mechanism can be extended if I ever decide to write more stylesheets. When you choose a colour scheme, the server sends a cookie to your browser which looks like "stylesheet=black" or "stylesheet=white", and the entire mechanism is explained on this web page. As you can see, there is no personally identifiable information in the cookie- just your preference for the background colour.
The cookie itself is shared with other sites ending with ".jms1.net", so if you like the white background on one site, all of my sites (which are written to use the selectable background mechanism) will respect your preference. The cookie itself is sent with a one-year expiration date.
When writing or maintaining web sites for clients, I may need to use cookies to hold a session identifier. I don't consider these sites to really be "my" web sites, since they are written for clients, but they may be hosted on one of my servers while in development so I figure it's better to at least mention them here.
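As an illustration of the colour-scheme cookie described above, here is a minimal sketch using Python's standard http.cookies module. It is not the code my sites actually run. The cookie name, the two values, the ".jms1.net" domain, and the one-year lifetime come from the description above; the "/" path and the function names are assumptions.

```python
from http.cookies import SimpleCookie

ONE_YEAR = 365 * 24 * 60 * 60  # seconds
SCHEMES = ("black", "white")

def stylesheet_cookie(scheme):
    """Build the value of a Set-Cookie header for the chosen colour scheme."""
    if scheme not in SCHEMES:
        raise ValueError("unknown colour scheme: %r" % scheme)
    cookie = SimpleCookie()
    cookie["stylesheet"] = scheme
    cookie["stylesheet"]["domain"] = ".jms1.net"  # shared across my sites
    cookie["stylesheet"]["path"] = "/"            # assumed, not stated above
    cookie["stylesheet"]["expires"] = ONE_YEAR    # an int means "seconds from now"
    return cookie["stylesheet"].OutputString()

def read_scheme(cookie_header, default="black"):
    """Pick the scheme out of a Cookie request header, else use the default."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("stylesheet")
    if morsel is not None and morsel.value in SCHEMES:
        return morsel.value
    return default

print(stylesheet_cookie("white"))
```

Note that the only thing the cookie carries is the word "black" or "white"; anything else a browser sends back is ignored and the default is used.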
Some web sites embed javascript code into their pages which finds ways to track you. I don't do anything like that. The only times I use javascript are:
The jabberd2 index page on the www.jms1.net site uses AJAX to have the server calculate the values for the SRV records. The relevant portion of the page makes it very clear how to view the script (i.e. view the page source) and offers a link to the code running on the server.
There is another version of the SRV record generator which uses pure javascript within the page to compute the SRV records, rather than using AJAX to pass the data back to the server. Again, the script can be inspected by viewing the page source.
There is a testing page on the www.jms1.net site where I was playing with AJAX to make sure I understood how it worked. When the page loads, it calls a function which requests a count (the number of IPs in my RBL) from the server. There is also a button to repeat the request, without reloading the entire page. The javascript code can be viewed in the page source (it's actually very simple) and the server-side code is a simple CGI which does a "SELECT COUNT(*)" query from the rbl database and returns the total.
Many of my pages have a block of javascript which includes a Google banner ad at the bottom. The javascript itself was supplied by Google. The code in my page sets a few variables to configure the appearance of the ad block, and then loads a single javascript file from Google's server. Google's server does set a cookie, which means when you visit other sites before or after visiting my site, Google is able to track all of the sites you have visited which contain their ads- and they do track you.
My recommendation, and what I do myself, is to use Firefox with the NoScript and Adblock or Adblock Plus plug-ins, and configure Firefox to either not accept third-party cookies (i.e. cookies from Google when you're actually visiting my web site) or to delete all cookies when Firefox exits (which is what I do.)
I make no secret of the fact that I use Google ads on my site. I do it because it's relatively non-intrusive (to the page) and because when people click on the ads (not just look at them), Google sends me money. However, Google's ad system can be considered somewhat intrusive from a privacy standpoint, so you may wish to either block the ads, or at least block the cookies involved with the ads. Please see below for a better explanation.
The menus at the top of the qmail site are NOT done using javascript, they are pure CSS (stylesheets.) Each menu exists, but has a location which is far off of the left side of the browser window (and therefore not visible.) When the mouse hovers over the proper element, the menu's location is changed to make it visible within the browser window. The idea, and much of the code, came from this web site.
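Going back to the SRV record generator mentioned above: the calculation itself is small, and a rough sketch of that kind of computation is shown below. This is not the script from the jabberd2 page. It assumes the standard XMPP service labels and ports (5222 for clients, 5269 for other servers), and the domain, target host, TTL, priority, and weight values are placeholders.

```python
def srv_records(domain, target, ttl=86400, priority=10, weight=0):
    """Return zone-file style SRV lines pointing an XMPP domain at a host."""
    services = (
        ("_xmpp-client._tcp", 5222),  # c2s: clients connecting to the server
        ("_xmpp-server._tcp", 5269),  # s2s: other servers connecting in
    )
    return [
        f"{service}.{domain}. {ttl} IN SRV {priority} {weight} {port} {target}."
        for service, port in services
    ]

# Placeholder names for illustration only.
for line in srv_records("example.com", "jabber.example.com"):
    print(line)
```

Nothing in a calculation like this needs to know who you are, which is why either version of the generator (server-side or pure javascript) is harmless from a privacy standpoint.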
I have been participating in Google's AdSense program for the past few years, allowing Google to place banner ads at the bottom of most of the pages on the site.
The AdSense program is implemented by my including a block of HTML code supplied by Google at the bottom of every page. This block of HTML does use both javascript and cookies. The javascript code and the cookies are both tied to, and supplied by, Google. In particular, those cookies are never sent to my server- they are only sent to Google's servers, when your browser retrieves the content for the ad itself. My only involvement is that, whenever somebody clicks on an ad, Google adds a little bit of money to my account- and when the account reaches USD $100.00, they send me a check. I'm certainly not getting rich off of it, but it is kinda nice to get those checks every once in a while.
Google has notified me that on 2009-04-08, they will start using what they call "interest-based advertising." I'm supposed to refer people to Google's Advertising and Privacy page for information about Google's policies, so I've provided the link.
With that out of the way, here's what I think is actually happening. You may remember that Google bought out DoubleClick a while back. It looks like Google is going to start using DoubleClick's system to select the ads it shows, rather than using their current system, which chooses ads based on the content of the page in which the ad appears. This also means that any Google ads served will be adding to that database of what people are interested in- basically, anybody who sees a Google ad served by my web site will be adding "qmail" to the list of things DoubleClick knows they're interested in.
I'm not comfortable helping Google, or anybody, build a database like this- especially since its primary use is to show advertising, which is something I don't particularly like to begin with.
So I'm going to make a few recommendations about how you can protect yourself from what I personally see as an invasive system.
Visit http://optout.doubleclick.net/cgi-bin/optoutgoogle.pl to reset your DART cookie's value to "OPT_OUT". This supposedly keeps DoubleClick (and now Google) from collecting information about your visit, assigning you a different DART cookie, or otherwise tracking your online activities. It will not, however, prevent you from seeing the ads themselves- it just means that the ads you see will not be targeted to an interest profile.
Use Firefox, and configure it to not accept third-party cookies (which means if you're looking at a page on "qmail.jms1.net", Firefox will ignore cookies sent by other servers.) If you already have such a cookie, Firefox will still send it with requests to the domain contained in the cookie, so when you turn on this setting, make sure to delete all of your existing cookies (or at least the ones you're not comfortable with.) This will not prevent you from seeing the ads, but it will prevent Firefox from storing the cookies which the ad vendors try to send you, which means they won't be able to track you as easily.
You may also want to configure Firefox to remove all cookies when it exits. Again, this will not prevent you from seeing the ads, but it will prevent DoubleClick from "following" you from one web page to another.
Use the Adblock or Adblock Plus plug-ins for Firefox. Adblock allows you to create a list of URL fragments which, if found in a link, will not be retrieved. Adblock Plus adds the ability to use a pre-defined list of blocked items from one of several servers around the world. If you use the normal Adblock, make sure to block the "doubleclick.net" and "googlesyndication.com" domains, which are used by the old DoubleClick and the current Google ad-serving systems.
Privoxy is a non-caching web proxy which removes most banner ads and tracking cookies. It's meant for use with browsers which don't have the same level of customization that Firefox does. This is what I used to use before Firefox added the option to delete cookies when Firefox exits.
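The "list of URL fragments" idea from the Adblock recommendation above boils down to a substring check. Here is a minimal sketch of that idea; it is not Adblock's actual code, the function name is made up, and the example URLs are hypothetical. The two fragments are the domains named above.

```python
BLOCKED_FRAGMENTS = ("doubleclick.net", "googlesyndication.com")

def is_blocked(url, fragments=BLOCKED_FRAGMENTS):
    """Return True if any blocked fragment appears anywhere in the URL."""
    url = url.lower()
    return any(fragment in url for fragment in fragments)

# Hypothetical URLs for illustration.
for url in (
    "http://ads.doubleclick.net/banner.gif",
    "http://pagead.googlesyndication.com/ads.js",
    "http://qmail.jms1.net/index.shtml",
):
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

A blocked URL is simply never requested, so the ad server never sees the request and never gets the chance to send or receive a cookie.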
I'm not entirely comfortable with the idea of continuing to host Adsense ads, if they're going to tie into the DoubleClick cookies. I've been blocking both the ads and the cookies from my own browsers for several years, however others may not mind being tracked like this- and if you don't mind being tracked (and occasionally clicking an ad or two) then I don't mind if Google wants to pay me for your clicks. The point is that it's an INFORMED decision on your part.
If an email is sent to one of my honeypot addresses, a full copy of the message is retained as evidence in case I need to research a message which may not be spam. These evidence files are kept for a minimum of one year.
Email messages sent to a legitimate email address on this server may be retained within that mailbox indefinitely, at the discretion of the owner of that mailbox. Personally speaking, I tend to be a bit of a pack-rat, mostly because I'm not sure if I'll need to refer back to something in the future, and because I don't really have the time to go through all of my stored messages to find and delete the things I don't really need. However, I'm getting a little bit better about deleting current things which I don't really need to keep...
Messages sent to a mailing list hosted on the server may be kept in the list archives indefinitely.
If you're going to send anything private or sensitive, or even if not, I highly recommend you use encryption, such as the free GnuPG (or GPG4Win, for Windows users) or the commercial PGP products. My public key is available here.
If you have any questions about this policy, please contact me at the email address listed below.