To demonstrate just how much of a problem it really is, I have created a system that counts spam as it is identified by my spam filtering program (SpamBayes), places the count into a database (RRDtool) and periodically updates charts showing the rate of incoming spam over a period of time (see bottom of page for more detail).
People have been asking me for the source code for this page. The truth is there isn't much source code needed to get this going. There are probably a myriad of other ways to accomplish the same thing; this is just one of them.
I use procmail and SpamBayes to filter my mail. SpamBayes comes with a handy script called hammie that you use in your procmail rules. Hammie inserts a header into every e-mail giving its determination of whether the message is spam (the X-Spambayes-Classification header matched in the rules below).
Before anything else, you will need to create the RRD file. (For more information on RRDtool, search the web.) I don't remember exactly how I created mine, but it was something like:
rrdtool create spam.rrd --step 3600 DS:count:ABSOLUTE:864000:0:100000 \
RRA:AVERAGE:.5:1:87600 \
RRA:MIN:.5:288:3650 \
RRA:MAX:.5:288:3650
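To see what those RRA arguments add up to, here is a small sketch that works out how much history each archive keeps. It assumes the usual `RRA:CF:xff:steps:rows` layout, where each row consolidates `steps` primary data points of `--step` seconds each. The helper name `retention_seconds` is made up for the illustration.

```python
# Illustrative arithmetic only: how much history each RRA above retains.
STEP = 3600   # seconds per primary data point (--step 3600)
DAY = 86400   # seconds in a day

def retention_seconds(steps_per_row, rows, step=STEP):
    """Time span covered by one archive: step * steps_per_row * rows."""
    return step * steps_per_row * rows

# (steps per row, rows), copied from the rrdtool create command
archives = [
    ("AVERAGE", 1, 87600),   # RRA:AVERAGE:.5:1:87600
    ("MIN", 288, 3650),      # RRA:MIN:.5:288:3650
    ("MAX", 288, 3650),      # RRA:MAX:.5:288:3650
]

for name, steps, rows in archives:
    hours_per_row = steps * STEP // 3600
    days = retention_seconds(steps, rows) // DAY
    print("%s: one row per %d hour(s), %d days of history"
          % (name, hours_per_row, days))
```

With these numbers the hourly AVERAGE archive spans 3650 days (about ten years), and each MIN or MAX row covers 288 hours (12 days).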
Here is the relevant part of my .procmailrc file. This calls the spamcount.py script whenever spam is encountered:
# Call spambayes hammie
:0fw
| /usr/local/bin/hammiefilter.py
# SPAM?
:0 c: /home/grisha/.procmail/spamcount.lck
* ^X-Spambayes-Classification: spam
| /home/grisha/.procmail/spamcount.py
:0:
* ^X-Spambayes-Classification: spam
$HOME/mail/spam
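Both spam rules above key off a single header. As a rough illustration of what the `^X-Spambayes-Classification: spam` condition matches, this sketch applies the same test in Python with the standard `email` module. The sample message and the `is_spam` helper are made up for the example, and the text following `spam` in the sample header value is an assumption.

```python
import email

def is_spam(raw_message):
    """Mimic the procmail condition: the header value begins with 'spam'.

    procmail conditions are case-insensitive by default, hence lower().
    """
    msg = email.message_from_string(raw_message)
    value = msg.get("X-Spambayes-Classification", "")
    return value.strip().lower().startswith("spam")

# Hypothetical message, as it might look after hammie has tagged it.
sample = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "X-Spambayes-Classification: spam; 0.99\n"
    "\n"
    "body text\n"
)

print(is_spam(sample))
```

A message tagged as ham, or one with no such header at all, fails the test and falls through to the rest of the procmail rules.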
spamcount.py is a Python script that updates the RRD file. It uses the RRDtool interface for Python:
#!/usr/local/bin/python
import time
import sys
import RRDtool
try:
    rrd = RRDtool.RRDtool()
    rrd.update(("/home/grisha/.procmail/spam.rrd", "%d:1" % int(time.time())))
finally:
    # consume all input
    sys.stdin.read()
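If the Python RRDtool binding is not available, the same update could be made by shelling out to the `rrdtool` command-line tool. This is a sketch of that alternative, not the script used for this page. It assumes an `rrdtool` binary is on the PATH, and `build_update_command` and `record_spam` are hypothetical helper names.

```python
import subprocess
import sys
import time

RRD_FILE = "/home/grisha/.procmail/spam.rrd"

def build_update_command(rrd_file, timestamp, count=1):
    """Return the argv for: rrdtool update <file> <timestamp>:<count>"""
    return ["rrdtool", "update", rrd_file, "%d:%d" % (timestamp, count)]

def record_spam(rrd_file=RRD_FILE, stdin=None, run=subprocess.run):
    """Record one spam, then drain the message procmail pipes in."""
    if stdin is None:
        stdin = sys.stdin
    try:
        # check=False: a failed update should not disturb mail delivery
        run(build_update_command(rrd_file, int(time.time())), check=False)
    except OSError:
        # rrdtool binary missing or not executable; skip the count
        pass
    finally:
        # consume all input, as the original script does
        stdin.read()
```

A script used in place of spamcount.py would simply call `record_spam()` at the bottom.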
Finally, you need a script to generate the graphs, which you'd call from cron at a regular interval. Mine looks like this:
#!/usr/local/bin/python
import time
import sys
import RRDtool
rrd = RRDtool.RRDtool()
rrd.graph(("spam.gif",
           "-s", "1010000000",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,86400,*",
           'LINE2:hr#ff0000:Spams/Day'))
rrd.graph(("spamweek.gif",
           "-s", "-604800",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,3600,*",
           'LINE2:hr#ff0000:Spams/Hr'))
rrd.graph(("spamday.gif",
           "-s", "-86400",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,3600,*",
           'LINE2:hr#ff0000:Spams/Hr'))
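The numeric constants in these calls are easier to read once spelled out. RRDtool stores an ABSOLUTE data source as a per-second rate, so the CDEF lines multiply by 3600 or 86400 to get spams per hour or per day. The negative `-s` values are start times in seconds relative to now. The sketch below only reproduces that arithmetic; the function and constant names are illustrative, and the figure of 12 spams is an invented example.

```python
HOUR = 3600
DAY = 24 * HOUR
WEEK = 7 * DAY

def scale_rate(per_second, seconds):
    """What 'CDEF:hr=count,<seconds>,*' computes for one data point."""
    return per_second * seconds

# Example: 12 spams counted during one 3600-second step
# are stored as a rate of 12/3600 spams per second.
stored = 12.0 / HOUR

print("per hour: %.1f" % scale_rate(stored, HOUR))
print("per day:  %.1f" % scale_rate(stored, DAY))
print("-s -%d starts the graph one week ago" % WEEK)
print("-s -%d starts the graph one day ago" % DAY)
```

The first graph uses a positive start value instead, which RRDtool reads as an absolute Unix timestamp rather than an offset.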
You also need a script to copy or upload the graphs to your website; this I leave as an exercise for the reader.
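For the simplest case, where the web server's document root is on the same machine, one possible answer is sketched below. The destination directory `/var/www/html/spam` is a placeholder and `publish_graphs` is a hypothetical helper. Uploading to a remote host would need something else, such as scp or rsync.

```python
import os
import shutil

# File names produced by the graph script above
GRAPHS = ["spam.gif", "spamweek.gif", "spamday.gif"]

def publish_graphs(src_dir, dest_dir, names=GRAPHS):
    """Copy each graph found in src_dir into dest_dir.

    Returns the list of file names that were actually copied;
    graphs that have not been generated yet are skipped.
    """
    copied = []
    for name in names:
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            shutil.copy(src, os.path.join(dest_dir, name))
            copied.append(name)
    return copied

# Example call (placeholder destination, adjust to your own site):
# publish_graphs(".", "/var/www/html/spam")
```

Running this from the same cron job, right after the graphs are generated, keeps the published copies current.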