Sendmail Milters in Python
Sendmail introduced a new API, libmilter, beginning with version 8.10.
The milter module for Python provides a Python interface to libmilter
that exploits all its features.
Sendmail 8.12 is the first version to officially release libmilter.
Version 8.12 also seems to be more robust, and includes new privilege
separation features to enhance security. Even better, sendmail 8.13
supports socket maps, which make pysrs much more
efficient and secure. I recommend upgrading.
This package provides a robust toolkit for Python milters, and the beginnings of a general purpose mail
filtering system written in Python.
At the lowest level, the 'milter' module provides a thin wrapper around the
sendmail libmilter API. This API lets you register callbacks for
a number of events in the
process of sendmail receiving a message via SMTP.
These events include the initial connection from an MTA,
the envelope sender and recipients, the top level mail headers, and
the message body. There are options to mangle all of these components
of the message as it passes through the milter.
At the next level, the 'Milter' module (note the case difference) provides a
Python friendly object oriented wrapper for the low level API. To use the
Milter module, an application registers a 'factory' to create an object
for each connection from an MTA to sendmail. These connection objects
must provide methods corresponding to the libmilter callback events.
Each event method returns a code to tell sendmail whether to proceed
with processing the message. This is a big advantage of milters over
other mail filtering systems. Unwanted mail can be stopped in its
tracks at the earliest possible point.
The Milter.Milter class provides default implementations for event
methods that
do nothing, and also provides wrappers for the libmilter methods to mutate
the message.
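
As a rough illustration of the connection-object pattern described above, here is a minimal sketch of a milter class that overrides a single event method. It assumes the Milter package is importable; when it is not, a small stand-in base class (hypothetical, with placeholder return codes) takes its place so the sketch still runs. The BlockDomainMilter class and the BLOCKED_DOMAINS list are made up for this example.

```python
try:
    import Milter                       # the real package, if installed
    Base = Milter.Milter
    CONTINUE, REJECT = Milter.CONTINUE, Milter.REJECT
except ImportError:
    # Stand-ins so the sketch runs without libmilter; not the real values.
    CONTINUE, REJECT = "CONTINUE", "REJECT"

    class Base(object):
        """Do-nothing defaults, mimicking the role of Milter.Milter."""
        def connect(self, hostname, family, hostaddr):
            return CONTINUE

        def envfrom(self, sender, *params):
            return CONTINUE

        def envrcpt(self, rcpt, *params):
            return CONTINUE

        def eom(self):
            return CONTINUE


BLOCKED_DOMAINS = {"spam.example"}      # hypothetical block list


class BlockDomainMilter(Base):
    """One instance is meant to be created per MTA connection."""

    def envfrom(self, sender, *params):
        # The sender arrives in SMTP form, e.g. '<user@spam.example>'.
        domain = sender.strip("<>").rpartition("@")[2].lower()
        if domain in BLOCKED_DOMAINS:
            return REJECT               # stop the message at MAIL FROM
        return CONTINUE                 # let sendmail keep processing


milter = BlockDomainMilter()
print(milter.envfrom("<bob@spam.example>") == REJECT)
print(milter.envfrom("<alice@good.example>") == CONTINUE)
```

With the real package, the class would be registered as the factory (to my knowledge by assigning Milter.factory and then calling Milter.runmilter with a name and a socket); check the HOWTO for the exact invocation used by your version.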
The 'spf' module provides an implementation of
SPF useful for detecting email forgery.
The 'mime' module provides a wrapper for the Python email package that
fixes some bugs, and simplifies modifying selected parts of a MIME message.
Finally, the bms.py application is both a sample of how to use the
Milter and spf modules, and the beginnings of a general purpose SPAM filtering,
wiretapping, SPF checking, and Win32 virus protecting milter. It can
make use of the pysrs package when available for
SRS/SES checking and the pydspam package for Bayesian
content filtering. SPF checking
requires
pydns. Configuration documentation is currently included as comments
in the sample config file for the bms.py milter.
See also the HOWTO and
Milter Log Message Tags.
Python milter is under GPL. The authors can probably be convinced to
change this to LGPL if needed.
Milters can run on the same machine as sendmail, or another machine. The
milter can even run with a different operating system or processor than
sendmail.
Sendmail talks to the milter via a local or internet socket.
Sendmail keeps the
milter informed of events as it processes a mail connection. At any
point, the milter can cut the conversation short by telling sendmail
to ACCEPT, REJECT, or DISCARD the message. After receiving a complete
message from sendmail, the milter can again REJECT or DISCARD it, but it
can also ACCEPT it with changes to the headers or body.
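
The event flow just described can be modelled in a few lines of plain Python. This is only a toy simulation, not libmilter: run_session, the string return codes, and the NoMailerDaemon class are all invented for illustration. It shows how the first verdict other than CONTINUE ends the conversation.

```python
# Placeholder verdicts; the real codes come from the milter module.
CONTINUE, ACCEPT, REJECT, DISCARD = "CONTINUE", "ACCEPT", "REJECT", "DISCARD"


def run_session(milter, events):
    """Feed (method_name, args) events to a milter-like object in order.

    Returns (verdict, stage) for the first callback that does not say
    CONTINUE, or (ACCEPT, None) if no callback objected.
    """
    for name, args in events:
        verdict = getattr(milter, name)(*args)
        if verdict != CONTINUE:
            return verdict, name        # conversation is cut short here
    return ACCEPT, None


class NoMailerDaemon(object):
    """Toy filter: silently drop mail addressed to MAILER-DAEMON."""

    def envfrom(self, sender):
        return CONTINUE

    def envrcpt(self, rcpt):
        if rcpt.strip("<>").lower().startswith("mailer-daemon@"):
            return DISCARD
        return CONTINUE

    def eom(self):
        return CONTINUE


events = [
    ("envfrom", ("<someone@example.com>",)),
    ("envrcpt", ("<MAILER-DAEMON@example.com>",)),
    ("eom", ()),                        # never reached for this message
]
print(run_session(NoMailerDaemon(), events))
```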
What can you do with a milter?
Documentation for the C API is provided with sendmail. Miltermodule
provides a thin Python wrapper for the C API. Milter.py provides a simple
OO wrapper on top of that.
The Python milter package includes a sample milter that replaces dangerous
attachments with a warning message, discards mail addressed to
MAILER-DAEMON, and demonstrates several SPAM abatement strategies.
The MimeMessage class used to do this was originally based on the
mimetools and multifile standard Python modules.
As of milter version 0.6.0, it is based on the standard Python email
package, which was derived from the
mimelib project.
The MimeMessage class patches several bugs in the email package,
and provides some backward compatibility.
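
To give a feel for what replacing a dangerous attachment involves, here is a minimal sketch built directly on the standard email package rather than on the MimeMessage class. The defang function here, the BANNED_EXTENSIONS tuple, and the wording of the warning are assumptions for illustration; the sample milter's actual logic is more involved.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

BANNED_EXTENSIONS = (".exe", ".scr", ".pif", ".vbs")   # illustrative list


def defang(msg):
    """Replace banned attachments in place; return how many were replaced."""
    if not msg.is_multipart():
        return 0
    replaced = 0
    parts = msg.get_payload()           # the live list of sub-parts
    for i, part in enumerate(parts):
        if part.is_multipart():
            replaced += defang(part)    # recurse into nested containers
            continue
        name = part.get_filename() or ""
        if name.lower().endswith(BANNED_EXTENSIONS):
            parts[i] = MIMEText(
                "The attachment %r was removed by the mail filter." % name)
            replaced += 1
    return replaced


# Build a small test message with one harmless and one banned part.
outer = MIMEMultipart()
outer.attach(MIMEText("hello"))
attachment = MIMEApplication(b"MZ\x90", Name="evil.exe")
attachment.add_header("Content-Disposition", "attachment", filename="evil.exe")
outer.attach(attachment)
print(defang(outer), outer.get_payload()[1].get_content_type())
```

The sketch only looks at the declared filename. A real filter would also want to consider the content type and names hidden inside archives.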
The "defang" function of the sample milter was inspired by
MIMEDefang,
a Perl milter with flexible attachment processing options. The latest
version of MIMEDefang uses an Apache-style process pool to avoid reloading
the Perl interpreter for each message. This makes it fast enough for
production without using Perl threading.
mailchecker is
a Python project to provide flexible attachment processing for mail. I
will be looking at plugging mailchecker into a milter.
TMDA is a Python project
to require confirmation the first time someone tries to send to your
mailbox. This would be a nice feature to have in a milter.
There is also a Milter community website
where milter software and gory details of the API are discussed.
Is a milter written in python efficient?
The Python milter process is multi-threaded, and the startup cost is incurred
only once. This is much more efficient than implementations that
start a new interpreter for each connection. In production testing, the
milter did not use a significant percentage of the CPU. Furthermore,
Python is easily extended in C for any step requiring expensive CPU
processing.
For example, the HTML parsing feature to remove scripts from HTML attachments
is rather CPU intensive in pure python. Using the C replacement for sgmllib
greatly speeds things up.
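
For readers curious what the script-removal step looks like, here is a minimal pure Python sketch. It uses html.parser from the current standard library instead of sgmllib, so it is a swapped-in technique and not the code the milter ships. The ScriptStripper class and strip_scripts function are invented names, and this is not a security-grade sanitizer (comments and declarations are simply dropped, for instance).

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit HTML while dropping <script> elements and their content."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self._skip = 0                  # > 0 while inside a <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._skip += 1
        elif not self._skip:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self._skip = max(0, self._skip - 1)
        elif not self._skip:
            self.out.append("</%s>" % tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged.
        if tag != "script" and not self._skip:
            self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        if not self._skip:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self._skip:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if not self._skip:
            self.out.append("&#%s;" % name)


def strip_scripts(html):
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()                      # flush any buffered text
    return "".join(parser.out)


print(strip_scripts("<p>Hi<script>alert(1)</script> there</p>"))
```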
Goals
Confirmed Installations
Please email
me if you successfully install milter on a system not mentioned below.
Operating System | Compiler | Python | Sendmail | milter
Mandrake 8.0 | gcc-3.0.1 | 2.1.1 | 8.12.0 | 0.3.3
Mandrake 8.0 | gcc-2.96 | 2.0 | 8.11.2 | 0.3.6
RedHat 6.2 | egcs-1.1.2 | 2.2.2 | 8.11.6 | 0.5.4
RedHat 7.1 | gcc-2.96 | ? | 8.12.1 | 0.3.5
RedHat 7.3 | gcc-2.96 | 2.2.2 | 8.11.6 | 0.5.5
RedHat 7.3 | gcc-2.96 | 2.3.3 | 8.13.1 | 0.7.2
RedHat 7.3 | gcc-2.96 | 2.4.1 | 8.13.5 | 0.8.4
RedHat 8.0 | gcc-3.2 | 2.2.1 | 8.12.6 | 0.5.2
RedHat 9.0 | gcc-3.2.2 | 2.4.1 | 8.13.1 | 0.8.2
RedHat EL3 | gcc-3.2.3 | 2.4.1 | 8.13.5 | 0.8.4
Debian Linux | gcc-2.95.2 | 2.1.1 | 8.12.0 | 0.3.7
Debian Linux | gcc-3.2.2 | 2.2.2 | 8.12.7 | 0.5.4
AIX-4.1.5 | gcc-2.95.2 | 2.1.1 | 8.11.5 | 0.3.3
AIX-4.1.5 | gcc-2.95.2 | 2.1.1 | 8.12.1 | 0.3.4
AIX-4.1.5 | gcc-2.95.2 | 2.1.3 | 8.12.3 | 0.4.2
AIX-4.1.5 | gcc-2.95.2 | 2.4.1 | 8.13.1 | 0.8.4
Slackware 7.1 | ? | ? | 8.12.1 | 0.3.8
Slackware 9.0 | gcc-3.2.2 | 2.2.3 | 8.12.9 | 0.5.4
OpenBSD | ? | 2.3.3? | 8.13.1? | 0.7.2
SuSE 7.3 | gcc-2.95.3 | 2.1.1 | 8.12.2 | 0.3.9
FreeBSD | gcc-2.95.3 | 2.2.1 | 8.12.3 | 0.4.0
FreeBSD | gcc-2.95.3 | 2.2.2 | ? | 0.5.5
FreeBSD 4.4 | gcc-2.95.3 | ? | 8.12.10 | 0.6.6
Enough Already!
Nearly a dozen people have emailed me begging for a feature to copy
outgoing and/or incoming mail to a backup directory by user. OK, it
looks like this is the most requested feature for 0.5.6. In the meantime,
here are some things to consider:
- If you want the equivalent of a Bcc added to each message, this
is very easy to do in the Python code for bms.py. See below.
- If you want to copy to a file in a directory (thus avoiding having to
set up aliases), this is slightly more involved. The bms.py milter already
copies the message to a temporary file for use in replacing the message body
when banned attachments are found. You have to open a file, and copy the
Message object to it in eom().
- Finally, you are probably aware that most email clients already
keep a copy of outgoing mail. Presumably there is a good reason for
keeping another copy on the server.
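
For the file-copy option in the list above, the work done in eom() might look something like the following sketch. The save_copy helper, its arguments, and the one-directory-per-user layout are assumptions for illustration and not part of bms.py. It takes the already spooled temporary file and copies it to a uniquely named file under a backup directory.

```python
import os
import shutil
import tempfile
import time


def save_copy(msg_file, backup_root, user):
    """Copy a spooled message into backup_root/<user>/ and return the path.

    msg_file is any seekable binary file object holding the message.
    """
    # Keep only harmless characters so a hostile user name cannot
    # escape the backup directory.
    safe = "".join(c for c in user if c.isalnum() or c in "._-")
    safe = safe.lstrip(".") or "unknown"
    user_dir = os.path.join(backup_root, safe)
    os.makedirs(user_dir, exist_ok=True)

    # mkstemp guarantees a unique name even under concurrent threads.
    fd, path = tempfile.mkstemp(
        prefix=time.strftime("%Y%m%d-%H%M%S-"), suffix=".eml", dir=user_dir)
    msg_file.seek(0)
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(msg_file, out)
    return path
```

Inside a milter, this would be called from eom() with the temporary file the milter has been writing headers and body to, for example save_copy(self.fp, '/var/backup/mail', user), where self.fp and the path are again placeholders.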
To Bcc a message, call self.add_recipient(rcpt) in envfrom after
determining whether you want to copy (e.g. whether the sender is local). For
example,
def envfrom(...
  ...
  if len(t) == 2:
    self.rejectvirus = t[1] in reject_virus_from
    if t[0] in wiretap_users.get(t[1],()):
      self.add_recipient(wiretap_dest)
    if t[1] == 'mydomain.com':
      self.add_recipient('<copy-%s>' % t[0])
  ...
To make this a generic feature requires thinking about how the configuration
would look. Feel free to make specific suggestions about config file
entries. Be sure to handle both Bcc and file copies, and designating what
mail should be copied. How should "outgoing" be defined? Implementing it is
easy once the configuration is designed.
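
To make the discussion concrete, here is one possible shape for such configuration, sketched with the standard configparser module. Everything in it is hypothetical: the [copy] section, its key names, and the rule that "outgoing" means the sender's domain is local while "incoming" means at least one recipient's domain is local. None of this exists in bms.py.

```python
import configparser

SAMPLE_CONFIG = """
[copy]
# Hypothetical keys, not existing bms.py options.
local_domains = mydomain.com, example.org
bcc_outgoing = archive@mydomain.com
file_incoming = /var/backup/mail
"""


def domain_of(addr):
    """Return the lower-cased domain of an SMTP address like '<a@b.c>'."""
    return addr.strip("<>").rpartition("@")[2].lower()


def copy_actions(cfg, sender, recipients):
    """Return a list of ('bcc', address) / ('file', directory) actions."""
    section = cfg["copy"]
    local = {d.strip().lower()
             for d in section.get("local_domains", "").split(",")
             if d.strip()}
    directions = (
        ("outgoing", domain_of(sender) in local),
        ("incoming", any(domain_of(r) in local for r in recipients)),
    )
    actions = []
    for direction, active in directions:
        if not active:
            continue
        bcc = section.get("bcc_" + direction)
        if bcc:
            actions.append(("bcc", bcc))
        folder = section.get("file_" + direction)
        if folder:
            actions.append(("file", folder))
    return actions


cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE_CONFIG)
print(copy_actions(cfg, "<joe@mydomain.com>", ["<pal@elsewhere.net>"]))
```

Note that mail between two local users counts as both outgoing and incoming under this definition, which may or may not be what you want. That is exactly the kind of question the configuration design has to settle.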