Overview of spam
Overview of techniques
Details of 11 anti-spam techniques
Oriented towards servers
Mostly theory, little implementation
Some final thoughts
What:
Unwanted e-mail: UCE (unsolicited commercial e-mail) and viruses
Attack on the Internet
80 to 90% of e-mail traffic
SMTP not up to the task but hard to replace
Why?
Hard to block
Cheap to send
"Guaranteed" delivery
It works
Don't let it work
Terminology
False Positive -- Non-spam marked as spam
False Negative -- Spam that gets through
Accuracy
MUA/MTA -- Mail User Agent / Mail Transfer Agent
Expensive -- Resources, not dollars
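The terms above can be turned into numbers. The sketch below uses the usual confusion-matrix definitions, which are an assumption here since the slides do not define "accuracy" formally; `filter_rates` and its parameter names are hypothetical.

```python
def filter_rates(spam_caught, ham_flagged, ham_passed, spam_missed):
    """Compute simple quality measures for a spam filter.

    spam_caught -- spam correctly marked as spam (true positives)
    ham_flagged -- non-spam marked as spam (false positives)
    ham_passed  -- non-spam correctly delivered (true negatives)
    spam_missed -- spam that got through (false negatives)
    """
    ham_total = ham_flagged + ham_passed
    spam_total = spam_caught + spam_missed
    total = ham_total + spam_total
    return {
        # Share of legitimate mail that was wrongly marked as spam.
        "false_positive_rate": ham_flagged / ham_total if ham_total else 0.0,
        # Share of spam that slipped through.
        "false_negative_rate": spam_missed / spam_total if spam_total else 0.0,
        # Share of all messages that were classified correctly.
        "accuracy": (spam_caught + ham_passed) / total if total else 0.0,
    }
```

Because spam is most of the traffic, a filter can show high accuracy while still losing legitimate mail, so the false-positive rate is worth watching separately.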
Greylisting
http://greylisting.org/
Description
Self-tuning (mostly)
Blocks spammers that try once
Blocks broken mail servers
Catches some viruses
Allows time for DCC (Distributed Checksum Clearinghouse) systems to work
Low false-positive rate
Cheap to run
High rejection rate (85%)
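A minimal sketch of the greylisting idea, assuming the common design where a previously unseen (client IP, envelope sender, envelope recipient) triplet gets a temporary failure and is accepted once it retries after a delay. The class name, the 300-second delay, and the in-memory dictionary are illustrative assumptions, not taken from any particular implementation.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable clock so the sketch can be tested
        self.first_seen = {}        # triplet -> time of the first delivery attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return an SMTP-style reply code: 451 (try later) or 250 (accept)."""
        triplet = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            return 451  # temporary failure: a real mail server will retry
        return 250
```

A sender that tries once and never retries is never accepted, while a standards-following server only sees a delay on the first message for each triplet. A production version would also expire old entries and persist the table.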
DNSBL
Many different lists to pick from
Relies on network access to run
Can introduce latencies
Best as part of a weighting system
Inexpensive in resources
High false positives
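A DNSBL lookup is an ordinary DNS query: the client's IPv4 octets are reversed and prepended to the list's zone, and any answer means "listed". The sketch below shows that lookup plus a simple weighted score across several lists. The zone names and weights are placeholders, and the `resolver` parameter is an assumption added so the lookup can be swapped out or tested without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the list answers for this IP, False if the lookup fails."""
    try:
        resolver(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # socket.gaierror (no such name) is a subclass of OSError
        return False


def dnsbl_score(ip, weighted_zones, resolver=socket.gethostbyname):
    """Sum the weights of every list that has the IP, instead of rejecting outright."""
    return sum(weight for zone, weight in weighted_zones.items()
               if is_listed(ip, zone, resolver))
```

Each list costs one DNS round trip, which is where the latency comes from. Treating a hit as one input to a score, rather than an automatic reject, limits the damage from a list with many false positives.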
Black/White List
Maintained by user, usually
High accuracy
Inexpensive to run
Expensive to maintain
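A per-user list check is little more than a set lookup, as this sketch shows. It assumes entries may be full addresses or bare domains and that the whitelist wins when both lists match; the function name and return values are hypothetical.

```python
def classify_sender(address, whitelist, blacklist):
    """Return 'accept', 'reject', or 'unknown' for a sender address.

    Entries in either set may be full addresses or bare domains, all lowercase.
    """
    addr = address.strip().lower()
    domain = addr.rpartition("@")[2]
    # Check the whitelist first so an explicitly trusted sender always gets through.
    for entries, verdict in ((whitelist, "accept"), (blacklist, "reject")):
        if addr in entries or domain in entries:
            return verdict
    return "unknown"  # hand the message to the next filter
```

The lookup is cheap; the cost is in keeping the two sets current by hand.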
SPF
http://spf.pobox.com/
Incorporates E-mail CallerID
Prevents spoofed sender addresses
Domain owner publishes mail server information
Problematic for forwarding mail
Works around problems in SMTP
Inexpensive to run
Low to no false positives when set up properly
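To make "domain owner publishes mail server information" concrete: the domain publishes a TXT record such as `v=spf1 ip4:192.0.2.0/24 -all`, and the receiving server checks the connecting IP against it. The sketch below is a deliberately reduced evaluator that understands only `ip4:` mechanisms and `all`. Real SPF also has `a`, `mx`, `include`, `redirect`, and macros, and fetches the record over DNS, none of which is attempted here. The function name is hypothetical.

```python
import ipaddress


def check_spf(record, client_ip):
    """Evaluate a small subset of SPF: only ip4: mechanisms and all.

    Returns 'pass', 'fail', 'softfail', 'neutral', or 'none' (no SPF record).
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(client_ip)
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in qualifiers:
            qualifier, term = term[0], term[1:]
        mechanism = term.lower()
        if mechanism.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return qualifiers[qualifier]
        elif mechanism == "all":
            return qualifiers[qualifier]  # matches every client
        # Any other mechanism is skipped in this sketch.
    return "neutral"
```

Forwarding is problematic because the forwarding server's IP is not in the original domain's record, so a check like this returns `fail` for mail that is in fact legitimate.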
Virus Scanning
ClamAV -- http://www.clamav.net/
Many commercial offerings
Low false positive rate
Never send warning replies; reject during the SMTP session instead
Expensive to run
Hashcash
http://www.hashcash.org/
Proof of work
Sender can afford few messages, or needs many CPUs
Low false positives
Expensive to generate
Cheap to check
Can be added to MUAs or MTAs
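The "expensive to generate, cheap to check" asymmetry can be sketched as follows. The stamp layout loosely follows the hashcash version 1 format (`ver:bits:date:resource:ext:rand:counter`) with SHA-1, but this is a simplified illustration: it skips the date-expiry and double-spend checks a real verifier needs, and the 12-bit default is an assumption chosen so the demo runs quickly.

```python
import base64
import hashlib
import itertools
import os
import time


def leading_zero_bits(digest):
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits


def mint(resource, bits=12):
    """Search for a counter that gives the stamp `bits` leading zero bits (slow)."""
    date = time.strftime("%y%m%d", time.gmtime())
    rand = base64.b64encode(os.urandom(8)).decode("ascii")
    prefix = f"1:{bits}:{date}:{resource}::{rand}:"
    for counter in itertools.count():
        stamp = prefix + format(counter, "x")
        if leading_zero_bits(hashlib.sha1(stamp.encode("ascii")).digest()) >= bits:
            return stamp


def verify(stamp, resource, bits=12):
    """Check a stamp with a single hash (fast)."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[0] != "1" or fields[3] != resource:
        return False
    if not fields[1].isdigit() or int(fields[1]) < bits:
        return False
    digest = hashlib.sha1(stamp.encode("ascii")).digest()
    return leading_zero_bits(digest) >= int(fields[1])
```

Minting takes about 2^bits hash attempts on average while verifying takes one, so each extra bit doubles the sender's cost and leaves the receiver's cost unchanged.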
Tagged Message Delivery Agent/Confirmation
http://tmda.sourceforge.net/
Whitelist addresses you send to
Require sender to confirm messages
Relies on envelope sender address
Can lead to mailing loops
Best when combined with other mechanisms
Low false positives
Expensive to run
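The confirmation flow above can be sketched as a small state machine. This is an illustration of the idea only, not TMDA's actual code or address format: the class, the HMAC-based token, and the choice to hold mail with an empty envelope sender (so bounces are never challenged, one way to avoid mailing loops) are all assumptions.

```python
import hashlib
import hmac


class ConfirmationFilter:
    """Toy challenge/response filter keyed on the envelope sender."""

    def __init__(self, secret):
        self.secret = secret    # bytes; keeps confirmation tokens unguessable
        self.whitelist = set()  # senders allowed straight through
        self.pending = {}       # token -> (sender, message id) awaiting confirmation

    def note_outgoing(self, recipient):
        """Whitelist every address the user sends mail to."""
        self.whitelist.add(recipient.lower())

    def incoming(self, envelope_sender, message_id):
        """Return (action, token): action is 'deliver', 'challenge', or 'hold'."""
        sender = envelope_sender.lower()
        if not sender:
            return ("hold", None)  # bounce (null sender): never send a challenge
        if sender in self.whitelist:
            return ("deliver", None)
        token = hmac.new(self.secret, f"{sender}|{message_id}".encode("utf-8"),
                         hashlib.sha256).hexdigest()[:16]
        self.pending[token] = (sender, message_id)
        return ("challenge", token)

    def confirm(self, token):
        """Release a held message and whitelist its sender; None if token unknown."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None
        sender, message_id = entry
        self.whitelist.add(sender)
        return message_id
```

Because everything hinges on the envelope sender, a forged sender address means the challenge goes to an innocent third party, which is one reason to run other filters first and challenge only what remains.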
Rules
SpamAssassin
Regular expressions and the like
Easily fooled
/\b(?:join|register|order|apply) .{0,10}(?-i:T)oday\b/i
/\bunclaimed (?:funds|money|prizes?|rewards?)\b/i
Expensive to run
High false positives and false negatives
Spammers regularly check their messages against these
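A minimal sketch of rule-based scoring in the style the slide describes, using the two patterns shown above translated to Python's `re` syntax (the scoped `(?-i:T)` flag needs Python 3.6 or later). The rule names, point values, and threshold are made-up assumptions, not SpamAssassin's real scores.

```python
import re

# (rule name, compiled pattern, points) -- names and points are hypothetical.
RULES = [
    ("ACT_TODAY",
     re.compile(r"\b(?:join|register|order|apply) .{0,10}(?-i:T)oday\b", re.I),
     1.5),
    ("UNCLAIMED",
     re.compile(r"\bunclaimed (?:funds|money|prizes?|rewards?)\b", re.I),
     2.0),
]
THRESHOLD = 3.0  # assumed cut-off; messages scoring at or above it are flagged


def score(text):
    """Return (total points, names of the rules that matched)."""
    hits = [(name, points) for name, pattern, points in RULES
            if pattern.search(text)]
    return sum(points for _, points in hits), [name for name, _ in hits]


def is_spam(text):
    return score(text)[0] >= THRESHOLD
```

The "easily fooled" point shows up immediately: "Order now Today" matches the first rule, but "Order n0w T0day" does not, and every message is run against every pattern, which is why rule sets get expensive as they grow.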