SpamPal and RegExFilter can do much more than filter spam - they can be used to catch viruses and phish as well. This page details how to do that, plus some extra filtering rules that I find useful.
The ordering of the various filters is very important. The filters are applied in order, from top to bottom. Since over-processing unwanted mail is a waste of resources, the ordering should be designed to find such mail rapidly; this removes it from the queue and allows the filters to begin work on the next message. The ordering might be rearranged for various purposes; the ordering below is optimised for speed, with a few compromises for functionality. I place all of the tests below at the top of the file containing the ruleset - this means they run first and eject any mail they find, saving the lower-order rules the trouble of processing it.
The rules below make various assumptions about what should and should not be permitted. You should not use them unless you understand what they will do. The rules should be modified to suit your own purposes before use.
All of the rules below use a leading equals sign and a high spam score - this has the effect of instantly marking any matching mail as spam, with no further testing performed on it. If this does not suit you, change it.
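The effect of ordering combined with the leading equals sign can be sketched in Python. This is only an illustration of the idea, not how RegExFilter is implemented; the `Rule` tuple and the `first_match` function are names invented for this example, and the two patterns are borrowed from rules further down this page.

```python
import re
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    pattern: str   # regular expression tested against the message text
    score: float   # spam score assigned on a match
    label: str     # text reported for the match

def first_match(rules: list[Rule], message: str) -> Optional[Rule]:
    """Return the first rule that matches, testing in file order.

    Mimics a file where every rule starts with '=': the first hit
    ends testing, so rules lower down never see the message.
    """
    for rule in rules:
        if re.search(rule.pattern, message):
            return rule
    return None

rules = [
    Rule(r"^TVqQAAMAAA", 9999, "MIMEAV: Win32 executable variant 1"),
    Rule(r"viagra|cialis", 500.0, "CRIPPLED_SUBJECT_GENERIC specific drugs"),
]

hit = first_match(rules, "cheap viagra here")
print(hit.label if hit else "no match")
```

Because the loop returns on the first hit, putting the rules that catch the most unwanted mail at the top keeps the average number of tests per message low.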
@Any-Sender: {cyberdelix\.net}
-=Received: 500.0 {123\.456\.789\.123|from www\.cyberdelix\.net} [ANTISPOOF_BLOCK_SPOOFED_INTERNAL]
=Any-Sender: -500.0 {cyberdelix\.net} [ANTISPOOF_PASS_FRIENDLY_SMTP_INTERNAL]
The first two rules blacklist my own domainname, except when the mail is sent from one of the IP addresses or hosts listed.
The third rule only runs if the mail is NOT spoofed, because the equals sign in the second rule stops all further testing of spoofed mail. Therefore, at this point, any mail sent with my domainname is friendly (non-spam). Mail that matches the third rule is thus given a negative spam score, and no further testing is done on it.
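The decision made by these three rules can be written out as plain boolean logic. The Python below is a sketch of that logic only, assuming the `@`, `-` and `=` symbols behave as summarised at the end of this page; `antispoof_score` is a hypothetical name, and the domain and server strings are the placeholder values from the rules above.

```python
import re

# Placeholder values copied from the rules above; substitute your own.
OWN_DOMAIN = re.compile(r"cyberdelix\.net")
OWN_SERVERS = re.compile(r"123\.456\.789\.123|from www\.cyberdelix\.net")

def antispoof_score(senders: str, received: str):
    """Return a score if the anti-spoofing rules decide the outcome, else None."""
    claims_own_domain = OWN_DOMAIN.search(senders) is not None
    via_own_server = OWN_SERVERS.search(received) is not None
    if claims_own_domain and not via_own_server:
        return 500.0    # rules 1 and 2: spoofed, marked as spam, testing stops
    if claims_own_domain:
        return -500.0   # rule 3: genuine internal mail, testing stops
    return None         # not our domain: later rules decide

print(antispoof_score("bob@cyberdelix.net", "from mail.example.org"))
print(antispoof_score("bob@cyberdelix.net", "from www.cyberdelix.net"))
```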
Do NOT enter "localhost" or similar into the second rule, only enter strings that uniquely identify your own SMTP server (such as its IP or hostname).
Note: these rules may block mail from legitimate services which spoof your address, such as Paypal or mailing lists. I am still testing this; if so, I imagine one or more additional rules could be added to fix it.
Note: hyphen and dot both need to be escaped
Note: the above rule means it's not necessary to whitelist *@yourdomain.com. Indeed, if your whitelist does contain this entry, you should remove it, as it will prevent the anti-spoofing rule from detecting spoofed mails. The anti-spoofing rule automatically allows through any mail sent from an SMTP server listed in the rule, so as long as all senders @yourdomain.com are using one of these SMTP servers, they are effectively whitelisted, just as they were when *@yourdomain.com was whitelisted. There is therefore no danger in removing this entry from the whitelist if the anti-spoofing rule is in use.
The anti-spoofing rules should go FIRST in the filter file, so that mails from friendly SMTP servers are not filtered for spam (i.e. they are allowed to be spammy). Also, putting these rules first means spoofers are immediately detected and ejected.
@From: {@facebookmail\.com}
-=Received: 500.0 {mx\-out\.facebook\.com} [ANTISPOOF_BLOCK_SPOOFED_EXTERNAL_FACEBOOK]
=Any-Sender: -500.0 {@facebookmail\.com} [ANTISPOOF_PASS_FRIENDLY_SMTP_FACEBOOK]
Note: hyphen and dot both need to be escaped, @-sign does not
=Line: 9999 {^TVqQAAMAAA*} [MIMEAV: Win32 executable variant 1]
=Line: 9999 {^TVoAAAEAAAA*} [MIMEAV: Win32 executable variant 2]
=Line: 9999 {^TVoAAAAAAAAAAAAAUEUAAE*} [MIMEAV: Win32 executable variant 3]
=Line: 9999 {^TVoAAD8AAAAE*} [MIMEAV: Win32 executable variant 4]
=Line: 9999 {^TVpLRVJORU*} [MIMEAV: Win32 executable variant 5]
=Line: 9999 {^UEsDBAoAA*} [MIMEAV: Zipfile variant 1]
@Line: {^UEsDBBQAA*} [MIMEAV: Zipfile variant 2]
-=Body: 9999 {name=.*\.(docx|xlsx)} [MIMEAV: Zipfile variant 2]
=Line: 9999 {^183GmgAA*} [MIMEAV: WMF file variant 1]
These filters work by finding MIME data which matches the above strings. When a virus is sent via email, it is encoded in MIME. It may change its filename, use a variety of subject lines and message bodies, and/or forge the sender's address; but it cannot forge its own file header, which is faithfully represented in MIME in the email containing the virus. It's not even necessary to decode the MIME; the above "MIME signatures" are functionally equivalent to the signatures used by traditional anti-virus scanners.
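As a rough illustration of what a `Line` test does with these signatures, the Python sketch below walks the raw lines of a message and reports the first one that begins with a known signature. The function name `scan_lines` and the sample message are made up for this example, and only three of the signatures are included.

```python
# Signatures copied from the rules above; the trailing '*' is dropped here
# because a plain prefix match is all this sketch needs.
SIGNATURES = {
    "TVqQAAMAAA": "Win32 executable variant 1",
    "UEsDBAoAA": "Zipfile variant 1",
    "183GmgAA": "WMF file variant 1",
}

def scan_lines(raw_message: str):
    """Return the label of the first line starting with a known signature."""
    for line in raw_message.splitlines():
        for signature, label in SIGNATURES.items():
            if line.startswith(signature):
                return label
    return None

# A made-up message fragment: MIME part headers, then base64 data.
sample = (
    'Content-Type: application/octet-stream; name="invoice.pdf.exe"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAA\n"
)
print(scan_lines(sample))
```

The misleading filename in the sample plays no part in the result; only the first characters of the encoded data are examined.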
Note that "Zipfile variant 2" has a different syntax, using two rules - this is to allow for DOCX and XLSX files, which are actually ZIP files. The above syntax allows DOCX and XLSX files through, while still blocking all other ZIP files. Exceptions for other filetypes such as PPTX could also be added here (not tested).
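The two-rule combination amounts to "ZIP type 2 header present AND NOT a DOCX/XLSX attachment name". A minimal Python sketch of that condition follows, assuming the `@` and `-` symbols combine as described in the syntax summary on this page; `blocks_zip_type_2` and the two sample messages are invented for illustration.

```python
import re

ZIP_TYPE_2 = re.compile(r"^UEsDBBQAA", re.MULTILINE)    # header at start of a line
ALLOWED_NAME = re.compile(r"name=.*\.(docx|xlsx)")      # exception for Office files

def blocks_zip_type_2(raw_message: str) -> bool:
    """True when a type-2 ZIP header is present and no DOCX/XLSX name is."""
    has_zip = ZIP_TYPE_2.search(raw_message) is not None
    has_office_name = ALLOWED_NAME.search(raw_message) is not None
    return has_zip and not has_office_name

msg_zip = 'Content-Type: application/zip; name="stuff.zip"\n\nUEsDBBQAAAAIAAAA\n'
msg_docx = 'Content-Type: application/octet-stream; name="report.docx"\n\nUEsDBBQAAAAIAAAA\n'
print(blocks_zip_type_2(msg_zip), blocks_zip_type_2(msg_docx))
```

If the real rules behave like this sketch, a message carrying both a DOCX and some other ZIP file would be let through, since the name test looks at the whole body rather than at one attachment.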
Below is a table of various filetypes and their MIME signatures. Additional signatures can be determined simply by emailing yourself a file in a given format (for example, .XLS) and examining the raw MIME data. A short signature is good, but one that is too short is bad, as it will also match unrelated files.
extension | MIME signature | notes |
---|---|---|
EXE, COM, SCR, PIF | TVqQAAMAAA | Win32 executable variant 1 |
EXE, COM, SCR, PIF | TVoAAAEAAAA | Win32 executable variant 2 |
EXE, COM, SCR, PIF | TVoAAAAAAAAAAAAAUEUAAE | Win32 executable variant 3 |
EXE, COM, SCR, PIF | TVoAAD8AAAAE | Win32 executable variant 4 |
EXE, COM, SCR, PIF | TVpLRVJORU | Win32 executable variant 5 |
ZIP | UEsDBAoAA | ZIP type 1 |
ZIP | UEsDBBQAA | ZIP type 2 |
GIF | R0lGODlh | not used by viruses but well-used by spammers |
PNG | iVBORw0KGgoAAAAN | not used by viruses but well-used by spammers |
JPG | /9j/4AAQSkZJRgABAQ | not used by viruses but well-used by spammers |
WMF | 183GmgAA | Windows MetaFile format |
BHX | YmVnaW4gNj | Mac BinHex format |
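Instead of emailing yourself a file, a signature can also be worked out directly, since these strings are simply the base64 encoding of the first bytes of the file. The Python sketch below shows the idea; `mime_signature` is a hypothetical helper, and the header bytes are the usual opening bytes of a Win32 executable and of a GIF89a image.

```python
import base64

def mime_signature(header: bytes, length: int) -> str:
    """Base64-encode the leading bytes of a file and keep `length` characters."""
    # Every 3 input bytes become 4 base64 characters, so encode whole
    # 3-byte groups only; this avoids '=' padding inside the signature.
    usable = len(header) - (len(header) % 3)
    return base64.b64encode(header[:usable]).decode("ascii")[:length]

# First bytes of a typical Win32 executable ("MZ" header).
exe_header = b"MZ\x90\x00\x03\x00\x00\x00\x04\x00\x00\x00"
print(mime_signature(exe_header, 10))   # TVqQAAMAAA

# First bytes of a GIF89a image.
print(mime_signature(b"GIF89a", 8))     # R0lGODlh
```

To try this on a real file, read its first dozen or so bytes in binary mode and pass them in as `header`.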
Viruses that arrive in encrypted zipfiles (such as Bagle) are not a problem for the above technique. Encrypted zipfiles have a standard header just like any other zipfile, so the above zipfile filters catch encrypted zips as well.
If the above rules are used, the include of filters_virus.dat (at the top of the filter file) can be removed (or commented out). Also, several virus-specific tests included in the default RegExFilter file can be removed (or commented out). These can be found by searching for "[SOBER Sober.P identified]", or by searching for a spam score of 500.0.
=Body: 9999 {http.*\.exe} [CRIPPLED_FILETYPE_GENERIC link to Win32 executable]
@Body: "Content-Disposition:"
=Body: 9999 {name=.*\.(hta|vbs)} [CRIPPLED_FILETYPE_GENERIC Win32 scripting]
@Body: "Content-Disposition:"
=Body: 9999 {name=.*\.htm} [CRIPPLED_FILETYPE_GENERIC HTML document]
Note: the HTML document detection finds files attached in HTML format. It does not look for (or mark as spam) HTML mail. It checks the file attachments, not the mail format.
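The difference between an HTML attachment and HTML mail can be seen in a small Python sketch of the two attachment tests. It assumes, as the rules do, that attachment headers are visible in the text being tested; `crippled_filetype` and both sample messages are invented for this example.

```python
import re

SCRIPT_NAME = re.compile(r"name=.*\.(hta|vbs)")
HTML_NAME = re.compile(r"name=.*\.htm")

def crippled_filetype(body: str):
    """Return a label for a blocked attachment type, or None."""
    if "Content-Disposition:" not in body:
        return None              # no attachment headers, so no test
    if SCRIPT_NAME.search(body):
        return "Win32 scripting"
    if HTML_NAME.search(body):
        return "HTML document"
    return None

attached = (
    'Content-Type: text/html; name="form.htm"\n'
    'Content-Disposition: attachment; filename="form.htm"\n'
)
html_mail = 'Content-Type: text/html; charset="us-ascii"\n\n<html><body>Hi</body></html>\n'
print(crippled_filetype(attached), crippled_filetype(html_mail))
```

The HTML-formatted mail passes because it has neither a `Content-Disposition:` header nor a `name=` parameter ending in .htm.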
=Any-Sender: 500.0 {Royal Bank of Scotland|bankofscotland\.co\.uk|NatWest|HSBC|lloydstsb\.co|lloyds\.co\.uk|barclays\.co} [CRIPPLED_SENDER_PHISHING rule 1]
=Any-Sender: 500.0 {abbeynational\.co\.uk|(\.|@)abbey\.co|halifax\.co\.uk|alliance\-leicester\.co\.uk|(\.|@)egg\.com|cahoot\.com} [CRIPPLED_SENDER_PHISHING rule 2]
=Any-Sender: 500.0 {CitiBusiness|(\.|@)citi\.com|citibank\.com|equifax\.com|commercebank\.com|bankofamerica|(\.|@)chase\.com|(\.|@)ally\.com} [CRIPPLED_SENDER_PHISHING rule 3]
=Any-Sender: 500.0 {wachovia\.com|americanexpress\.com|bankofthewest\.com|capitalone\.com|nationalcity\.com|tdbanknorth\.com|(\.|@)key\.com} [CRIPPLED_SENDER_PHISHING rule 4]
=Any-Sender: 500.0 {hmrc\.gov\.uk|adwords\-noreply@google\.com|networksolutions\.com|westernunion\.com|fdic\.gov} [CRIPPLED_SENDER_PHISHING rule 5]
Note: hyphen and dot both need to be escaped, @-sign does not
Note: these are anti-phishing rules. There are plenty of people who scoff at this approach to anti-phishing, arguing that I'll end up blacklisting the entire internet. While I understand their point that blacklists have limited utility in an unconstrained problem space, I disagree with them, because in the case of phish filtering, the problem space is not unconstrained. It is limited to the most common financial institutions, plus a few ring-ins. This means blacklisting is feasible - it certainly works great for me, and it has done for years, and I spend almost no time maintaining my list.
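The `(\.|@)` prefix on the shorter names in the sender rules is what keeps them from matching unrelated domains that merely end in the same letters. The Python sketch below tries rule 2's pattern against a few invented addresses; `is_blacklisted_sender` is a name made up for this example, and Python's regex engine is assumed to treat this pattern the same way RegExFilter does.

```python
import re

# Sender pattern copied from phishing rule 2 above.
PHISH_SENDERS = re.compile(
    r"abbeynational\.co\.uk|(\.|@)abbey\.co|halifax\.co\.uk"
    r"|alliance\-leicester\.co\.uk|(\.|@)egg\.com|cahoot\.com"
)

def is_blacklisted_sender(sender: str) -> bool:
    """True if the sender string matches any name in the pattern."""
    return PHISH_SENDERS.search(sender) is not None

print(is_blacklisted_sender("security@egg.com"))      # True
print(is_blacklisted_sender("alerts@mail.egg.com"))   # True
print(is_blacklisted_sender("orders@nutmegg.com"))    # False
```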
=SUBJECT: 500.0 {viagra|sildenafil|cialis|vicodin|xanax|regalis|valium|anatrim|phentermine|nicotine| pills|depressant} [CRIPPLED_SUBJECT_GENERIC specific drugs]
=Any-Sender: 500.0 {viagra|sildenafil|cialis|vicodin|xanax|regalis|valium|anatrim|phentermine|nicotine| pills|depressant} [CRIPPLED_SENDER_GENERIC specific drugs]
=SUBJECT: 500.0 {=\?(big5|gb2312|euc\-kr|ks_c|iso\-2022\-(kr|jp)|koi8\-r|windows\-1251|iso\-8859\-9|windows\-1254)\?} [NON_WESTERN_SUBJECT Non-western character set in subject]
=CONTENT-TYPE: 500.0 {(big5|gb2312|euc\-kr|ks_c|iso\-2022\-(kr|jp)|koi8\-r|windows\-1251|iso\-8859\-9|windows\-1254)} [NON_WESTERN_CONTENTTYPE Non-western character set in Content-Type]
Note: hyphen and dot both need to be escaped
Below is a table of various character sets and corresponding notes. This information is partly from Wikipedia, I think.
character set | notes |
---|---|
big5 | used in Taiwan, Hong Kong and Macau for Traditional Chinese characters |
gb2312 | GB2312 is the registered internet name for a key official character set of the People's Republic of China |
euc-kr | Korean; Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and Simplified Chinese |
ks_c | Korean |
iso-2022 | Korean/Japanese |
KOI8-R | Russian, which uses the Cyrillic alphabet |
windows-1251 | Russian |
iso-8859-9 | Turkish |
windows-1254 | Turkish |
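The subject test works because a subject in one of these character sets is sent as an RFC 2047 encoded-word, which begins with `=?charset?`. The Python sketch below shows the match on two invented subjects. It uses a shortened charset list taken from the Content-Type rule above, and `re.IGNORECASE` is this sketch's own choice, since charset names appear in either case; whether RegExFilter matches case-insensitively is not covered here.

```python
import re

# Shortened charset list, wrapped in the "=?" ... "?" delimiters that
# open an RFC 2047 encoded-word such as "=?koi8-r?B?...?=".
NON_WESTERN = re.compile(
    r"=\?(big5|gb2312|euc\-kr|iso\-2022\-(kr|jp)|koi8\-r|windows\-1251)\?",
    re.IGNORECASE,
)

def has_non_western_subject(subject: str) -> bool:
    """True if the raw (undecoded) subject names one of the listed charsets."""
    return NON_WESTERN.search(subject) is not None

print(has_non_western_subject("=?koi8-r?B?8NLJ18XU?="))      # True
print(has_non_western_subject("=?iso-8859-1?Q?caf=E9?="))    # False
print(has_non_western_subject("Meeting on Tuesday"))         # False
```

The test must be run against the raw header; once the subject has been decoded, the `=?charset?` marker is gone.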
A summary of the syntax used on this page (taken from section 7.1 of the RegExFilter manual):
symbol | meaning |
---|---|
= | on match no more rules are tested for this email |
- | negate the result (logical NOT) |
@ | combine this rule with the next rule (logical AND) |
~ | decodes RFC-2047 encoded headers and RFC-2045 encoded bodies |
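To make the symbols concrete, here is a small Python sketch that splits a rule line into its parts. The line format it accepts (prefix symbols, header name, optional score, a `{regex}` or `"text"` pattern, optional `[label]`) is inferred only from the rules shown on this page, not from the RegExFilter manual, so the real syntax may well allow more. `ParsedRule` and `parse_rule` are names invented for this example.

```python
import re
from typing import NamedTuple, Optional

class ParsedRule(NamedTuple):
    stop_on_match: bool      # '=' prefix
    negate: bool             # '-' prefix
    and_with_next: bool      # '@' prefix
    decode: bool             # '~' prefix
    header: str
    score: Optional[float]
    pattern: str
    label: Optional[str]

RULE = re.compile(
    r'^(?P<flags>[=\-@~]*)'                        # any mix of prefix symbols
    r'(?P<header>[A-Za-z\-]+):\s*'                 # header name up to the colon
    r'(?:(?P<score>-?\d+(?:\.\d+)?)\s+)?'          # optional spam score
    r'(?:\{(?P<regex>.*)\}|"(?P<text>[^"]*)")'     # {regex} or "text"
    r'(?:\s*\[(?P<label>.*)\])?\s*$'               # optional [label]
)

def parse_rule(line: str) -> ParsedRule:
    """Split one rule line into its components (format assumed, see above)."""
    match = RULE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised rule: {line!r}")
    flags = match["flags"]
    score = match["score"]
    return ParsedRule(
        stop_on_match="=" in flags,
        negate="-" in flags,
        and_with_next="@" in flags,
        decode="~" in flags,
        header=match["header"],
        score=float(score) if score is not None else None,
        pattern=match["regex"] if match["regex"] is not None else match["text"],
        label=match["label"],
    )

rule = parse_rule(
    r"-=Received: 500.0 {123\.456\.789\.123|from www\.cyberdelix\.net} "
    r"[ANTISPOOF_BLOCK_SPOOFED_INTERNAL]"
)
print(rule.negate, rule.stop_on_match, rule.header, rule.score)
```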
Note: do NOT put comments at the end of rules in the RegExFilter filter file, as this will cause the rule to stop working. Do this instead:
# comments must be on a separate line from rules
From: {@spammer\.com}