subject: Anti-Spam Solutions and Security - Part 1
posted: Sat, 08 Jul 2006 12:16:48 +0100


[online version has many links - Stu]

http://www.securityfocus.com/infocus/1763

Anti-Spam Solutions and Security - Part 1
Dr. Neal Krawetz 2004-02-26

1. Overview

In a recent survey, 93% of respondents reported dissatisfaction with
the large volume of unsolicited email (spam) they receive. [ref 1]
The problem has grown to the point where nearly 50% of the world's
email is spam [ref 2], yet only a few hundred groups are responsible.
[ref 3] Many anti-spam solutions have been proposed and a few have
been implemented. Unfortunately, these solutions do not prevent spam
as much as they interfere with everyday email communications.

The problems posed by spam have grown from simple annoyances to
significant security issues. The deluge of spam costs up to an
estimated $20 billion each year in lost productivity -- according to
the same document, spam within a company can cost between $600 and
$1,000 per year for every user.[ref 4]

1.1 Security issues

In addition to wasting the time spent viewing and deleting it, spam
also poses security risks, including:

Identity theft. Phishing and scams are distributed as spam, directly
leading to identity theft and fraud. According to the Anti-Phishing
Working Group, phishing spam increased 52% in January 2004. [ref 5]

Viruses. New viruses, worms, and malware, such as Melissa, Love Bug,
and MyDoom use spam techniques to propagate after being triggered by
the user.

Combining exploits and spam. The distinction between malicious
hackers and spammers has become less obvious. Many spammers have
incorporated malicious code that targets browser, HTML, and
JavaScript vulnerabilities. For example, on 31-December-2002 a group
of hackers in Brazil sent spam containing hostile JavaScript to
millions of users. People who viewed this spam from Hotmail
unknowingly compromised their accounts. As another example, the
recent URL display problem with Internet Explorer, where a "%01"
before the hostname can be used to hide the real hostname [ref 6],
was incorporated into spam within a few weeks of the public
announcement.

Combining viruses and spam. It is widely believed that some viruses
are designed to assist spammers. For example, the SoBig worm
installed open proxies that were used to relay spam. As spam becomes
more prevalent, the use of malware and spyware to support spam is
likely to increase.

The existing and proposed anti-spam solutions attempt to mitigate the
spam problem and address security needs. By correctly identifying
spam, the impact from email viruses, exploits, and identity theft can
be reduced. These solutions implement various types of security in an
effort to thwart spam.

Current anti-spam solutions fall into four primary categories:
filters, reverse lookups, challenges, and cryptography. Each of these
solutions offers some relief to the spam problem, but they also have
significant limitations. The first part of this two-part paper looks
at filters and reverse lookup solutions. The second part focuses on
the various types of challenges, such as challenge-response and
computational challenges as well as cryptographic solutions. While
there are many different aspects to these solutions, this paper only
discusses the most common and significant concerns -- this paper is
not intended to be a complete listing of implementation options,
solutions, and issues.

1.2 Common terminology

Sender. The person or process that is responsible for generating
(initiating) the email.

Recipient. Any email account that receives the email. This may be
specified in the email as a "To:", "CC:", or "BCC:" address.

1.3 Filters

Filters are used by a recipient system to identify and organize spam.
There are many different types of filter systems including:

Word lists. Simple and complex lists of words that are known to be
associated with spam. For example, "viagra".

Black lists and White lists. These lists contain known IP addresses
of spam and non-spam senders, respectively.

Hash-tables. These systems summarize emails into pseudo-unique
values. Repeated sightings of hash values are symptomatic of a bulk
mailing.

Artificial Intelligence and Probabilistic systems. Systems such as
Bayesian networks are used to learn word frequencies and patterns
that usually are associated with both spam and non-spam messages.
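
As a rough illustration of the first three filter types, the following
Python sketch checks a message against a word list, an IP black list
and white list, and a table of message hashes. The word list, the
network ranges (taken from the reserved documentation ranges), the
threshold, and all function names are assumptions made up for this
example; they do not describe any particular product.

```python
import hashlib
import ipaddress
from collections import Counter

# Hypothetical sample data, chosen only for illustration.
SPAM_WORDS = {"viagra"}
BLACK_LIST = [ipaddress.ip_network("192.0.2.0/24")]
WHITE_LIST = [ipaddress.ip_network("198.51.100.0/24")]
BULK_THRESHOLD = 3  # sightings of one hash before it counts as bulk

hash_sightings = Counter()


def word_list_hit(body):
    """Return True if any listed word appears in the message body."""
    words = body.lower().split()
    return any(word.strip(".,!?\"'") in SPAM_WORDS for word in words)


def list_verdict(sender_ip):
    """Return 'white', 'black', or 'unknown' for a sender IP address."""
    address = ipaddress.ip_address(sender_ip)
    if any(address in network for network in WHITE_LIST):
        return "white"
    if any(address in network for network in BLACK_LIST):
        return "black"
    return "unknown"


def seen_in_bulk(body):
    """Hash the body and report whether the same hash keeps reappearing."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    hash_sightings[digest] += 1
    return hash_sightings[digest] >= BULK_THRESHOLD
```

Each check looks at only one property of the message (its words, its
source address, or its exact content), which is why real systems tend
to combine several of them.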

Filters are ranked based on their false-negative and false-positive
results. A false negative indicates an actual spam message that
manages to pass the filter. In contrast, a false positive indicates a
non-spam email that was incorrectly classified as spam. An ideal spam
filter would generate no false-positives and very few
false-negatives.
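
The two error rates can be computed from a hand-labeled sample of
mail. The sketch below assumes each result is recorded as a pair
(is_spam, marked_spam); that representation and the function name are
choices made for this example only.

```python
def filter_error_rates(results):
    """Return (false_negative_rate, false_positive_rate).

    results is an iterable of (is_spam, marked_spam) boolean pairs:
    is_spam is the true label, marked_spam is the filter's decision.
    """
    results = list(results)
    spam_total = sum(1 for is_spam, _ in results if is_spam)
    ham_total = len(results) - spam_total
    # Spam that passed the filter.
    missed = sum(1 for is_spam, marked in results if is_spam and not marked)
    # Non-spam that the filter flagged.
    flagged = sum(1 for is_spam, marked in results if not is_spam and marked)
    fn_rate = missed / spam_total if spam_total else 0.0
    fp_rate = flagged / ham_total if ham_total else 0.0
    return fn_rate, fp_rate
```

The false-negative rate is measured against the spam in the sample and
the false-positive rate against the non-spam, so the two figures are
not directly comparable when the sample is mostly one or the other.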

These filter-based anti-spam approaches have three significant
limitations:

Bypassing filters. Spam senders and their bulk-mailing applications
are not static -- they rapidly adapt around filters. For example, to
counter word lists, spam senders randomize the spelling of words
("viagra", "V1agra", "\/iaagra"). Hash-busters (sequences of random
characters that differ in each email) were created for bypassing hash
filters. And the currently popular Bayesian filters are being
bypassed by the inclusion of random words and sentences. Most spam
filters are only effective for a few weeks at best. In order to
maintain the viability of anti-spam systems, filter rule sets must be
constantly updated -- usually on a daily or weekly basis.

False-positives. The more effective a spam filter, the higher the
probability of misclassifying a desirable email as spam. For example,
email containing the word "viagra" (e.g., the spam text "Free viagra"
or a non-spam personal email "Hey, did you see that funny viagra
commercial during the superbowl?") is almost certain to be marked as
spam regardless of the content. Similarly, email from Comcast's
24.8.0.0/15 subnet is blindly blocked by the SORBS blacklist because
it is associated with DHCP addresses and not because the sender is
associated with spam. Conversely, spam filters that generate
virtually no false-positives are likely to generate a large number of
false-negatives.

Filter reviewing. Due to the possibility of false-positives,
messages marked as spam are usually not immediately deleted. Instead,
these messages are placed in "spam mailboxes" for future review.
Unfortunately, this means that users still must view the spam, even
if only by the subject, as they search for misclassified email. In
essence, filters only assist in sorting incoming email.
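
The first two limitations can be reproduced with a deliberately naive
filter. The Python sketch below uses a one-entry word list and an
exact-match hash; both helpers are hypothetical stand-ins written for
this example, and real filters are more elaborate.

```python
import hashlib

SPAM_WORDS = {"viagra"}  # hypothetical one-entry word list


def word_list_hit(body):
    """Return True if any listed word appears in the message body."""
    words = body.lower().split()
    return any(word.strip(".,!?\"'") in SPAM_WORDS for word in words)


def body_hash(body):
    """Summarize a message body as a SHA-256 hex digest."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


# Bypassing: a randomized spelling slips past the word list.
print(word_list_hit("Free viagra"))   # True
print(word_list_hit("Free V1agra"))   # False

# False-positive: a personal email is flagged because of one word.
print(word_list_hit(
    "Hey, did you see that funny viagra commercial during the superbowl?"
))                                    # True

# Bypassing: a hash-buster suffix yields a different hash per message.
print(body_hash("Buy now") == body_hash("Buy now xqzt81"))  # False
```

Tightening the word list to catch "V1agra" would not help with the
personal email, which is the trade-off described above.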

More important than the limitations of spam filters is the common
myth around the success of filters -- there is a widely held belief
that filters stop spam. Spam filters do not stop spam. In all cases,
the spam is still generated, still traverses the network, and still
gets delivered. And unless the user does not mind missing the
occasional misclassified desirable email, the spam is still viewed.
While filters do help organize and separate email into spam and
non-spam groupings, filters do not prevent spam.

1.4 Reverse lookup

Nearly all spam uses forged sender ("From:") addresses; very few spam
emails use the sender's true email address. Furthermore, most forged
email addresses appear to come from trusted domains. For example, in
15 months our spam archive collected 9300 emails that claimed to come
from 2400 unique domains. The "yahoo.com" domain accounted for nearly
20% of sender addresses in the archive, but spam that actually came
from the "yahoo.com" domain accounted for less than 1%. Similarly,
"aol.com" and "hotmail.com" accounted for 5% each, and "msn.com"
accounted for 3%, even though spam actually originating from all of
these domains combined accounted for less than 1% of all spam
received.

Spam senders forge email for numerous reasons.

Illegal. Many spam messages are scams and illegal in most countries.
By forging the sender address, the spam sender can remain anonymous
and avoid prosecution.

Undesirable. Most spam senders are aware that their messages are
undesirable. By forging the sender address, they can mitigate the
repercussion from sending millions of messages to millions of angry
recipients.

ISP limitations. Most Internet service providers have contract
clauses that prohibit spamming. By forging the sender address, spam
senders reduce the likelihood of having their ISP cancel their
network access.

If the forgery problem is addressed, spam senders will lose the
ability to remain anonymous. Once they can no longer operate
anonymously, laws such as the U.S.-based CAN-SPAM Act will become
enforceable against spammers operating from and in the United States.

In an effort to limit the ability to forge sender addresses, a number
of proposed systems have surfaced for validating a sender's email.
These systems include:

Reverse Mail Exchanger (RMX).
<http://www.ietf.org/internet-drafts/draft-danisch-dns-rr-smtp-03.txt>

Sender Permitted From (SPF). <http://spf.pobox.com/>

Designated Mailers Protocol (DMP). <http://www.pan-am.ca/dmp/>

These approaches are very similar to each other and in many ways they
are identical. DNS is a global network service used to match IP
addresses with hostnames and vice versa. In 1986 DNS was extended to
associate mail exchanger ("MX") records. [ref 7] When delivering
email, a mail server determines where to pass the message based on
the MX record associated with the recipient's domain name.
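
The routing decision can be sketched in Python as follows. The
MX_RECORDS table is a hypothetical in-memory stand-in for real DNS
answers, and next_hop is a name invented for this example; a real mail
server would query DNS instead.

```python
# Hypothetical MX data standing in for DNS answers: (preference, host).
MX_RECORDS = {
    "example.com": [(20, "backup.example.com"), (10, "mail.example.com")],
}


def next_hop(recipient):
    """Return the host that should receive mail for a recipient address."""
    domain = recipient.rsplit("@", 1)[1].lower()
    records = MX_RECORDS.get(domain)
    if not records:
        # No MX data: fall back to the domain's own host.
        return domain
    # The lowest preference value marks the most preferred exchanger.
    return min(records)[1]
```

For "user@example.com" this returns "mail.example.com", because its
preference value of 10 is lower than the backup's 20.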

Similar to MX records, the reverse lookup solutions define reverse-MX
records ("RMX" for RMX, "SPF" for SPF, and "DMP" for DMP) for
determining whether email from a particular domain is permitted to
originate from any particular IP address. The basic idea is that
forged email addresses do not originate from the correct RMX (or SPF
or DMP) address range and therefore can be immediately identified as
forged.
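
The check that these proposals share can be sketched in Python. The
PERMITTED_SENDERS table below is a hypothetical in-memory substitute
for the reverse-MX data a domain would publish in DNS, and the record
formats of RMX, SPF, and DMP are not modeled; the names and the
three-way result are assumptions of this example.

```python
import ipaddress

# Hypothetical published policy: domain -> networks allowed to send its mail.
PERMITTED_SENDERS = {
    "example.com": [ipaddress.ip_network("192.0.2.0/24")],
}


def check_sender(from_address, connecting_ip):
    """Return 'pass', 'fail', or 'none' (the domain publishes no record)."""
    domain = from_address.rsplit("@", 1)[1].lower()
    networks = PERMITTED_SENDERS.get(domain)
    if networks is None:
        return "none"
    address = ipaddress.ip_address(connecting_ip)
    if any(address in network for network in networks):
        return "pass"
    return "fail"
```

Mail claiming to be from "example.com" passes when it arrives from
192.0.2.25 and fails when it arrives from 203.0.113.9. How a receiver
treats the "none" case is a local policy decision that this sketch
leaves open.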

While these solutions are viable in certain situations, they share
some significant limitations.

1.4.1 Host-less and vanity domains

The reverse lookup approach requires email to originate from a known
and trusted mail server located at a well-known IP address (the
reverse-MX record). Unfortunately, the majority of domain names are
not associated with static IP addresses. Omitting cyber squatters,
the general case includes individuals and small companies that want
to use their own domain rather than their ISP's, but cannot afford
their own static IP address and mail server. DNS registration hosts,
such as GoDaddy, provide free mail forwarding services to people who
register host-less or vanity domains. Although these mail forwarding
services can manage incoming email, they do not offer free outgoing
email access.

Reverse-lookup solutions cause a few problems for these host-less and
vanity domain users:

No reverse-MX record. People sending email from a host-less or vanity
domain simply configure their mail application to send email from
their registered domain name. Unfortunately, a lookup of the sender's
IP address will not find the sender's domain, and a lookup of the
sender's domain may not find the correct reverse-MX record. The
former is particularly common for mobile, dialup, and other users
that frequently change IP addresses.

No outgoing mail. One possible solution requires relaying all
outgoing email through the ISP's SMTP server. This would provide a
valid reverse-MX record for sending email. Unfortunately, many ISPs
do not permit relaying when the sender's domain is not the same as
the ISP's domain.

In both cases, someone who uses a vanity domain, or a domain that
does not have its own mail server, will be blocked by reverse-lookup
systems.

1.4.2 Mobile computing

Mobile computing is a very common practice. People take their laptops
to conferences, off-site meetings, and home in order to work away
from the office or in a location that is convenient. Hotels,
airports, and even coffee shops cater to the mobile computing crowd.
Unfortunately, the reverse-lookup solution will likely prevent many
mobile users from sending email.

Sending directly. There are two ways to send email. A user can log in
to a mail system using an external POP/IMAP/SMTP account, web mail or
similar service, or a user can send email directly. Most companies do
not permit external access to their mail services; mobile users
usually configure their laptops to send email directly.
Unfortunately, the problems with sending email directly are exactly
the same as the problems with host-less domains -- a reverse lookup of
the domain will not include the sender's IP address, and a reverse
lookup of the sender's IP address will not reveal the domain.

Mail relaying. The alternative to sending directly requires all
companies and domain systems to provide external mail services for
their off-site and mobile users. In many situations, this is both
undesirable and impractical. As an example, from a strictly
network-security viewpoint, POP3 transmits usernames and passwords in
plain text. Thus, any attacker sniffing the network will see valid
login credentials. IMAP can be used with SSL and supports secure
authentication, but not all servers support this. SMTP also supports
SSL or TLS but again, many organizations' servers do not support this
or use only server-side certificates. Web mail over HTTPS is only as
secure as the client-side certificates. Since most sites only use
server-side certificates, HTTPS offers very little protection from
man-in-the-middle network attacks.
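
To make the POP3 concern concrete, the Python sketch below builds the
USER and PASS commands that a POP3 client sends at login. The account
name and password are invented, and the two commands are shown
together for brevity; a real client waits for the server's reply
between them.

```python
def pop3_login_commands(username, password):
    """Build the USER and PASS commands a POP3 client sends at login."""
    return f"USER {username}\r\nPASS {password}\r\n".encode("ascii")


wire_bytes = pop3_login_commands("alice", "s3cret")
print(wire_bytes)  # b'USER alice\r\nPASS s3cret\r\n'
```

Unless the connection is wrapped in SSL or TLS, these are the bytes
that anyone sniffing the network path can read.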

While reverse-lookup solutions are viable for internal networks,
they are not practical for general use across the Internet. Companies
that wish to support host-less domains, vanity domains, and mobile or
off-site users may wish to reconsider implementing reverse-lookup
anti-spam technologies.

2. Summary

Spam has reached epidemic proportions and people are looking for
quick fixes of any kind. Spam filters are the most successful
solution to date -- filters attempt to identify spam and limit a
recipient's exposure. But filters do not prevent spam any more than
recording a television show with a VCR prevents TV commercials.
Reverse-lookup systems attempt to address the forgery problem. While
reverse lookups are viable in closed environments, such as a
corporate internal network, the solutions are not general enough for
worldwide acceptance.

Part II of this investigation will focus on challenge-based systems
and proposed cryptographic solutions.

About the author

Neal Krawetz has a Ph.D. in Computer Science and over 15 years of
computer security experience. Dr. Krawetz is considered one of the
leading experts in spam research and anti-spam technologies. In
addition to studying the nature of spam, he leads the External Threat
Assessment Team (ETAT) at Secure Science Corporation, a professional
services and software company that develops advanced technology
dedicated to protecting online assets.

References

[ref 1] "Majority in Favor of Making Mass-Spamming Illegal Rises to
79% of Those Online." The Harris Poll ® #38. July 16, 2003.

[ref 2] "Spam On Course to Be Over Half of All Email This Summer,"
Brightmail press release. July 1, 2003.

[ref 3] According to SpamHaus, a spam content tracking organization,
fewer than 200 spam groups generate more than 90% of spam messages.
SpamHaus ROKSO, September 22, 2003.

[ref 4] "Spam Costs $20 Billion Each Year in Lost Productivity," by
Jay Lyman. December 29, 2003.

[ref 5] "Phishing e-mail fraud rises 52% in January, report says."
February 18, 2004.

[ref 6] "Multiple Browser URI Display Obfuscation Weakness."

[ref 7] "Domain System Changes and Observations," RFC 973 by Paul
Mockapetris. January 1986.


Copyright 2006, SecurityFocus
