At Saturday 2004-07-17 21:43 "Angela Kahealani" <***@kahealani.com>
posted <***@kahealani.com> to hawaii.inet-providers:
[...] I again forward a message from rsk, with comments by ME as:
< Comment inserted by Angela Kahealani
- - - Begin Forwarded Message
Return-Path: <***@gsp.org>
X-Original-To: ***@kahealani.com
Received: from taos.firemountain.net (taos.firemountain.net
[207.114.3.54])
by ensemada.lava.net (Postfix) with ESMTP id 2A23B30100
for <***@kahealani.com>; Tue, 20 Jul 2004 12:59:30 -1000
(HST)
Received: from gsp.org (core-balt-1-251.dynamic-dialup.coretel.net
[69.72.22.251] (may be forged))
by taos.firemountain.net (8.12.11/8.12.11) with ESMTP id
i6KMx9Ft019797
for <***@kahealani.com>; Tue, 20 Jul 2004 18:59:21 -0400
(EDT)
Received: from cougar.gsp.org (cougar [192.168.0.10])
by gsp.org (8.12.11/8.12.11) with ESMTP id i6KMwkWp019569
for <***@kahealani.com>; Tue, 20 Jul 2004 18:58:47 -0400
(EDT)
Received: from cougar.gsp.org (localhost [127.0.0.1])
by cougar.gsp.org (8.12.11/8.12.11) with ESMTP id i6KMuMuC003982
for <***@kahealani.com>; Tue, 20 Jul 2004 18:56:22 -0400
(EDT)
Received: (from ***@localhost)
by cougar.gsp.org (8.12.11/8.12.10/Submit) id i6KMuMow003979
for ***@kahealani.com; Tue, 20 Jul 2004 18:56:22 -0400 (EDT)
Date: Tue, 20 Jul 2004 18:56:21 -0400
From: ***@gsp.org
To: Angela Kahealani <***@kahealani.com>
Subject: Re: VeriZon SMTP
Message-ID: <***@gsp.org>
References: <***@malasada.lava.net>
<***@kahealani.com> <***@kahealani.com>
<***@malasada.lava.net>
Mime-Version: 1.0
Content-Type: text/plain;
charset=us-ascii
Content-Disposition: inline
In-Reply-To: <***@malasada.lava.net>
User-Agent: Mutt/1.5.4i
Comments: INPUT 207.114.3.54
Comments: HELO taos.firemountain.net
Comments: For More Information, please visit:
<http://lava.net/support/utilities/spammo/guide.html>
X-UID: 2500
[ If you wouldn't mind forwarding this for me again, I'd appreciate it.
Apparently the mail-to-news gateway I'm using at the moment doesn't
handle this newsgroup, hence the ongoing problem, which I need to go
investigate and either fix/work around. Mahalo. ]
mostly Rich, I think - in summary, "Verizon is refusing mail unless
they can VRFY the sender address, this is obsolete, Bad and Wrong,
and they are to blame for everything."]
They're not using VRFY; they're using RCPT. As I said:
"Since most people have long since disabled SMTP VRFY, they
actually construct a message and attempt delivery with RCPT."
This isn't obsolete and wrong: it's ALWAYS been wrong. VRFY was
intended to be used to confirm the existence of addresses; RCPT was
intended to be used to deliver mail, and shouldn't be used unless a
message is actually being delivered...which it's not.
VRFY worked nicely for quite a few years until the spammers came along,
promptly abused the hell out of VRFY with address harvesters, and thus
provoked nearly everyone into disabling it.
But this is not an excuse for Verizon or anyone else out there to
abuse RCPT in order to simulate now-disabled VRFY functionality.
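
To make the VRFY/RCPT distinction concrete, here is a minimal Python
sketch of the two command sequences. It only builds the strings; it
opens no connection. The null reverse-path in MAIL FROM and the function
names are assumptions for illustration, not a description of what
Verizon's servers actually send.

```python
def callout_commands(helo_name, sender_address):
    """Return the SMTP commands a sender-address callout would issue.

    Hypothetical sketch: the probe opens a transaction, names the
    address under test as a recipient, then quits without sending DATA.
    """
    return [
        f"HELO {helo_name}",
        "MAIL FROM:<>",                 # null reverse-path, an assumption
        f"RCPT TO:<{sender_address}>",  # the address being "verified"
        "QUIT",                         # no DATA: nothing is delivered
    ]


def vrfy_commands(helo_name, sender_address):
    """Return the commands for the same check done with VRFY."""
    return [f"HELO {helo_name}", f"VRFY {sender_address}", "QUIT"]
```

The callout never reaches DATA, which is the complaint above: RCPT is
being issued with no message to deliver.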
[ Why do you *think* they're using RCPT? Because they have
just enough smarts to know that nearly everyone turned
off VRFY a long time ago, and that trying this approach using
VRFY would render their own incoming mail unusable. Hence their
rather amazing decision to abuse RCPT in an ill-considered
attempt to craft a workable substitute. To put it another way:
Verizon is attempting to forcibly override the policies of
everyone who made the decision to turn off VRFY. This is just
as rude and abusive as, say, having a web spider ignore a
robots.txt file. ]
< Which, by the way is a practice of Cyveillance corporation,
< http://www.cyveillance.com/ which totally ignored my robots.txt
< -- Angela Kahealani
And while they're not to blame for everything -- just this -- one
scenario you might want to consider is what will happen when some
annoyed spammer decides to register a $5.95 throwaway domain, set
up DNS for it, point the MX records at *your* boxes, and then drop,
oh, say, 100 million *UNFORGED* messages from unique senders in that
throwaway domain on Verizon's mail servers.
That's the point at which you'd better hope that Verizon has thought
about this and has effective rate-limiting/query-limiting. Given the
obvious lack of thought and consideration that went into this in the
first place, though...I wouldn't count on it.
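
For illustration only, the kind of query-limiting meant here could look
like the Python sketch below: a fixed-window cap on probes per target MX
host. The class name, the per-MX key, and the window scheme are all
assumptions; nothing is known about what Verizon actually runs.

```python
from collections import defaultdict


class CalloutLimiter:
    """Cap callout probes per target MX host within a fixed time window."""

    def __init__(self, max_probes, window_seconds):
        self.max_probes = max_probes
        self.window_seconds = window_seconds
        # mx host -> [window start time, probes sent in this window]
        self._windows = defaultdict(lambda: [0.0, 0])

    def allow(self, mx_host, now):
        """Return True if one more probe to mx_host is permitted at `now`."""
        window = self._windows[mx_host]
        if now - window[0] >= self.window_seconds:
            window[0], window[1] = now, 0   # start a fresh window
        if window[1] >= self.max_probes:
            return False
        window[1] += 1
        return True
```

Keying on the MX host rather than the sender address matters in the
scenario above, since every forged-looking sender is unique but all of
them resolve to the same victim boxes.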
The MX for my domains has VRFY turned off. Hit it, try to VRFY
anything, and you get "252 VRFY not available" errors. Yet I have
no email problems with Verizon.
As I said, THEY'RE NOT USING VRFY. Would you mind re-reading my
previous message?
That said... there are other ways to verify that the sending HOST is
1) Do a simple DNS lookup to see if the stated hostname exists.
2) Do #1, and see whether it resolves to the sending host's
IP address.
Which breaks for everyone doing virtual hosting or handling mail for
multiple domains through the same outbound mail server. There's also
no requirement that a domain have any hosts in it (more precisely,
any DNS 'A' records) in order to send or receive mail.
3) Check to see whether the host is a MX for the sending address's domain.
Which breaks for a lot of people -- including some very large ISPs
often referred to by 3-letter acronyms -- because their inbound and
outbound mail servers are *not the same boxes*.
< Gee, you mean I am not the only one doing that? -- Angela
But this is orthogonal to what Verizon's doing: they're attempting to
verify that the sending USER exists. And *even if it worked* -- that
is, if it didn't scale poorly, facilitate spammer address harvesting,
and lend itself to DoS attacks -- it *still* would be a poor "anti-spam"
measure because it only checks to see that the putative sender is valid:
period. (Think about it. ANY sender. Not necessarily the one who
sent the message. Not even one from the same domain or mail server.
And not necessarily one who isn't a spammer.)
More broadly: Verizon is confusing two tasks, detecting forged senders
and blocking spam/abuse. The former is only of limited use in
accomplishing the latter.
Still more broadly: there are a lot of people doing some Very Stupid
Things as part of ill-considered tactics for stopping spam. These range
from buying "anti-spam" products sold by spammers (e.g. IHateSpam) to
generating backscatter spam (challenge/response) to using filters so
badly written that they reject any message with RFC 2919 headers.
None of this is necessary: robust, simple, scalable, and free methods
which -- at most sites -- stop over 90% of spam are readily available,
thoroughly tested, and well-documented. All it takes is somebody with
a modicum of clue and a few hours. No, they're not panaceas -- nothing
is, and anyone who tells you their method *is*, is a liar or a fool --
but they do such a good job that for many people they're "good enough",
and for others they reduce the scope of what's left to manageable
levels.
So to that end -- since I'm griping about this -- here's the approach
that I use. Let me emphasize "approach": I don't do all these things
on all mail servers, and I don't do them in the exact same way, because
every server/domain gets a different mix of incoming spam. It's always
important to try to figure out what that mix looks like and tailor the
blocking to match it. But most of this will work most of the time for
most people -- and in a lot of cases it's turned out to be "good enough"
that more work isn't necessary. In others, it's been "good enough"
that the additional work required is made quite a bit easier by it.
So here goes.
I run sendmail and have had excellent results using a layered approach
to blocking spam. The general idea is to use those measures which
are computationally cheapest first, in order to reduce the burden on
subsequent layers. The approach I'm taking (outlined below) would also
work for other MTAs on other 'nix systems.
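
The cheapest-first idea can be sketched in a few lines of Python. This
is not sendmail configuration; the check names, the sample networks and
domains, and the message dictionary are hypothetical stand-ins.

```python
def first_rejection(message, checks):
    """Run checks in the order given and stop at the first one that rejects.

    `checks` is a list of (name, function) pairs, cheapest first; each
    function returns True when the message should be rejected.
    """
    for name, should_reject in checks:
        if should_reject(message):
            return name
    return None


# Hypothetical layers, cheapest first: two local set lookups, then a
# stand-in for a remote DNSBL query.
BLOCKED_NETWORKS = {"192.0.2."}
BLOCKED_DOMAINS = {"spam.example"}
LISTED_IPS = {"203.0.113.9"}

CHECKS = [
    ("network", lambda m: any(m["ip"].startswith(p) for p in BLOCKED_NETWORKS)),
    ("domain", lambda m: m["sender"].rsplit("@", 1)[-1] in BLOCKED_DOMAINS),
    ("dnsbl", lambda m: m["ip"] in LISTED_IPS),
]
```

Because the loop returns on the first rejection, anything caught by a
local lookup never costs a remote query.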
I don't do any kind of content analysis: I'm in agreement with Paul
Vixie on this one: either people share our values or they don't. If
they do, then they don't allow spam to flow out of their networks (at
any rate beyond a trickle, which is probably inevitable). If they
don't, then they're either actively supporting spammers or incidentally
supporting them through neglect and incompetence -- and the reason
doesn't really matter to me, my users, my systems or my networks.
More succinctly: systems and networks which emit spam are broken and
should either be repaired immediately or physically disconnected from
the Internet until they are.
More bluntly: I'm not going to waste my resources trying to sort out
clean water from sewage. That responsibility rests with the people
whose servers and networks are spewing effluent through the pipes
designated for water.
1. I use this:
The Spamhaus Project: DROP (Don't Route Or Peer) List
http://www.spamhaus.org/DROP/
at the firewall and router level, or in the sendmail 'access' file
when that's not possible. These are networks which are 100%
controlled by spammers, so no good can come of accepting their traffic.
I've augmented this locally with a few particularly problematic networks;
for example, after reading these:
Call for Internet Death Penalty: Burstnet/Hostnoc
http://groups.google.com/groups?selm=20030708121252.GA14167%40example.com
Call for Internet Death Penalty #2: Optigate/Optinrealbig
http://groups.google.com/groups?selm=***@example.com
Call for Internet Death Penalty #3: Hopone/Superb
http://groups.google.com/groups?selm=***@example.com
their network allocations are now a fixture in my deny lists. It's up
to you, of course, but I see no reason to ever accept another packet
from them.
2. I have configured sendmail to reject all mail from domains which
don't resolve. This also blocks mail from broken mail servers, but
since there's no way to tell them to fix their DNS...
Sendmail comes set up this way by default on most systems.
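
As a rough Python illustration of the test in step 2 (not sendmail's
actual code), assume a caller-supplied lookup function; both the
function and the "MX or A" rule as written here are assumptions.

```python
def sender_domain_resolves(sender_address, lookup):
    """Return True if the sender's domain has any MX or A record.

    `lookup(domain, rrtype)` is a caller-supplied function (hypothetical
    here) returning a list of records, empty when there are none.
    """
    domain = sender_address.rsplit("@", 1)[-1].lower()
    return bool(lookup(domain, "MX") or lookup(domain, "A"))
```

Passing the lookup in as a parameter keeps the sketch testable without
touching a real resolver.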
3. I have set up sendmail to issue a multi-line SMTP greeting banner.
This causes a surprising amount of the malware installed on hijacked
Windows systems to fail, as it's not set up to deal with that. No
doubt future malware will cope with this, but for the last year it's
been very useful. Simple, easy, fast, and satisfying. ;-)
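
For anyone unsure what "multi-line" means on the wire: in SMTP every
line of a reply except the last carries a hyphen after the reply code,
and the last carries a space. A small Python sketch of that formatting
follows; the banner text in the usage line is made up.

```python
def smtp_reply(code, lines):
    """Format an SMTP reply; every line but the last uses 'code-' continuation."""
    if not lines:
        raise ValueError("a reply needs at least one line")
    out = [f"{code}-{text}" for text in lines[:-1]]
    out.append(f"{code} {lines[-1]}")
    return "\r\n".join(out) + "\r\n"
```

For example, smtp_reply(220, ["mail.example.com ESMTP", "No unsolicited
bulk mail"]) yields a "220-" line followed by a "220 " line. A client
that stops reading after the first line has not seen the full greeting.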
4. I then use a very large list of domains, via the sendmail 'access'
file. This is handy because the access file is hashed, thus lookups
are roughly O(1) no matter how large it becomes. But it's also
error-prone: in fact, during the past two years, every time I've had a
false positive reported to me, this is where I've traced it to on all
but two occasions.
But - considering that I'm using a list of about 128,000 domains and
have had fewer than a dozen false positives in two years, it seems like
a reasonable approach. Doubly so because this step alone blocks from
30% to 40% of incoming spam with very little overhead. Even more so
because it reduces the number of DNSBL queries (see step 8) which not
only reduces my outbound traffic, but the load I impose on the DNSBLs
that I'm using.
Many domain lists are also available; here are a few of them:
http://www.rhyolite.com/anti-spam/unwelcome.html
http://www.river.com/ops/spam/bad-domains.txt
http://www.spamblocked.com/killfile
http://www.znet.com/blocked-domains.html
http://www.cluelessmailers.org/listings/blacklistbydomain.html
http://obob.manilasites.com/
http://www.carl.net/spam/access.txt
http://www.unixgirl.com/blockeddomains.html
http://www.cart00ney.org/blocklist.txt
http://abuse.easynet.nl/spamlist-usage.html
Note: if you use a large list of domains in the sendmail 'access' file,
you will want to RTFM on "makemap" and note the "-c" flag. The speedup
in rebuilding the hash is quite significant.
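
To show why list size barely matters, here is a Python sketch of a
hashed lookup that tries the sender's domain and then each parent
domain. A dict stands in for the hashed map; the matching rule is a
simplification and does not reproduce sendmail's exact lookup order.

```python
def access_lookup(sender_domain, access_map):
    """Return the action for the domain or its nearest listed parent, else None."""
    labels = sender_domain.lower().rstrip(".").split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        action = access_map.get(candidate)   # one hash probe per level
        if action is not None:
            return action
    return None
```

The number of probes depends on how many labels the domain has, not on
whether the map holds a hundred entries or 128,000.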
5. I block all mail from certain TLDs on some mail servers because
the people using those servers don't expect to ever receive mail
from those places. I don't like doing this, because it's such a drastic
measure, but it's too effective a technique not to use. In particular,
I routinely block:
.cn (China)
.kr (Korea)
.tw (Taiwan)
I'm about >this close< to adding .biz to that list.
Of course, if you actually expect to get non-spam mail from those TLDs,
you probably can't do this. This is why I don't block .br, for example:
I have users who actually get non-spam mail from there. But if you
don't, you might want to consider blocking it.
6. I use a few special-purpose rules in the sendmail access file to
take care of spam from hijacked CacheFlow servers, hijacked AOL
proxy servers, often-forged addresses, and so on. Let me know if
you want them: they're pretty simple/short/easy.
7. I use ~150 subdomains (also in the sendmail access file) which
correspond to dynamically-allocated IP space, e.g. "dhcp.example.com".
I don't like doing this either, but it's also too effective not to use:
spam from hijacked PCs on cable/DSL connections is epidemic. I have
been slowly expanding this because it seems to be filling in gaps that
the other measures are missing.
Note: in most cases, the users on such networks are contractually
obligated to use their ISP's designated outbound mail server(s). So
the only SMTP traffic that this measure blocks is (a) spam from zombies
(b) spam from the spammers' own systems and (c) mail from people who
are deliberately violating their own ISP's TOS. It's correct to say
that (c) isn't necessarily spam: but I'm not going to lose any sleep
over blocking it anyway.
< SO, are you blocking kahealani.net (64.65.108.250) for being DSL?
8. I use multiple DNSBLs, each of which targets a slightly different
mix of spam.
For starters, I use
cn-kr.blackholes.us
tw.blackholes.us
for the same reason I block .cn, .kr and .tw -- see step 5 above.
Again, this may not be a reasonable step for everyone, but check
www.blackholes.us for other available DNSBLs that might be. They have
quite a wide selection, both by country and by ISP/host. But locally,
use of those two DNSBLs alone nails about 30% of incoming spam.
I then use these DNSBLs (each listed with DNSBL name and web site)
sbl-xbl.spamhaus.org http://www.spamhaus.org/sbl/
http://www.spamhaus.org/xbl/
dnsbl.ahbl.org http://www.ahbl.org/
list.dsbl.org http://dsbl.org/
dnsbl.njabl.org http://njabl.org/
relays.ordb.org http://ordb.org/
l1.spews.dnsbl.sorbs.net http://www.spews.org/
The Spamhaus SBL+XBL combined DNSBL is a must-have. I have never had
a false positive with it. And the relatively recent addition of the
XBL picks up millions of zombie Windows machines that are spewing spam.
The AHBL augments this nicely, and includes a RHSBL (right-hand-side BL)
which handles blocking by domain name. If you don't want to do step 4,
this is a good substitute.
The DSBL, NJABL, and ORDB all pick up different combinations of open
relays, open proxies, hijacked systems, etc.
The SPEWS list -- despite what some of its less-informed critics have
said -- is very accurate and correctly targets the spam-supporting ISPs
and hosts who are directly responsible for much of the spam we all
endure.
Other DNSBLs that I have either used or am considering using:
Blitzed OPM http://opm.blitzed.org/
PDL http://www.pan-am.ca/pdl
Leadmon http://www.leadmon.net/spamguard/
SORBS http://dnsbl.sorbs.net/
FiveTen http://www.five-ten-sg.com/blackhole.php
NOTE: You should probably not use any DNSBL until you've read its
policies.
NOTE: If you intend to make heavy use of these DNSBLs, you should
probably read their web sites and see about doing zone transfers.
NOTE: I find it very useful to run a local copy of BIND in caching mode
on every mail server, since those servers often get repeatedly pummeled
from the same sets of addresses. This not only enhances performance
locally, but cuts down on the load my servers impose on the DNSBLs.
NOTE: DNSBLs are invoked sequentially by sendmail, so it's a good idea
to put the one that blocks the most spam as seen by your servers first.
But figuring out which that is can be quite an effort. For most
people, the Spamhaus SBL+XBL DNSBL is a pretty good first guess,
though.
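
For readers who haven't looked at how a DNSBL is queried: the client
reverses the octets of the connecting IP address, appends the DNSBL
zone, and does an ordinary A lookup; an answer means "listed". The
Python sketch below shows that, with zones tried in order. The resolver
is a caller-supplied stand-in, and treating a failed lookup as "not
listed" is an assumption of the sketch.

```python
def dnsbl_query_name(ipv4, zone):
    """Build the DNSBL query name: reversed octets followed by the zone."""
    octets = ipv4.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ipv4!r}")
    return ".".join(reversed(octets)) + "." + zone


def first_listing(ipv4, zones, resolve):
    """Query zones in order; return the first zone that lists the address.

    `resolve(name)` is caller-supplied (hypothetical here): it returns a
    list of A records, or raises OSError on timeout or lookup failure.
    """
    for zone in zones:
        try:
            if resolve(dnsbl_query_name(ipv4, zone)):
                return zone
        except OSError:
            continue   # fail soft: an unreachable DNSBL counts as "not listed"
    return None
```

Since the loop stops at the first hit, putting the most productive zone
first saves the remaining queries for most rejected connections.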
9. I'm experimenting with using rbldnsd to run my own internal DNSBL --
replacing, in part, the sendmail 'access' file.
The upside of doing this is that rbldnsd stores information in a very
compact format with a low memory footprint; it's designed to serve
DNSBLs, not as a general purpose DNS server. Another advantage is that
keeping the information in rbldnsd would allow it to be used by
sendmail, postfix, exim, whatever. Yet another is that it can be
queried easily (contrast with the sendmail 'access' file).
The downside is that it's another process to run; it requires a
different format than sendmail (which means reworking scripts, etc.);
and it's one more step that could conceivably fail. (Mitigating this
is that sendmail presumes a non-responding DNSBL means "not listed" and
thus fails soft.)
It's not clear to me yet how this experiment will turn out, but the
early results are promising enough for me to suggest it to others as a
possible course of action.
10. My best estimate of the performance of all this is that the local
measures (1-7) block about half the spam that is blocked, and the
DNSBLs (8) block the other half of the spam that is blocked. The
blocking rate itself appears to be somewhere around 93% to 97%: it
varies as spammers switch networks or domains, or activate new groups
of zombies.
The false positive rate is about 1 per month; but I need to caveat that
by stating that unreported false positives may still be lurking. (On
the other hand: my users squawk pretty loud and fast when something
goes wrong, so I don't think there are many.)
NOTE: Assessing performance of anti-spam techniques requires measuring
both the FN (false negative: unblocked spam) and FP (false positive:
blocked non-spam) rates. It's easy to drive either to 0; it's hard to
do both at once.
NOTE: Everybody's incoming spam and non-spam mix is different. The only
way to really figure out which of these steps will best minimize (FP,
FN) is to analyze the statistics. But 1, 2, 3, and some of 8 are
nearly always a good first guess, and in some cases, they solve enough
of the problem that further analysis/measures aren't necessary.
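
As a worked illustration of those two measures, the Python sketch below
computes them from four message counts. The function name and the
choice to express both as rates are assumptions; the counts would come
from your own logs and user reports.

```python
def filter_rates(spam_blocked, spam_delivered, ham_blocked, ham_delivered):
    """Return (block rate, false negative rate, false positive rate)."""
    spam_total = spam_blocked + spam_delivered
    ham_total = ham_blocked + ham_delivered
    if spam_total == 0 or ham_total == 0:
        raise ValueError("need at least one spam and one non-spam message")
    block_rate = spam_blocked / spam_total
    fn_rate = spam_delivered / spam_total
    fp_rate = ham_blocked / ham_total
    return block_rate, fn_rate, fp_rate
```

Blocking everything gives a false negative rate of 0 and a false
positive rate of 1; blocking nothing gives the reverse. That is the
sense in which either number alone is easy to drive to 0.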
---Rsk
- - - End Forwarded Message
--
Non-Rsk words Copyright 2004 Angela Kahealani. All rights reserved
without prejudice; UCC1-207. All information and transactions are non
negotiable and are private between the parties.
http://www.kahealani.com/