chir.ag/tech [archive]

ARCHIVE: Anatomy of typical spam

Sat. Jul 15th 2006, 12:36pm:

I'm often bombarded by over 1000 spam emails per hour on two of my dedicated servers that each host 30-40 domains. Using multiple DNS blacklists and custom filters helps cut that number down significantly, until the spammers find new IPs and new ways to bypass my pattern-match text filters. Every once in a while I look at a few spam emails to see if I can identify new patterns and block the most common ones.

I randomly selected one of the spams I hadn't yet deleted from my Thunderbird junk mail folder and decided to pick it apart. Today, instead of going after common spam words and typical spam subject lines, I wondered if there is a pattern in the headers, the way a junk email travels, that can help me identify and stop spam from IP addresses that haven't been blocked yet. It wasn't until I started running IP WHOIS and domain WHOIS on everything in the email that the truly global nature of spam hit me.

Here's the email I analyzed with complete headers and only a few details [edited] by me. Hover on a link to see the country information. Clicking on links will not do anything, no matter how tempting it is to have perfect sex in your life.

Envelope-to: [edited]@chime.tv
Delivery-date: Sat, 15 Jul 2006 12:02:09 -0400
Received: from [60.6.118.44] (helo=glorymail.com)
	by server.chime.tv with smtp
	id 1G2mal-00062M-W3
	for [edited]@chime.tv; Sat, 15 Jul 2006 12:02:09 -0400
Received: from 200.23.242.202
  (SquirrelMail authenticated user
         [edited]@colourconfidence.com);
  by glorymail.com with HTTP id Ab44qw9z008048783;
  Sat, 15 Jul 2006 16:07:43 +0000
Message-Id: <IOOCIh.squirrel@200.23.242.202>
Date: Sat, 15 Jul 2006 16:07:43 +0000
Subject: Can you satisfy your girlfriend?
From: "Derek" <[edited]@colourconfidence.com>
To: <[edited]@chime.tv>
User-Agent: SquirrelMail/1.4.3a
X-Mailer: SquirrelMail/1.4.3a
MIME-Version: 1.0
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Antivirus: AVG for E-mail 7.1.394 [268.10.1/389]

<html> <body> Or you are afraid that she meets with someone who is better than you in bed? Use licensed Viagra and Cialis pills from our drug store. Now you are the best, you have perfect sex in your life! That is US Druq store with quaIity ED medications!<br><br>

CIick here: <a href="http://nnqthf.calmcrush.com/?couacwanoghy"> US Drugs onIine store</a><br><br>

To make your life better .Even your sexuaI partner won't know you are using Viaqra if you'll buy it here.<br><br>

CONFlDENTlAL and SECURE purchase .lnstant shipping! </body> </html>

If that techno-mumbo-jumbo was too dry for your taste, here's the itinerary of the adventurous Ms. S. Pam: She sets sail from China [60.6.118.44] en route to the United States [chime.tv ?]. While reminiscing about her voyage, she strongly recommends you shop at this one store, USDrugs Ltd, that has offices in New York, US and Mumbai, India. And be sure to let the store know she sent you [referrer ID: couacwanoghy] so she gets some brownie points ($$$). The store [calmcrush.com] is in fact owned by a Croatian. Not satisfied with a story of travelling through just a few countries, she makes up a saucy encounter to tease you. Before leaving China, she claims, she had been vacationing in Mexico [200.23.242.202]. It was there that she met a colorful Englishman [colourconfidence.com] and they hit it off. In fact, they had such a glorious time, he even accompanied her half-way into the US [glorymail.com]. And now, she's at your doorstep, awaiting your warm welcome.

Back to techno-babble: what can I do about it? This one single email shows links to China, the United States, India, Croatia, Mexico, and the United Kingdom! The email claims to have originated in Mexico from someone using the web-based SquirrelMail. Unless I'm mistaken, it's saying that a user from colourconfidence.com authenticated with (i.e. logged into) SquirrelMail hosted on glorymail.com servers. In my experience, unless colourconfidence.com and glorymail.com are on the same server (definitely not, according to their IPs), it's not possible for SquirrelMail to authenticate. I often log in to SquirrelMail on https://abcd.com using user@xyz.com, but only because both abcd.com and xyz.com are hosted on the same server.
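Collecting everything worth running through WHOIS can be scripted. Here's a rough sketch, assuming two deliberately crude regular expressions and an abbreviated copy of the email above. The function name `lookup_targets` is made up for this illustration, and patterns this loose will over-match on other messages.

```python
import re

# Abbreviated copy of the spam above; the [edited] placeholders are kept as-is.
RAW = """\
Received: from [60.6.118.44] (helo=glorymail.com)
\tby server.chime.tv with smtp
Received: from 200.23.242.202
  (SquirrelMail authenticated user [edited]@colourconfidence.com);
  by glorymail.com with HTTP id Ab44qw9z008048783;
From: "Derek" <[edited]@colourconfidence.com>

CIick here: <a href="http://nnqthf.calmcrush.com/?couacwanoghy"> US Drugs onIine store</a>
"""

# Crude patterns: four dotted number groups, and dotted names ending in letters.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
HOST = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)

def lookup_targets(raw):
    """Return (ips, hosts) found anywhere in the raw message, de-duplicated and sorted."""
    ips = sorted(set(IPV4.findall(raw)))
    hosts = sorted({h.lower() for h in HOST.findall(raw)})
    return ips, hosts

ips, hosts = lookup_targets(RAW)
print(ips)    # ['200.23.242.202', '60.6.118.44']
print(hosts)  # ['colourconfidence.com', 'glorymail.com', 'nnqthf.calmcrush.com', 'server.chime.tv']
```

Each entry in those two lists is one WHOIS lookup, which is where the list of countries comes from.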

Analyzing which servers an email has been through (all the Received: headers) seems pretty much useless because everything except the last one can be faked quite easily. One thing each mail server can do while receiving email is verify whether the IP that is sending the email is the same one that received the email in the previous Received: header. E.g., in the above case, is the server that received the supposedly "original" email [glorymail.com] the same as the one that is sending it [60.6.118.44] on to the next one? If not, then either [60.6.118.44] is lying or it is some internally chained server setup that receives emails using [glorymail.com: 69.25.142.7] and forwards them using [60.6.118.44]. I'm sure there are setups like this, but they would normally not be on entirely different IP blocks: [60.6.118.44] vs [69.25.142.7]. Allowing the receiving and forwarding IPs to differ only within a subnet mask of 255.255.0.0 could in fact work to minimize faking of Received headers.
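The same-block test is easy to sketch with Python's standard `ipaddress` module. This is only an illustration of the idea, not something any mail server is known to do: the function name `same_block` is made up, the second address in the last line is hypothetical, and a real check would first have to resolve the hostname from the previous Received: header.

```python
import ipaddress

def same_block(sending_ip, receiving_ip, prefix_len=16):
    """Return True if both IPs sit in the same block (a /16 is netmask 255.255.0.0)."""
    block = ipaddress.ip_network(f"{receiving_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(sending_ip) in block

# Values from the email above: glorymail.com [69.25.142.7] claims to have
# received the message, but [60.6.118.44] is what connected to my server.
print(same_block("60.6.118.44", "69.25.142.7"))   # False: the header is suspect

# Hypothetical address in the same /16, standing in for a chained internal setup.
print(same_block("69.25.10.20", "69.25.142.7"))   # True
```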

Of course, there's not much spammers gain by faking headers except to confuse a few servers. Once potentially fake headers are regularly blocked, spammers will stop faking and just hit the recipient's server directly, like the next email shows:

Delivery-date: Sat, 15 Jul 2006 14:33:10 -0400
Received: from [88.64.177.85] (helo=S-3QY5813Q55LG1T)
	by server.chime.tv with esmtp
	id 1G1o0v-0005WJ-O7
	for [edited]@chime.tv; Sat, 15 Jul 2006 14:33:09 -0400
From: "Buford" <[edited]@popstar.com>
To: <[edited]@chime.tv>
Subject: Get the freshest But without any results
Date: Sat, 15 Jul 2006 20:36:25 +0200

Yo! Masculine performance has never been so easy to increase with these products. Order our magical stuff now for the amazing prices, and we will dispatch it right away

World famous brands which keep men happy all over the world

See our offer: http://www.sherifidk.com

We thank you for being interested in our products

All I know here with certainty is that [88.64.177.85] sent me that spam. [88.64.177.85] isn't on SBL/XBL or any other major DNS blacklist. I'm sure it will get on a blacklist if the spam continues, but that is of no immediate benefit to me as my mailbox has already received 14 spams from that IP address.
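For anyone who hasn't seen how a DNS blacklist check works: the IP's octets are reversed, the list's zone is appended, and the resulting name is looked up in DNS. A minimal sketch follows, assuming `zen.spamhaus.org` as the zone (pass any other list as `zone`) and assuming that an answer in 127.0.0.x means "listed". Both function names are made up here, and some lists refuse queries that arrive through large public resolvers, so treat the result as a hint.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to look up: reversed octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone="zen.spamhaus.org"):
    """Return True if the list answers with a 127.0.0.x address (needs network access)."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False  # no such name: the IP is not on this list
    # Assumption: answers outside 127.0.0.x are error codes, not listings.
    return answer.startswith("127.0.0.")

print(dnsbl_query_name("88.64.177.85"))  # 85.177.64.88.zen.spamhaus.org
```

Calling `is_listed("88.64.177.85")` performs the actual lookup. The answer depends on the list's current contents, so no result is shown here.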

Spam is such a big problem because SMTP is so simple. With just a few headers (or commands), anyone can send an email to anyone. It's easy to identify spam when it has lots of headers that raise red flags. But when it's as simple as a "Hey let's have lunch at this restaurant: http://URL" email that your friend might send, how can you block it? More and more I'm noticing, there are no spammy keywords. No Viagra or Cialis. Or \/1aGr4 either. It's plain text with simple URLs with real domains in the "From" header.

Any filter that blocks spam this simple will also block legitimate emails. There are so many ideas on how to block spam but none works for every occasion. E.g., hold off emails from first-time senders till they click on a URL to verify their authenticity. Great, more work for the sender, and impossible for automated senders (i.e. emails from online merchants) to verify. Now the receiver has to add every automated sender to some list. Of course, spammers can fake the From address to be the same as that of a major auto-sender: checkout@amazon.com. Well, now you start verifying if the IP of a sender that claims to be checkout@amazon.com is the same as that of amazon.com [72.21.206.5]. What happens when Amazon migrates its servers to a new IP block? Everyone in the world, change your filters. Filters, even highly intelligent ones, are not the answer.

Then there's Sender Policy Framework. It is promising, though it has its issues. The biggest problem is, of course, getting the entire world to use it: making every sysadmin in the world spend an hour and a half publishing SPF records in DNS and setting up their mail servers to check them is asking for a bit too much. It can happen, but not anytime soon. Especially since Microsoft has, as usual, come up with their own way of doing this: Sender ID.
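The core idea of SPF is that the domain owner publishes, in a DNS TXT record, which IPs may send mail for the domain, so a server migration means one record change by the owner instead of filter changes by everyone else. Here's a minimal sketch of the receiving side, assuming the record has already been fetched. The record below is hypothetical (it is not Amazon's real record), the function name `spf_ip_check` is made up, and only the `ip4`/`ip6` and `all` mechanisms are handled. The full specification also covers `a`, `mx`, `include`, and more, all of which need further DNS lookups.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_ip_check(record, sender_ip):
    """Evaluate only the ip4/ip6 and 'all' mechanisms of an SPF record, left to right."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # a mechanism without a qualifier means "pass"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if network.version == ip.version and ip in network:
                return RESULTS[qualifier]
        elif name == "all":
            return RESULTS[qualifier]
        # Other mechanisms (a, mx, include, ...) are skipped in this sketch.
    return "neutral"  # nothing matched

# Hypothetical record, built around the amazon.com address quoted earlier.
RECORD = "v=spf1 ip4:72.21.206.0/24 -all"

print(spf_ip_check(RECORD, "72.21.206.5"))    # pass
print(spf_ip_check(RECORD, "88.64.177.85"))   # fail
```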

Comparing the implementation, adoption, and technical qualities of these and the many other email authentication frameworks is not my intention today. Today, I just wanted to show how globalized spam is, from origination to destination, from point of sale to the hidden beneficiaries. Moreover, it is next to impossible to fix the problem of spam based solely on the content of individual emails, or by catching the people who make money through spam. As long as catching spammers costs more than sending spam, it's not going to happen. Just as corruption has to be fixed from outside the system, so does spam. While the tech luminaries fight over which framework to adopt globally, I keep getting 40 spams each time I hit "Get Mail."

All I want them to do is just pick one so everyone in the world can use it. I don't care if we choose the easy but inferior framework or the difficult and efficient one. Just get on it already. Don't pull an RSS vs Atom, HD-DVD vs Blu-Ray, Kari vs Scottie for no good reason.

Can you stop my spam already? kthxbye!