I wish pain and disfiguration upon all comment spammers

Jun 8, 2011

I haven't been able to update this blog in quite a while, since I've been spending all my (diminishing) free time fighting comment spammers. In case it's not painfully obvious, this is my first real blog, and I went ahead and added a comment section since, of course, all blogs should have comment sections. I actually got one legitimate comment before the Viagra, Tramadol, and Cialis peddlers found out I had an open comment section and started posting an average of FIFTY comment spam messages per day.

If you're not familiar with the phenomenon of comment spam: any website that allows its readers to post anonymous comments is almost immediately inundated by posts from shady drug pushers who are trying to get people to buy name-brand drugs at "low prices" online. As with the telemarketers of the '90s, I can't believe that enough people fall for these scams to make them worthwhile, but judging from the volume of comment spam I'm getting, it must be pretty lucrative.

In a way, I sort of bring this on myself. My hosting provider provides blogging software (it's actually part of what I'm paying for), but rather than take advantage of the professional blogging software that presumably includes some level of spam filtering, I prefer to do everything myself. They won't let me have SSH access, so I begrudgingly use their Apache server, but if I had full access to the box, I would have compiled and configured my own server, too.

So of course I wrote my own Perl scripts and configured my own MySQL DB to store and retrieve comments. I could have created a blog with TypePad, for not much more money, instead of hosting it myself, but writing scripts is a fun and interesting experience, especially for somebody like me, whose only professional programming experience has been in Assembler, C, C++, and (for twelve years now) Java. I like learning new things, and I like building things, so I went ahead and put together my own little content management system.

... and immediately got socked with the reason most people let professional blogging software manage their blogs these days. Man, these comment spammers are relentless. I've been fighting them for weeks. Every morning I log on, and I've gotten between 40 and 50 comments, all from online drug manufacturers. I think their ultimate business model is to drive me crazy to the point where I need Xanax — and believe me, if I do, I know where to get it cheap, without a prescription.

At first I tried blocking their IP addresses. Surely there can't be that many different IP addresses posting comment spam, right? Well, the comment spammers had a good laugh at my attempts to block their IP addresses. I even wrote an administrative CGI script to automate the process of blocking IP addresses I was able to identify conclusively as spammers; these guys were a huge help in identifying the biggest offenders. But after blocking 25 individual IP addresses without even putting a dent in the amount of spam I was receiving, I moved on to content filtering.
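
For anyone curious what that kind of per-address blocking looks like, here's a minimal sketch. It's in Python rather than the Perl the blog actually runs on, and the function names and the one-address-per-line blocklist format are assumptions made up for illustration, not the real scripts.

```python
import ipaddress


def load_blocklist(lines):
    """Parse a blocklist with one IP address per line (hypothetical format).

    Blank lines and lines starting with '#' are ignored.
    """
    blocked = set()
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        blocked.add(ipaddress.ip_address(entry))
    return blocked


def is_blocked(remote_addr, blocked):
    """Return True if the poster's address is on the blocklist."""
    return ipaddress.ip_address(remote_addr) in blocked


# Usage: addresses come from the documentation-only ranges.
blocked = load_blocklist(["# known spammers", "192.0.2.10", "", " 198.51.100.4 "])
print(is_blocked("192.0.2.10", blocked))
print(is_blocked("192.0.2.11", blocked))
```

The weakness is visible right in the usage lines: a spammer who moves to a neighboring address sails straight through, which is why blocking individual addresses barely made a dent.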

I started that today; we'll see how that works out for me. In the meantime, I'm working on another post about Apache configuration, which is just about ready to publish. Let me know if you've had any experiences with comment spammers and what you were able to do without resorting to CAPTCHAs or commercial software.

wheels, 2011-07-13
My comment spam runs in cycles. Currently, I'm in the middle of receiving a lot of home/business/credit loan spam, but I've also received drug spam and term paper spam from time to time. I'm on a WordPress-based platform, so a lot is filtered out for me, but I still get from 1 to 10 messages a day that I have to approve or disapprove (as if I'd really approve any of it). I've seen articles in the trade journals that estimate that between 70 and 90 percent of all internet traffic is spam-related. Imagine how much faster the net would run if we didn't have to deal with that.
Josh, 2011-07-18
I share your pain, wheels. I thought I had them squashed, but they seem to have found their way back in; now they're spamming my comment section for no reason at all, since they're not even providing links to the products they're trying to sell. Sigh...
sum, 2018-12-03
It's not slow enough for it to matter anyway.
Maisyn, 2011-08-09
Begun, the great internet education has.
Nephi, 2011-12-07
I feel satisfied after reading that one.
lean mass, 2013-01-19
What are the laws as to using company logos in blog posts?
winstrol, 2013-02-20
magnificent submit, very informative. I'm wondering why the opposite experts of this sector don't notice this. You should continue your writing. I'm sure, www.commandlinefanatic.com have a huge readers' base already!
Clen, 2013-04-20
You have observed very interesting points ! ps decent website . "I hate music, especially when it's played." by Jimmy Durante. ;) ;)
Robertlig, 2014-02-08
I've been surfing online more than 4 hours today, yet I never found any interesting article like yours. It's pretty worth enough for me. Personally, if all site owners and bloggers made good content as you did, the internet will be much more useful than ever before.
SasQ, 2014-07-12
There's a quick'n'easy trick to cut out most of the comment spam: in your comment form, add some field(s) with a description saying that they have to be left blank. Then, in the script which receives the data from the comment form, simply check whether this field is actually blank. If not, it has almost certainly been sent by a robot, because a human being would leave this field blank. Spambots, on the other hand, tend to fill all possible fields with some text, just in case, since they usually don't know which of these fields will turn out to be visible on your blog. You can also hide these redundant fields with a CSS stylesheet, by adding some random ID to their tags and referring to them from the CSS with the style "display: none". Then they won't show up at all in a CSS-enabled browser, and won't bother your human users. Bots, on the other hand, rarely understand CSS, so these additional fields will be accessible to them, and they will fall into this trap by filling them up with their spam. Then you can even automate your IP banning mechanism, by automatically adding to your ban list all the IP addresses which filled in these additional fields.
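
A rough sketch of the server-side half of that trick, for illustration only. It is written in Python rather than the blog's Perl, and the field name "website", the function names, and the in-memory ban set are all hypothetical choices, not anything taken from the blog's actual scripts. It assumes the form is submitted as a URL-encoded body.

```python
from urllib.parse import parse_qs

# Hypothetical name for the hidden "leave this blank" form field.
HONEYPOT_FIELD = "website"


def is_probable_bot(form_body):
    """Return True if the honeypot field arrived with any non-blank value."""
    fields = parse_qs(form_body, keep_blank_values=True)
    values = fields.get(HONEYPOT_FIELD, [""])
    return any(value.strip() for value in values)


def handle_comment(form_body, remote_ip, banned_ips):
    """Reject banned posters, and ban any address that fills in the honeypot."""
    if remote_ip in banned_ips:
        return "rejected"
    if is_probable_bot(form_body):
        banned_ips.add(remote_ip)  # the automated banning step described above
        return "rejected"
    return "accepted"


# Usage: a human leaves the hidden field empty, a bot fills it in.
banned = set()
print(handle_comment("name=Ann&comment=Nice+post&website=", "192.0.2.1", banned))
print(handle_comment("name=x&comment=pills&website=http%3A%2F%2Fspam.example", "203.0.113.7", banned))
```

One caveat with the automatic ban: a browser autofill feature could conceivably populate a hidden field for a real person, so a field name less tempting than "website" may be a safer choice.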
Josh, 2014-10-21
Well, thanks for the tip! I got a sudden surge of hundreds of spam comments per day last week so I went ahead and followed your advice. So far so good!
My Book

I'm the author of the book "Implementing SSL/TLS Using Cryptography and PKI". Like the title says, this is a from-the-ground-up examination of the SSL protocol that provides security, integrity and privacy to most application-level internet protocols, most notably HTTP. I include the source code to a complete working SSL implementation, including the most popular cryptographic algorithms (DES, 3DES, RC4, AES, RSA, DSA, Diffie-Hellman, HMAC, MD5, SHA-1, SHA-256, and ECC), and show how they all fit together to provide transport-layer security.

Joshua Davies
