Log files are a useful tool for webmasters. It helps to know how people are finding your site, and what software they are using to view it, among other things. A strange decision by a small group of bloggers, though, has given unscrupulous marketers another window of opportunity to manipulate search engines to increase their traffic.
The decision made by these short-sighted bloggers was to display, on their site, a list of recent referrers to each page. I can't imagine any reason why a visitor might be in the least bit interested in seeing this, but nevertheless many sites now display referrers on every page.
As search engine spiders visit sites, they grab the contents of each page they visit. They use this snapshot in their index - meaning that although a page may change every minute or two, a search engine may be using a single copy of a page for several days, or even weeks.
So a referral URL that is on a page when the spiders come to visit can have quite a bit of value, if the search engine visiting uses link popularity in any way (Google uses link popularity, as do many others).
So marketers have started to use programs to visit pages using a fake referral header, to get their URLs listed on as many sites as possible, in the hopes that this will increase their traffic.
However, this renders log files almost completely useless. These fake visitors usually arrive from search engines, having searched for a keyphrase relevant to their own site. They skew statistics relating to the number of visitors received, the countries visitors come from, the technology used to view the page, how users found the page, how long they spent on the site ... and so on.
A webmaster who displays these referrers may also find their search engine rankings dropping, or even find that search engines have removed their site completely. Many sites that use spam techniques are quickly identified and penalised, and penalties are often applied to sites that link to them as well.
There are plenty of techniques available for blocking referrer spam, and everyone has their favourite. Personally, I use a combination of two techniques.
The first is fairly simple - my referrer log is not indexable. I don't display referrers on the pages of my site. My referral log is publicly available, but search engines are instructed to ignore it. This removes the main incentive for people to referrer-spam my site (the other reason for this type of spam - the hope that the site owner will themselves visit the spamming URL - is less common, because it has such a low response rate).
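The text above does not say how search engines are told to ignore the log. One common mechanism is a robots.txt "Disallow" rule, so the following is only a sketch under that assumption: the /referrers/ path is hypothetical, and Python's standard urllib.robotparser module is used to check that such a rule hides the log from well-behaved crawlers while leaving articles crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /referrers/ path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /referrers/
"""


def is_crawlable(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if a crawler that honours robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

Calling is_crawlable(ROBOTS_TXT, "/referrers/") checks the log path, and is_crawlable(ROBOTS_TXT, "/block-referrer-spam/") checks an ordinary page. Note that robots.txt only affects crawlers that choose to obey it; it does nothing to stop the spam requests themselves.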
Second, I use an .htaccess file to block requests from whatever I've managed to identify as either a crawler designed to find URLs to spam or a spamming URL. This is a relatively simple blacklist, and though it cannot work as a long-term solution to this problem, it keeps me happy for now.
To implement this technique on your own site, first make sure you are running Apache with mod_rewrite. If you are, create a file called ".htaccess" (just that, not .htaccess.txt or anything else) and paste the following into it:
Update: 14th September 2005
The list below has been expanded substantially over the last year, and now covers much more spam than before. As stated before, this is not a practical solution to the problem in the long term, as this list can only ever get longer and longer, and may become unmaintainable, or even (eventually) slow a site to a crawl as all the rules are processed. However, as of now, it is still a useful tool.
RewriteEngine on
# Block Referrer Spam
# Drugs / Herbal
RewriteCond %{HTTP_REFERER} (sleep-?deprivation) [NC,OR]
RewriteCond %{HTTP_REFERER} (sleep-?disorders) [NC,OR]
RewriteCond %{HTTP_REFERER} (insomnia) [NC,OR]
RewriteCond %{HTTP_REFERER} (phentermine) [NC,OR]
RewriteCond %{HTTP_REFERER} (phentemine) [NC,OR]
RewriteCond %{HTTP_REFERER} (vicodin) [NC,OR]
RewriteCond %{HTTP_REFERER} (hydrocodone) [NC,OR]
RewriteCond %{HTTP_REFERER} (levitra) [NC,OR]
RewriteCond %{HTTP_REFERER} (hgh-) [NC,OR]
RewriteCond %{HTTP_REFERER} (-hgh) [NC,OR]
RewriteCond %{HTTP_REFERER} (ultram-) [NC,OR]
RewriteCond %{HTTP_REFERER} (-ultram) [NC,OR]
RewriteCond %{HTTP_REFERER} (cialis) [NC,OR]
RewriteCond %{HTTP_REFERER} (soma-) [NC,OR]
RewriteCond %{HTTP_REFERER} (-soma) [NC,OR]
RewriteCond %{HTTP_REFERER} (diazepam) [NC,OR]
RewriteCond %{HTTP_REFERER} (gabapentin) [NC,OR]
RewriteCond %{HTTP_REFERER} (celebrex) [NC,OR]
RewriteCond %{HTTP_REFERER} (viagra) [NC,OR]
RewriteCond %{HTTP_REFERER} (fioricet) [NC,OR]
RewriteCond %{HTTP_REFERER} (ambien) [NC,OR]
RewriteCond %{HTTP_REFERER} (valium) [NC,OR]
RewriteCond %{HTTP_REFERER} (zoloft) [NC,OR]
RewriteCond %{HTTP_REFERER} (finasteride) [NC,OR]
RewriteCond %{HTTP_REFERER} (lamisil) [NC,OR]
RewriteCond %{HTTP_REFERER} (meridia) [NC,OR]
RewriteCond %{HTTP_REFERER} (allegra) [NC,OR]
RewriteCond %{HTTP_REFERER} (diflucan) [NC,OR]
RewriteCond %{HTTP_REFERER} (zovirax) [NC,OR]
RewriteCond %{HTTP_REFERER} (valtrex) [NC,OR]
RewriteCond %{HTTP_REFERER} (lipitor) [NC,OR]
RewriteCond %{HTTP_REFERER} (proscar) [NC,OR]
RewriteCond %{HTTP_REFERER} (acyclovir) [NC,OR]
RewriteCond %{HTTP_REFERER} (sildenafil) [NC,OR]
RewriteCond %{HTTP_REFERER} (tadalafil) [NC,OR]
RewriteCond %{HTTP_REFERER} (xenical) [NC,OR]
RewriteCond %{HTTP_REFERER} (melatonin) [NC,OR]
RewriteCond %{HTTP_REFERER} (xanax) [NC,OR]
RewriteCond %{HTTP_REFERER} (herbal) [NC,OR]
RewriteCond %{HTTP_REFERER} (drugs) [NC,OR]
RewriteCond %{HTTP_REFERER} (lortab) [NC,OR]
RewriteCond %{HTTP_REFERER} (adipex) [NC,OR]
RewriteCond %{HTTP_REFERER} (propecia) [NC,OR]
RewriteCond %{HTTP_REFERER} (carisoprodol) [NC,OR]
RewriteCond %{HTTP_REFERER} (tramadol) [NC]
RewriteRule .* - [F]
# Porn
RewriteCond %{HTTP_REFERER} (porno) [NC,OR]
RewriteCond %{HTTP_REFERER} (shemale) [NC,OR]
RewriteCond %{HTTP_REFERER} (gangbang) [NC,OR]
RewriteCond %{HTTP_REFERER} (-cock) [NC,OR]
RewriteCond %{HTTP_REFERER} (-anal) [NC,OR]
RewriteCond %{HTTP_REFERER} (-orgy) [NC,OR]
RewriteCond %{HTTP_REFERER} (cock-) [NC,OR]
RewriteCond %{HTTP_REFERER} (anal-) [NC,OR]
RewriteCond %{HTTP_REFERER} (orgy-) [NC,OR]
RewriteCond %{HTTP_REFERER} (singles-?christian) [NC,OR]
RewriteCond %{HTTP_REFERER} (dating-?christian) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumeating) [NC,OR]
RewriteCond %{HTTP_REFERER} (cream-?pies) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumsucking) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumswapping) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumfilled) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumdripping) [NC,OR]
RewriteCond %{HTTP_REFERER} (krankenversicherung) [NC,OR]
RewriteCond %{HTTP_REFERER} (cumpussy) [NC,OR]
RewriteCond %{HTTP_REFERER} (suckingcum) [NC,OR]
RewriteCond %{HTTP_REFERER} (drippingcum) [NC,OR]
RewriteCond %{HTTP_REFERER} (pussycum) [NC,OR]
RewriteCond %{HTTP_REFERER} (swappingcum) [NC,OR]
RewriteCond %{HTTP_REFERER} (eatingcum) [NC,OR]
RewriteCond %{HTTP_REFERER} (cum-) [NC,OR]
RewriteCond %{HTTP_REFERER} (-cum) [NC,OR]
RewriteCond %{HTTP_REFERER} (sperm) [NC,OR]
RewriteCond %{HTTP_REFERER} (christian-?dating) [NC,OR]
RewriteCond %{HTTP_REFERER} (jewish-?singles) [NC,OR]
RewriteCond %{HTTP_REFERER} (sex-?meetings) [NC,OR]
RewriteCond %{HTTP_REFERER} (swinging) [NC,OR]
RewriteCond %{HTTP_REFERER} (swingers) [NC,OR]
RewriteCond %{HTTP_REFERER} (personals) [NC,OR]
RewriteCond %{HTTP_REFERER} (sleeping) [NC,OR]
RewriteCond %{HTTP_REFERER} (libido) [NC,OR]
RewriteCond %{HTTP_REFERER} (grannies) [NC,OR]
RewriteCond %{HTTP_REFERER} (mature) [NC,OR]
RewriteCond %{HTTP_REFERER} (enhancement) [NC,OR]
RewriteCond %{HTTP_REFERER} (sexual) [NC,OR]
RewriteCond %{HTTP_REFERER} (gay-?teen) [NC,OR]
RewriteCond %{HTTP_REFERER} (teen-?chat) [NC,OR]
RewriteCond %{HTTP_REFERER} (gay-?chat) [NC,OR]
RewriteCond %{HTTP_REFERER} (adult-?finder) [NC,OR]
RewriteCond %{HTTP_REFERER} (adult-?friend) [NC,OR]
RewriteCond %{HTTP_REFERER} (friend-?finder) [NC,OR]
RewriteCond %{HTTP_REFERER} (friend-?adult) [NC,OR]
RewriteCond %{HTTP_REFERER} (finder-?adult) [NC,OR]
RewriteCond %{HTTP_REFERER} (finder-?friend) [NC,OR]
RewriteCond %{HTTP_REFERER} (discrete-?encounters) [NC,OR]
RewriteCond %{HTTP_REFERER} (cheating-?wives) [NC,OR]
RewriteCond %{HTTP_REFERER} (housewives) [NC,OR]
RewriteCond %{HTTP_REFERER} (\-sex\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (xxx) [NC,OR]
RewriteCond %{HTTP_REFERER} (snowballing) [NC]
RewriteRule .* - [F]
# Weight
RewriteCond %{HTTP_REFERER} (fat-) [NC,OR]
RewriteCond %{HTTP_REFERER} (-fat) [NC,OR]
RewriteCond %{HTTP_REFERER} (diet) [NC,OR]
RewriteCond %{HTTP_REFERER} (pills) [NC,OR]
RewriteCond %{HTTP_REFERER} (weight) [NC,OR]
RewriteCond %{HTTP_REFERER} (supplement) [NC]
RewriteRule .* - [F]
# Gambling
RewriteCond %{HTTP_REFERER} (texas-?hold-?em) [NC,OR]
RewriteCond %{HTTP_REFERER} (poker) [NC,OR]
RewriteCond %{HTTP_REFERER} (casino) [NC,OR]
RewriteCond %{HTTP_REFERER} (blackjack) [NC]
RewriteRule .* - [F]
# Loans / Finance
RewriteCond %{HTTP_REFERER} (mortgage) [NC,OR]
RewriteCond %{HTTP_REFERER} (refinancing) [NC,OR]
RewriteCond %{HTTP_REFERER} (cash-?advance) [NC,OR]
RewriteCond %{HTTP_REFERER} (cash-?money) [NC,OR]
RewriteCond %{HTTP_REFERER} (pay-?day) [NC]
RewriteRule .* - [F]
# User Agents
RewriteCond %{HTTP_USER_AGENT} (Program\ Shareware|Fetch\ API\ Request) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Microsoft\ URL\ Control) [NC]
RewriteRule .* - [F]
# Misc / Specific Sites
RewriteCond %{HTTP_REFERER} (netwasgroup\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (nic4u\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (wear4u\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (foxmediasolutions\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (liveplanets\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (aeterna-tech\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (continentaltirebowl\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (chemsymphony\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (infolibria\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (globaleducationeurope\.net) [NC,OR]
RewriteCond %{HTTP_REFERER} (soma\.125mb\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (mitglied\.lycos\.de) [NC,OR]
RewriteCond %{HTTP_REFERER} (jroundup\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (feathersandfurvanlines\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (conecrusher\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (sbj-broadcasting\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (edthompson\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (codychesnutt\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (artsmallforsenate\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (axionfootwear\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (protzonbeer\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (candiria\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (bigsitecity\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (coresat\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (istarthere\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (amateurvoetbal\.net) [NC,OR]
RewriteCond %{HTTP_REFERER} (alleghanyeda\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (xadulthosting\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (datashaping\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (zick\.biz) [NC,OR]
RewriteCond %{HTTP_REFERER} (newprinceton\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (dvdsqueeze\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (xopy\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (webdevboard\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (devaddict\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (eaton-inc\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (whiteguysgroup\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (guestbookz\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (webdevsquare\.com) [NC,OR]
RewriteCond %{HTTP_REFERER} (indfx\.net) [NC,OR]
RewriteCond %{HTTP_REFERER} (snap\.to) [NC,OR]
RewriteCond %{HTTP_REFERER} (2y\.net) [NC,OR]
RewriteCond %{HTTP_REFERER} (astromagia\.info) [NC,OR]
RewriteCond %{HTTP_REFERER} (free-?sms) [NC]
RewriteRule .* - [F]
The above will block just about all of the most common referral spam that I've seen so far. I'm adding to the list constantly (last addition: 14th September 2005) so do check back and see if there are updates if you're using it.
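To make the matching behaviour of the rules above easier to reason about, here is a minimal Python sketch of the same idea: each pattern is searched for anywhere in the referrer, case-insensitively (the [NC] flag), and any single match is enough (the [OR] chain). Only a handful of the patterns are reproduced, the function name is made up for illustration, and Python's regex engine is standing in for Apache's, so treat it as an approximation rather than an exact model.

```python
import re

# A small sample of the patterns from the rules above. "-?" means an
# optional hyphen, exactly as in the RewriteCond lines.
PATTERNS = [
    "phentermine",
    "viagra",
    "texas-?hold-?em",
    "poker",
    "diet",
    r"netwasgroup\.com",
]

# One alternation, compiled case-insensitively (the equivalent of [NC]).
SPAM_REFERRER = re.compile("|".join(f"(?:{p})" for p in PATTERNS), re.IGNORECASE)


def is_blocked(referrer: str) -> bool:
    """Return True if any pattern occurs anywhere in the referrer string."""
    return SPAM_REFERRER.search(referrer) is not None
```

Because the patterns are unanchored substrings, a referrer such as http://example.org/healthy-diet-recipes/ is blocked just as readily as a real spam URL. That is the price of keyword matching: a short list catches a lot of spam, but it can also catch genuine visitors.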
One potential problem with this technique, other than that it will, in time, become useless as too many URLs are added, is that there is always a possibility that authentic visitors will be blocked. So, on this site, instead of each "RewriteRule .* - [F]" line above, I've actually used something a little more user-friendly:
RewriteRule .* bad_referrer.php [L]
Instead of a "Forbidden" message, this displays a quick note explaining why there has been an error and that the user can click on a link to proceed. If you want to check this out for yourself, try visiting http://www.addedbytes.com/swingers/block-referrer-spam/ (note the "swingers" portion of the URL). This page will reload with a new URL. Then try visiting http://www.addedbytes.com/spam/block-referrer-spam/. You should find you get a message explaining what has happened, and a URL to click if you want to proceed.
And there we have it. With minimum effort (for now), referral log spamming on my site has been almost entirely removed. Before adding this set of rules and scripts, I was seeing around 200 fake referrals per day in my log files. Now, I see about 3 or 4 a week. Hopefully, this will continue until I can devise a better way of protecting against this kind of problem - before blacklists become impossible to manage.
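Figures like "200 fake referrals per day" can be estimated from an access log. The sketch below is one possible way to do it, not the method used for the numbers above: it assumes Apache's combined log format, uses a tiny hypothetical keyword list, and silently skips any line it cannot parse (for example, requests containing escaped quotes).

```python
import re
from collections import Counter

# Combined log format (assumed):
# host ident user [day/Mon/year:time zone] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:\]]+):[^\]]*\] '
    r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$'
)

# Hypothetical sample keywords; a real list would be much longer.
KEYWORDS = re.compile(r"poker|casino|viagra", re.IGNORECASE)


def spam_referrals_per_day(lines):
    """Count log entries per day whose referrer contains a spam keyword."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and KEYWORDS.search(match.group("referrer")):
            counts[match.group("day")] += 1
    return counts
```

Passing an open log file to spam_referrals_per_day() returns a Counter keyed by date string, which makes it easy to compare the days before and after a rule change.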









41 Comments
[quote]If you want to check this out for yourself, try visiting http://www.addedbytes.com/swingers/block-referrer-spam/ (note the "swingers" portion of the URL). This page will reload with a new URL. Then try visiting http://www.addedbytes.com/spam/block-referrer-spam/. [/quote]
Both URLs just loaded the page without any additional message for me. Any idea why?
#1, Keith, United Kingdom, 7 January 2005.
Because I worded it badly, I think - you need to click on the relevant link on each page. So from the article, you click on the "swingers" link, then from the "swingers" page, you click the "spam" link again.
You may also not be sending a referrer header, which will stop it working.
#3, Dave Child, United Kingdom, 7 January 2005.
Block requests where the referrer page does not actually contain a link to your page.
#4, charon, Slovakia, 20 January 2005.
It would be really swell if you'd post the code for bad_referrer.php.
#5, Michael Ditto, United States, 1 February 2005.
I've noticed that 95% of the remaining referrer spam coming into my system had one thing in common: they were all looking for the file /adserver/campaign.php (which doesn't exist).
I've developed the following Apache mod_rewrite rule that seems to be quite effective, and just in time for the February log changeover:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (adserver/campaign\.php) [NC]
RewriteCond %{HTTP_REFERER} !=""
RewriteRule ^(.*) %{HTTP_REFERER} [R=301,L]
In plain English, what this says is that if the file requested contains "adserver/campaign.php", and that file doesn't exist on your server either as a file or a directory, and a referrer is set, redirect back to the referrer. Otherwise, proceed normally.
#6, Zed Pobre, United States, 1 February 2005.
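The decision made by the rule in comment #6 can be restated as a short Python sketch. This is only an illustration of the logic as described in plain English there: the function and parameter names are hypothetical, and the literal substring test is a simplification of the regular expression in the RewriteCond.

```python
from typing import Optional


def reflect_target(request_uri: str, exists_on_disk: bool, referrer: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None to handle the request normally."""
    # Case-insensitive match, like the [NC] flag on the RewriteCond.
    probing = "adserver/campaign.php" in request_uri.lower()
    if probing and not exists_on_disk and referrer:
        # Send the client back to the URL it claimed to come from.
        return referrer
    return None
```

All three conditions must hold before anything is redirected, so a site that really does have an adserver/campaign.php file, or a visitor who sends no referrer, is left alone.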
Michael, the code for bad_referrer.php is pretty basic:
<?php
// Link back to the page the visitor asked for; escape it, since both values come from the request.
$url = htmlspecialchars('http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'], ENT_QUOTES);
echo '<strong>Bad Referrer</strong><br><br>Unfortunately, the URL you have visited from appears to be blocked from referring visitors to this site. But don\'t panic! The chances are that if you are a real person this was a mistake by the filtering system. If you want to carry on to the page you were trying to visit, please <a href="' . $url . '">click here</a>.';
Thanks for the addition, Zed. That looks like a pretty useful rule to deal with that sort of thing, and could easily be changed to block those requests for "_vti_bin" URLs, etc.
#7, Dave Child, United Kingdom, 1 February 2005.
Why do you not offer a DVD image to the spammers?
http://ftp.uni-erlangen.de/pub/debian-cd/images/3.0_r3-dvd/ia64/debian-30r3-dvd-ia64-binary-1_NONUS.iso
#8, Marc, Germany, 7 February 2005.
Someone still has to pay for that bandwidth. And there's no reason to suspect they've not got a basic timeout written into the scripts that do the spamming.
Simply blocking them means they are invisible to me - they don't irritate me, and there's the minimum of inconvenience to the users. Which is exactly what I want ...
#9, Dave Child, United Kingdom, 8 February 2005.
Somehow I can't get bad_referrer.php to work. My server keeps giving me a "Bad request" 400 error.
I'm doing the blocking server-wide, so not from .htaccess, but from a separate block.conf in the conf.d directory of Apache 2.*
Do you have an idea on how to do this?
I tried an Alias, but it seems it can't be done that simply with the rewrite rules. Could you explain your code for bad_referrer.php a little more?
#10, Julius, Netherlands, 18 March 2005.
I'm getting a 400 error too... ah well, I'll keep on trucking.
#11, chris, United Kingdom, 18 May 2005.
What about not showing the statistics at all? Put the referring page in a title, or use PHP to generate a GIF image with the referrer as text in the image, or simply password-protect your stats.
#12, krestania, Slovakia, 23 May 2005.
Thanks for the tips. Worked well. poker-4all.com keeps pingback spamming and refer spamming me...
#13, Tyler, United States, 6 July 2005.
Thanks. One question: Does the speed suffer, when .htaccess is very large?
#14, Dirk, Germany, 7 September 2005.
I found several words that were very common in a large amount of my referer spam. I decided to eliminate a chunk of the individual sites by blocking on that word. There is the risk that I will block someone legitimate but I am willing to take the chance to save myself from the annoyance.
<code>
RewriteCond %{HTTP_REFERER} (poker) [NC,OR]
RewriteCond %{HTTP_REFERER} (casino) [NC,OR]
RewriteCond %{HTTP_REFERER} (pharmacy) [NC,OR]
RewriteCond %{HTTP_REFERER} (inkjet) [NC,OR]
RewriteCond %{HTTP_REFERER} (blackjack) [NC,OR]
RewriteCond %{HTTP_REFERER} (diet) [NC,OR]
RewriteCond %{HTTP_REFERER} (drugs) [NC,OR]
RewriteCond %{HTTP_REFERER} (holdem) [NC,OR]
RewriteCond %{HTTP_REFERER} (mortgage) [NC,OR]
RewriteCond %{HTTP_REFERER} (loan) [NC]
RewriteRule .* - [F]
</code>
I did attempt to be selective of what words I did this with. I don't think many sites have inkjet in their url. Could be wrong but again it's worth the chance.
#15, Tim, United States, 14 September 2005.
If anyone's interested, here's my listing;
http://jult.net/txt/blocks
I only use lists this size serverwide, not thru htaccess (that wat it's a load/CPU monster)
#16, Julius B. Thyssen, Netherlands, 24 September 2005.
(that way it indeed is a load/CPU monster)
#17, Julius B. Thyssen, Netherlands, 24 September 2005.
Does an excessively large .htaccess increase your bandwidth usage? My site is down because of referrer-spam flooding. Does this solve this problem?
Thanks for the work! Great!
#18, Struikel, Netherlands, 17 October 2005.
It won't increase your bandwidth but will make your site a little slower.
#19, Dave Child, United Kingdom, 17 October 2005.
Thanks. It seems to work great. Is reflecting the spam back to the referrer like this:
RewriteRule .* %{HTTP_REFERER} [R=301,L]
a smart idea, so that the spammers receive the traffic themselves?
#20, Struikel, Netherlands, 18 October 2005.
Struikel: It might work, however I've no idea if referrer spam bots are able to support 301 redirection (in fact, it's probably a good idea to test this - if they are unable to handle 301s, that would mean we could use that to filter bots from users).
If they did support 301s, redirecting the spam traffic back to the person responsible might well be a good idea.
#21, Dave Child, United Kingdom, 18 October 2005.
Thanks for a good idea for blocking, but it works only partially for me. Even though I have words like "holdem" included in my blacklist, mod_rewrite lets through about half of the referrers. Shame...
#22, Finwe, Czech Republic, 21 October 2005.
ILoveJackDaniel, it seems the "buy viagra online"-Guy has managed to bypass your checks. he he he :)
#23, Jimmy Flirten, Germany, 26 October 2005.
Hi Jimmy.
The problem with posting techniques for dealing with spam is that you tend to become a target for it. I get the occasional piece of comment spam junk like the above (now deleted) and lots of referrer spam that's usually blocked.
#24, Dave Child, United Kingdom, 27 October 2005.
Man, I'm having some serious problems getting this to work. I've tried everything, and it just won't do it.
My current .htaccess looks like this:
-------------
DirectoryIndex index.html index.htm index.php index.shtml
AddType application/x-httpd-php .html .htm
RewriteEngine On
addhandler server-parsed html sssi page shtml htm
<limit GET POST PUT>
order deny,allow
deny from netwasgroup.com
deny from nic4u.com
deny from foto-porno-amatoriale.com
deny from video-porno-anale.com
deny from sborra-sopra-piedi.com
deny from piedi-feticismo.com
deny from sesso-vero-amatoriale.com
deny from sborrate-in-faccia.com
deny from lesbiche-sesso.net
deny from sesso-orale-gratis.net
deny from lesbiche-sesso.net
deny from goodcounter.net
deny from sborra-sopra-piedi.com
allow from all
</limit>
----------
But none of those sites listed are being blocked.
Can anyone help?
#25, Martin, United Kingdom, 14 November 2005.
Can this be added at the bottom of an .htaccess file I already have? I don't want it to affect the IPs that I currently block.
#26, Darrel, United States, 23 November 2005.
Darrel: Yes, as long as you don't repeat the "RewriteEngine on" bit.
#27, Dave Child, United Kingdom, 23 November 2005.
Thanks for the rewrite rules!
#28, Jack, United States, 1 December 2005.
Hi-
This all looks fine, and I've been working on this for days...to block these referer spammers.
But -- somehow, I'm not able to stop logging the fact that these spammers are getting "301", and my logs are filling as before.
(Trying to block a non-existent refer/index.php)
What am I missing?
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (/refer/index.php) [NC,OR]
RewriteCond %{REQUEST_URI} (/refer) [NC]
RewriteCond %{HTTP_REFERER} !=""
RewriteRule ^(.*) %{HTTP_REFERER} [R=301,L]
#29, Rick, United States, 5 December 2005.
For the last month, our site has been getting pulverized with referral spam. We have a blog which is highly ranked in search engines. As a result, we were getting attacked by spammers to the point of over 7000 hits per day by each spammer. Nearly crippled our server. After doing some research and scouring the net, I found this site. THANK YOU SO MUCH! We are now using .htaccess to filter out the junk. It works like a champ! For the first time in several months our server is fast and we have much better control over referral spam. It took a while to dial in our .htaccess file, but it was well worth the time. Thanks again for your great resource.
#30, Adam Berman, United States, 16 January 2006.
My daughter maintained her robotics blog on my site (http://synysys.com/roboblog) for several months until the project came to a conclusion and was eventually taken offline. During its life span, it fell into a bit of neglect and became the target of the referrer spam bots. Today our site still gets over a thousand hits per day looking for the old exploit.
I know this is a bit like closing the barn door after the cows have all run off and perhaps in this case even part of the barn burned down, but I thought I'd share the solution I hacked together today. I wasn't satisfied with a purely mod_rewrite solution since, as others have noted, you still get a one-line log entry in your access log. Essentially my solution is a two-pronged approach. First it uses mod_rewrite to redirect the spammer back to their own machine. Second it puts a DROP entry in my firewall so that they won't be coming back to visit again any time soon. That way my logs aren't filling up with the same old rewrites over and over.
The entry in httpd.conf looks like this
RewriteEngine on
RewriteCond %{QUERY_STRING} disp=stats
RewriteMap referer-deny prg:/etc/httpd/refererdeny.pl
RewriteRule ^(.*)$ ${referer-deny:%{REMOTE_ADDR}}/BITE_ME_SPAMMER? [R,L]
In my case it was a particular query string that typified the bulk of the spam traffic, but you can add other patterns to the above rewrite conditions to suit your own needs.
The PERL script looks like this
#!/usr/bin/perl
$| = 1; # Turn off buffering
while (<STDIN>) {
print "HTTP://",$_;
$b = ("/sbin/iptables -A INPUT -j DROP -p tcp --destination-port 80 -s $_");
system ($b);
open (OUTFILE,">>/etc/httpd/referer.deny");
print OUTFILE ("$b");
close (OUTFILE);
}
The referer.deny "log" looks like this
/sbin/iptables -A INPUT -j DROP -p tcp --destination-port 80 -s 219.93.21.20
/sbin/iptables -A INPUT -j DROP -p tcp --destination-port 80 -s 220.165.140.8
/sbin/iptables -A INPUT -j DROP -p tcp --destination-port 80 -s 83.100.149.29
So you could easily add #!/bin/sh to the head of it and run it as a shell script separately if you wanted. However, before you do that, you should probably sort the file and remove any duplicate entries that may have crept in. I have chosen to only block access to port 80. You could easily add port 25 or even remove the destination-port altogether and completely block them from your site. Just be aware that some clever fool could forge your IP and potentially block you from your own site. Of course you could reduce the output to the log file by substituting $_ for $b and just end up with a list of blocked IPs.
I realize that not all site admins have root access to be able to run the firewall commands, so you might modify this to update a hosts.deny file that you've defined in your own .htaccess configuration. The point is you don't really want to have to manually enter every IP or host name if you are really getting bombarded. Again if you do this, you'll probably want to sort the file and remove dupes. I'd also recommend that you use the DB utility to speed your lookups if you end up with a significant number of blocked hosts. You really don't want to bog down your site with lookups on account of these spammer fools.
Of course one of the problems that I alluded to earlier is that you may end up with unwanted blocks defined in your system. Most hosting environments offer cron access. You might choose to flush the firewall rules over some period. Many of the spammers are running client-based tools from dynamic IP pools on the ISP. Over time you could end up blocking a significant number of IPs that were only used once against your system. Since this system is automated, it's probably safer to clear it out periodically and let it repopulate itself with the bad apples that keep coming back.
I hope this helps someone. It seems to be working wonderfully for our site. My daughter's robotics project was archived as a PDF for those who are looking for it and the spammers trying to exploit the referrer logs aren't stealing my bandwidth or chewing up file space with senseless logs any longer.
Bruce
#31, Bruce MacKay, Canada, 23 January 2006.
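Comment #31 suggests sorting the referer.deny file and removing duplicates before reusing it. One way that clean-up might look in Python is sketched below. It assumes each line has the iptables form shown in that comment, with the address following "-s", and that all addresses are IPv4; the function name is made up for illustration.

```python
import ipaddress


def unique_blocked_ips(lines):
    """Return the distinct source IPs found after "-s", sorted numerically."""
    ips = set()
    for line in lines:
        parts = line.split()
        # Require "-s" to be followed by at least one more token.
        if "-s" in parts[:-1]:
            candidate = parts[parts.index("-s") + 1]
            try:
                ips.add(ipaddress.ip_address(candidate))
            except ValueError:
                continue  # not a valid address, skip the line
    return [str(ip) for ip in sorted(ips)]
```

Sorting the parsed addresses rather than the raw text puts 83.100.149.29 ahead of 219.93.21.20, which a plain string sort would not do, and malformed lines are dropped instead of being passed on to the firewall.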
Thx! Was really useful! I had like 100 referrer spam hits daily!
#32, Srinath, India, 9 February 2006.
I've been running a somewhat modified system to what I documented in comment #31. It's currently blocking 3208 IPs from people who have behaved badly on my system (mostly attempts at referrer spam, but that number also includes SSHD and misc script probes against my HTTPD). The firewall easily handles this many blocks and my web server is much happier with the reduction in load.
The system responds in real time to these attacks, gives them a custom 403 Error page and then blocks their IP. The custom 403 Error page is for non-script users who may be blocked inadvertently. It has a link to a recovery system which unblocks their IP and restores their access. Of course a bot doesn't follow the on-screen instructions and even if it did, it would just get blocked when it started behaving badly. All in all, it seems to be quite an effective solution.
For folks who don't have admin access to the firewall on their server, the system is still quite effective, but you will continue to see all of the attempts in your httpd logs.
If anyone is interested in further details, I'd be more than happy to discuss this via chat or email. You can contact me on www.synysys.com. Anything that we can do to slow these idiots down is a step in the right direction.
Bruce
#33, Bruce, Canada, 4 March 2006.
But still, how are you gonna catch ref.spam with just a list of descriptions? It's an endless road I'm travelling, and I'm getting quite fed up with these idiots doing this.
# the biggest losers ever ( they can't even spell: )
RewriteCond %{HTTP_REFERER} (-nude\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (abrianna\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (amanti\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (anali\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (burdizzo\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (bucetinhas\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (calcinha\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (cogidas\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (esibizioniste\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (fimosis\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (folladas\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (gotico\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (gozadas\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (hargitay\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (loredana\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (mamando\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (minifalda\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (plumprumps\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (porono\.) [NC,OR]
RewriteCond %{HTTP_REFERER} (ramalan\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (stretched\.org) [NC,OR]
RewriteCond %{HTTP_REFERER} (subsonica\.org) [NC]
RewriteRule .* - [F]
#34, Julius, Netherlands, 18 March 2006.
Dammit, I really hate spam.
#35, CortalUX, United Kingdom, 7 May 2006.
Hey folks, sorry for the spam here but I'm wondering how you can tell if this is working? I found a couple of sites referring to different ways to do what you're saying. One is a .conf file in /path/to/conf.d/ where it works with Apache and the other is .htaccess. What I'm seeing is a ton of referrer spam in our access logs which doesn't belong. It's forged, as it's asking for sites that our Apache server doesn't host. I would like to get this out of my log file and put the kibosh on the spamming offender by simply blocking their access (if possible). Any suggestions here?
#36, Rob, United States, 27 May 2006.
I thought some folks might like to take a look at this site for referrer spam:
http://unknowngenius.com/blog/wordpress/ref-karma/
He wrote a neat PHP script to automate updating referrer blocks.
#37, Rob, United States, 8 June 2006.
I had a wrong link coming in from a wrong place. This technique was the cure.
#38, Karma Debugger, Canada, 19 June 2006.
Sentimental and nostalgic. Great.
#39, Vasilii, Japan, 2 October 2006.
I get a TON of spam on my FAQ pages under comments. I have tried to implement this on my site but I can't, seeing as I use vhosts. Any way around this?
#40, Bill, Unknown, 31 October 2006.
I've recently been getting a ton of referral spam. I've taken a bit of a different approach. I've implemented a little PHP that does two things. The first is to check for a cookie. If that cookie doesn't exist, I set it. I then set another cookie with a special value for reference, such as part of my site's domain name. The second check is for the referrer itself. If it's empty, I don't load the rest of the page and give them a chance to fix it, then click a link that will ensure the referrer came from my own site. If they spoof the referrer, they end up in an endless cycle of click here, etc.
The main problem I see with today's referrer spam is that SEO blackhats don't care about getting it in your logs. They care about getting it into Google Analytics. This ensures that Google will harvest their links and search terms that came to your site. Aside from email and comment spamming, I see referral spamming as the next biggest threat to websites.
With my PHP version, you can also add a count to your cookies. If they reach a certain threshold, you can header-location them off your site, either to the referral they try to spam at you, or wherever you want. ;)
#41, DigiP, 17 August 2011.