How To Use Your .htaccess File To Keep Spammers Out
Spammers have a knack for finding ways around even the most secure parts of a system, including parts that are not readily recognized as potential targets. The .htaccess file can be used to keep e-mail harvesters away. This is considered very effective because these harvesters identify themselves in some way through the User-Agent string they send with each request, which gives .htaccess the means to block them.

Spam Countered by .htaccess

Bad bots are spiders that do a lot more harm than good to a site; an e-mail harvester is one example. Site rippers are offline browsing programs that a surfer may unleash on a site to crawl and download every one of its pages for offline viewing. Both can drive up a site’s bandwidth and resource usage, even to the point of crashing the site’s server. Since bad bots typically ignore the wishes expressed in a site’s robots.txt file, they can instead be banned in .htaccess, essentially by identifying them. A block of rules can be inserted into the .htaccess file to block many of the known bad bots and site rippers currently in existence. Affected bots receive a 403 Forbidden error when they attempt to view a protected site. This usually results in significant bandwidth savings and a decrease in server resource usage.

Bandwidth stealing, commonly referred to as hot linking in the web community, means linking directly to non-HTML objects that are not on one’s own server, such as images and CSS files. The victim’s server is robbed of bandwidth and money while the perpetrator enjoys showing content without having to pay for its delivery. Hot linking to one’s own server can be disallowed with .htaccess. Anyone who attempts to link to an image or CSS file on a protected site is either blocked or served different content.
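A minimal sketch of the kind of bad-bot block described above, assuming the mod_rewrite module is available on the server. The user-agent names are only a small illustrative sample of harvesters and site rippers, not a complete or current blacklist; a real list would be much longer and would need regular updating.

```apache
# Sketch only: refuse requests from a few example harvesters and site rippers.
# The names below are an illustrative sample, not a complete blacklist.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match any of these names anywhere in the User-Agent string,
    # ignoring case (NC).
    RewriteCond %{HTTP_USER_AGENT} (EmailSiphon|EmailCollector|WebZIP|HTTrack) [NC]

    # "-" leaves the URL unchanged; F answers with 403 Forbidden.
    RewriteRule ^ - [F]
</IfModule>
```

Because the match is on the User-Agent string alone, this only stops bots that announce themselves under one of the listed names.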
Being blocked usually means a failed request that shows up as a broken image, while an example of different content would be an image of an angry man, presumably to send a clear message to the violators. The mod_rewrite module must be enabled on one’s server for this aspect of .htaccess to work. Disabling hot linking of certain file types requires adding rules to the .htaccess file, which is uploaded either to the root directory or to a particular subdirectory to limit the effect to just one section of the site.

A server is typically set to prevent directory listing. If this is not the case, the required line should be added to the .htaccess file of the image directory so that nothing in that directory can be listed.

The .htaccess file can also reliably password protect directories on websites.
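The hot-link protection and directory-listing settings discussed above could be sketched as follows. The domain example.com, the list of file extensions, and the replacement image path /images/nohotlink.png are placeholders chosen for illustration, not values from this article. Turning off listings with Options also assumes the host permits that override in .htaccess.

```apache
# Sketch only: example.com and /images/nohotlink.png are placeholder values.

# Turn off automatic directory listings for this directory and below.
Options -Indexes

<IfModule mod_rewrite.c>
    RewriteEngine On

    # Skip requests that carry no Referer header (direct visits, some proxies).
    RewriteCond %{HTTP_REFERER} !^$
    # Skip requests whose Referer is this site itself.
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
    # Any other request for these file types gets 403 Forbidden.
    RewriteRule \.(gif|jpe?g|png|css)$ - [F,NC]

    # Alternative to the rule above: serve a replacement image instead of
    # blocking. The extra condition keeps the replacement image itself
    # from being rewritten again.
    # RewriteCond %{HTTP_REFERER} !^$
    # RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
    # RewriteCond %{REQUEST_URI} !/images/nohotlink\.png$
    # RewriteRule \.(gif|jpe?g|png)$ /images/nohotlink.png [NC,L]
</IfModule>
```

Placing the file in a subdirectory such as an images folder, rather than the root, limits both settings to that part of the site.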
Other options can be used, but .htaccess protection is enforced by the web server itself on every request to the directory. Anyone wishing to get into the directory must know the password, and no “back doors” are provided. Password protection using .htaccess requires adding the appropriate lines to the .htaccess file in the directory to be protected. Password protecting a directory takes a little more work than the other functions of .htaccess, because a file containing the usernames and passwords that are allowed access has to be created. This file can be placed anywhere on the server, although it is advisable to store it outside the web root so that it cannot be accessed from the web.
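A sketch of the lines involved, assuming HTTP Basic authentication and a password file created with Apache’s htpasswd utility. The realm text, the user name alice, and the path /home/example/.htpasswd are placeholders; the path should point to wherever the password file is actually stored, ideally outside the web root.

```apache
# Sketch only: the realm text, user name, and file path are placeholders.
# The password file can be created beforehand with Apache's htpasswd tool,
# for example:  htpasswd -c /home/example/.htpasswd alice

AuthType Basic
AuthName "Restricted Area"
# Absolute path to the password file, kept outside the web root.
AuthUserFile /home/example/.htpasswd
# Admit only users listed in the password file who supply a valid password.
Require valid-user
```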
Recommended Practices to Deter Spam

Avoiding the publication of referrers is one way of discouraging spammers. There is little point in sending spoofed requests to a blog when the referrer information is never published. Unfortunately, most bloggers believe that being able to click on a link such as “sites referring to me” is a neat feature and have not evaluated its detrimental effect on the whole blogosphere. If publishing referrers is a definite must, the blog software should have built-in support for a referral spam blacklist, and the referrers page should be listed in robots.txt. This tells Googlebot and its relatives not to crawl the referrers page, so spammers are unable to get the PageRank they seek. This only works, however, when referrers are published separately from the rest of the site’s content. The use of rel="nofollow" likewise denies spammers their desired PageRank, at the link level and not just at the page level as with robots.txt.