Okay, so we’re going geeky today, people. Cope. There are two concepts I need to spell out for folks before I get into the code bits, and I’ll try to keep this as light-tech as possible. If you run a website, or have a free site, or just want to post your pictures on the web, you need to know this. If you have your own domain, you need to know this. If you post pictures to a bulletin board, you need to know this. Basically, if you use the internet at all, read this. I’ll let you know when you can stop reading.
Things everyone should know
Things every webmaster should know
Things every ISP should know
Bandwidth
Bandwidth means, for computer users, the data transfer rate, or how much data can be transferred in a given time period. The easiest example here is how you access the net. If you use a modem for dial-up (and I feel for you), then you connect at 14.4, 28.8, 33.6 or 56 kilobits (kb) per second. To give you an idea how small a kilobit is, one letter (that is ‘a’ for example) takes up 8 bits, or one byte, so a kilobit is about 125 letters. That’s not an exact science, but it’ll give you a rough idea. At 14.4, your email downloads at roughly 1,800 letters a second at the very best, which sounds like plenty until you try to download a picture. Which is why dial-up sucks. In the world of computers, bigger bandwidth is better. The more bandwidth, the faster you can download the preview of the new Batman movie.
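If you want to see the dial-up pain in numbers, here is a rough sketch of the arithmetic. It assumes modem speeds are rated in kilobits per second (which is how modems are sold), that a byte is eight bits, and that the line runs at full speed with no overhead, which it never does. The function name and the 1 MB example file are made up for illustration.

```python
def download_seconds(size_bytes: int, link_kbps: float) -> float:
    """Best-case time to move size_bytes over a link rated in kilobits/second."""
    bits = size_bytes * 8              # one byte is eight bits
    return bits / (link_kbps * 1000)   # kilobits/second -> bits/second


# Hypothetical example: a 1 MB file (1,000,000 bytes) at each modem speed.
for speed in (14.4, 28.8, 33.6, 56):
    seconds = download_seconds(1_000_000, speed)
    print(f"{speed:>5} kbps: about {seconds:.0f} seconds for 1 MB")
```

Real connections come in slower than this, so treat the numbers as a floor, not a promise.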
In addition to speed, bandwidth also means how much data you can transfer in a given time period. This website has an allocation of 30 gigabytes of data per month, and we average about 5. My other website has the same allocation and averages 18. If I go over my data transfer for a given month, I can either pay out the nose for extra bandwidth, or I can let the site be shut down till the next month. The reason this is important to know is that if you run a website, every time a page loads, you use bandwidth. On a site like Yahoo! GeoCities (or geoshitties if you’re a nerd like me), you get 3 GB/month. Yeah, you think that’s great, but it really sucks if you want to post things like a blog and people click there a lot. This aspect of bandwidth is the reason why most sites I design are low on the graphics. More graphics means more data transferred means more bandwidth used. In the case of data transfer allocation, a bigger site does not equal a better one, though bigger bandwidth is king.
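To put the monthly allowance in perspective, here is a back-of-the-envelope sketch. The 200 KB page weight and the 30-day month are assumptions I picked for illustration, and the function names are hypothetical. The 3 GB and 30 GB figures are the allowances mentioned above.

```python
GB = 1_000_000_000  # decimal gigabyte; hosts may count slightly differently


def monthly_transfer_bytes(page_bytes: int, views_per_day: int, days: int = 30) -> int:
    """Total bytes served in a month if every view loads the whole page."""
    return page_bytes * views_per_day * days


def max_daily_views(allowance_bytes: int, page_bytes: int, days: int = 30) -> int:
    """How many full page loads per day fit inside the monthly allowance."""
    return allowance_bytes // (page_bytes * days)


# Assumed page weight: 200 KB of HTML plus images.
page = 200_000
print(max_daily_views(3 * GB, page))    # views/day on a 3 GB plan
print(max_daily_views(30 * GB, page))   # views/day on a 30 GB plan
```

Halve the page weight and the number of visitors you can afford doubles, which is the whole argument for going easy on the graphics.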
Then again, the bigger your site, the longer it takes to download, and the less time it takes for people on 56k to get pissed and tell you that you suck. Finding a web design that balances your dream design against speed is why people like me have jobs.
In summation: Bandwidth controls how fast you can view the net from your home, as well as how much data a website can share with the world each month. Having more bandwidth is better all the time, but forcing users to use more bandwidth with image-heavy sites and poorly coded web pages is not cool.
Hotlinking
Hotlinking is embedding a graphic from someone else’s webpage directly in your own site. This is also called bandwidth theft. Directly linking to a website’s files (images, video, etc.) means that when someone accesses your website, they draw bandwidth from the other one. If you use an <IMG> tag to show a picture from someone else’s page on your blog, forum post, or website, that’s hotlinking. You’re stealing their bandwidth.
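If you’re not sure whether an image on your page counts, the test is simply whether the picture is served from a different host than the page showing it. Here is a minimal sketch of that check. The function name and the example.com addresses are hypothetical, and it compares hostnames exactly, so www.example.com and example.com would count as two different sites.

```python
from urllib.parse import urljoin, urlparse


def is_hotlink(page_url: str, img_src: str) -> bool:
    """True when the image would be fetched from a different host than the page."""
    page_host = urlparse(page_url).hostname
    # Resolve relative paths like "images/cat.jpg" against the page address.
    img_host = urlparse(urljoin(page_url, img_src)).hostname
    return page_host != img_host


# A relative src stays on your own host: not a hotlink.
print(is_hotlink("http://example.com/blog/post.html", "images/cat.jpg"))
# A src pointing at someone else's server: hotlink.
print(is_hotlink("http://example.com/blog/post.html",
                 "http://ipstenu.org/images/jojo.jpg"))
```

A plain text link to the other site never loads their file on your page, so it never shows up as an image src in the first place.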
There is a case in which this sort of ‘theft’ is ethically permissible, though some webhosts don’t like it. If you have multiple Yahoo! sites, and one is low on bandwidth, you can shuttle some of your content to the other site, and thus split up the bandwidth. This isn’t always a good idea: if it’s against the Terms of Service on your host, they can kill your site. Which is why you should always back up your websites on your own computer. If you own your own domains (like I do) and have multiple ‘subdomains,’ then it’s okay to share an image. ipstenu.org is considered a different website than ipstenu.org/blog, so I have to tell my server it’s okay to share between the two. But that’s code geeky.
What the common websurfer needs to know is this: direct linking to a picture, movie file, or any other content on someone else’s site, unless it’s a simple URL link to that site, is bad form, ethically asinine, and impolite. It’s akin to stealing electricity from your neighbor by plugging into their outlets.
In summation: Hotlinking is stealing bandwidth from someone else’s website, and is considered to be unethical.
Things every webmaster should know
Now that you’ve gotten this far, we’re going into heavy geekitude. I have actually once had my site nearly shut down because someone was hotlinking to an image, and I had to figure out how to prevent it. This is the knowledge I share with you.
Hotlink Prevention for Apache
Apache is the de facto webserver for Unix. I don’t like IIS (Windows webserver) and so few people use Netscape’s webserver, I won’t even consider that anymore. Pretty much, I use Apache and if you don’t, I haven’t a clue how to help you.
On Apache (and in theory this works on IIS, but as I said, I don’t use it), there is a file in the root of your html folder called .htaccess. This is an Apache directives file, or a config file, that controls how Apache handles the folder the .htaccess file lives in and everything underneath it. Your website has a folder, usually called ‘public_html’. Inside that folder you have things like a file named index.shtml and a folder named cgi-bin. Below is an example of what my webserver’s root public_html folder might look like.
.htaccess
blog
index.shtml
images
cgi-bin
robots.txt
folder1
folder2
foldern
The .htaccess file controls how the subfolders (blog, images, cgi-bin, folder1, folder2, and foldern) are handled. If I look at my .htaccess file, and you can open yours up in your text editor of choice, I see this at the very bottom:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https://ipstenu.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://ipstenu.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://ipstenu.org.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://ipstenu.org.*$ [NC]
RewriteRule .*\.(jpg|jpeg|gif|png|bmp)$ - [F,NC]
This means that I’m telling Apache to turn on mod_rewrite (that’s the ‘RewriteEngine on’ line) and to only serve the images when the request comes from my own pages (the HTTP_REFERER) or has no referer at all. The image types I protect are listed in the ‘RewriteRule.’ I could use a pattern like ‘jpe?g’ to cover both jpg and jpeg, but I know what the file extensions are for the files on my server, and I cheat that way. If I wanted to be really mean, and didn’t worry so much about my bandwidth, I’d change the last line to
RewriteRule .*\.(jpg|jpeg|gif|png|bmp)$ images/nohotlink.gif [L]
so that when you try and link to /images/jojo.jpg, you’d get some witty image about how hotlinking is wrong.
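If the rewrite conditions read like line noise, here is the same decision sketched in Python. This is my own approximation of the logic, not how Apache implements it. The function and constant names are hypothetical, and I’ve folded the four referer conditions into one pattern (with the dot escaped) since they overlap.

```python
import re

# Referers that are allowed: anything starting with http(s)://ipstenu.org
ALLOWED_REFERER = re.compile(r"^https?://ipstenu\.org.*$", re.IGNORECASE)
# Requests the rule applies to: the listed image extensions, any case ([NC])
IMAGE_PATH = re.compile(r".*\.(jpg|jpeg|gif|png|bmp)$", re.IGNORECASE)


def is_forbidden(path: str, referer: str) -> bool:
    """Approximate the .htaccess rules: True means the request gets a 403."""
    if not IMAGE_PATH.match(path):
        return False   # the rule only looks at image files
    if referer == "":
        return False   # blank referer is let through (the !^$ condition)
    if ALLOWED_REFERER.match(referer):
        return False   # request came from my own pages
    return True        # an image, asked for by someone else's page


print(is_forbidden("/images/jojo.jpg", "http://example.com/forum"))   # hotlinker
print(is_forbidden("/images/jojo.jpg", "https://ipstenu.org/blog/"))  # my own page
print(is_forbidden("/images/jojo.jpg", ""))                           # no referer
```

Two things fall out of writing it this way. A referer of http://www.ipstenu.org/ would be refused, because the pattern wants the bare domain. And the pattern only checks how the referer starts, so something like http://ipstenu.org.example.com/ would slip through. If either matters on your site, tighten the pattern.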
I actually do that on my other server, but the gif I use is 2k so it’s not something I worry about. It also makes it easy for me to later go back and see who’s been hitting that particular GIF and find the mean people. Yes, I have been known to send nasty notes to them.
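For the curious, here is roughly how I’d go hunting for the culprits. It’s a sketch that assumes your host writes Apache’s Combined Log Format, which includes the referer. The sample lines, addresses, and function name are invented for the example. Depending on how your rewrite is set up, the log may record the image that was originally asked for and not the substitute GIF, so the path to look for is a parameter.

```python
import re
from collections import Counter

# Combined Log Format: host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referer>[^"]*)"'
)


def referers_for(lines, path):
    """Count which referers requested the given path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == path:
            counts[match.group("referer")] += 1
    return counts


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2004:13:55:36 -0700] "GET /images/nohotlink.gif HTTP/1.1" 200 2048 "http://forum.example.com/thread/42" "Mozilla/4.0"',
    '203.0.113.9 - - [10/Oct/2004:13:56:01 -0700] "GET /images/nohotlink.gif HTTP/1.1" 200 2048 "http://forum.example.com/thread/42" "Mozilla/4.0"',
    '198.51.100.7 - - [10/Oct/2004:14:02:10 -0700] "GET /index.shtml HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
]

for referer, hits in referers_for(SAMPLE_LOG, "/images/nohotlink.gif").most_common():
    print(hits, referer)
```

The referers at the top of that list are the pages doing the hotlinking, and the people who get the nasty notes.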
Keep in mind, as with any .htaccess rewrites, you may block some legitimate traffic (such as users behind proxies or privacy firewalls) using these techniques.
Now here’s the big problem. Not all ISPs let you use the Rewrite mod! Half the reason I switched to my current provider was hotlinking (the other was SQL). The rewrite mod (module, don’t you know?) “provides a rule-based rewriting engine to rewrite requested URLs on the fly.” It’s totally magic, and I secretly adore it. It’s complex as fuck, though, and I still don’t really get all that it does. I do know that it works.
You’re a fucking bastard if you don’t let your users use mod_rewrite.
Was that harsh? Sorry, I mean to say ‘You don’t give a rat’s ass about bandwidth if you don’t let your users use this.’ I’m well aware there are security ‘concerns’ about what mean people can do with it, but let’s face it, if someone’s smart enough to figure out everything you can do with mod_rewrite, then you’re in trouble anyway. There is a performance hit as every request is checked against the rewrite rules, so if you’re running an image-intensive site, this can suck. But the performance cost is, to me, minimal next to the bandwidth you save.
Look, if a user has a website with images, and some dickhead out there is hotlinking to that user’s images, then you, the ISP, have to handle the bandwidth crisis, and the pissy user asking you why he can’t use this feature to stop the dickheads.
And speaking of security, I can’t find any hack for it. So if the fear is ‘really smart, but really evil people utilizing my server for nefarious purposes,’ I think that should be pretty low on the list. I’d put ‘spammer’ and ‘virus distributor’ ahead of it.
Hotlinking can act like a DDoS attack, and if there’s ever a way to prevent it, by G-d, do it! The mod takes five fucking minutes to install.
SimpleNet, I’m looking at you.