Links: Injections and Rejections
Simon Pitt
Tuesday 19th January
At the end of the last article we had a look at the ways websites form their web addresses out of words, numbers and other variables. The BBC uses simple numbers while sites like The Guardian use a variation on the title of the article.
It's not long, though, before things start getting complicated. Take Amazon, for example. Let's say I'm looking for a TV. I click on the "Audio, TV & Home Theatre" button, and suddenly my URL is this:
The URI is sending loads of variables here. If I were to send this to a friend they'd have no idea what they'd be looking at. Admittedly, there's some text at the beginning, but after that there's just a load of gobbledegook. It's long, unwieldy and easy to get wrong or break.
There's something else as well. The text at the beginning: "Sound-Vision-Boxes-Cables-Instruments" appears to be useful like the Guardian URIs (I should point out, the Guardian isn't the only website that has discovered how to form human readable URIs). However, this is misleading. Let's take this book as an example: The Image Dissector by Berthold Horn:
There we go: you can see in the URL what's coming up, there's the text, right in the middle. The only thing is that the text there is completely arbitrary. It doesn't affect what's going to be displayed at all. If you delete it all and put a space in the middle, you'll end up on exactly the same page as you would anyway:
In fact, you can even put different text in there, you'll still end up on the same page:
Still going to that same page.
The site isn't using any of that text at all. The database generates it and puts it there when forming the links, but when the page comes to read the URI, it doesn't look at it. It just asks: "what's the variable at the end?" The URI gives it the code, in this case "B0007EKRSY"; the page looks that up in the database and returns the contents.
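As a rough illustration of that behaviour, here is a minimal Python sketch of a page that ignores the readable text and only reads the code at the end. The function name `product_code` and the example.com addresses are made up for illustration; this is not Amazon's actual code or URL layout.

```python
from urllib.parse import urlsplit


def product_code(uri):
    """Return the last non-empty path segment, ignoring everything before it."""
    path = urlsplit(uri).path
    segments = [segment for segment in path.split("/") if segment]
    # Whatever readable text came earlier in the path is never looked at.
    return segments[-1] if segments else None


# Hypothetical addresses: the readable part differs, the code does not.
print(product_code("https://www.example.com/The-Image-Dissector/item/B0007EKRSY"))
print(product_code("https://www.example.com/Anything-You-Like/item/B0007EKRSY"))
```

Both calls hand back the same code, which is why both addresses would lead to the same page.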
Once you know a bit about Amazon URIs there are a few other tricks you can do.
Here's an example. If you want to browse toys and games you'll go to this page:
"Node" is the name of a variable referring to a category, and "468292" is the ID of that category.
Bags and Accessories, for example, have an ID of "362353011", so to get to that page you'd swap "468292" for "362353011":
Now, here's the clever bit. Let's say you wanted to see discount toys. What you'd need to know is the name of the variable that refers to the discount. Luckily I do; it's "pct-off" (an abbreviation of percent off).
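If it helps to see how a page might read these variables, here is a small Python sketch using the standard `urllib.parse` module. The address is a made-up example.com one, the lower-case spelling `node` is an assumption, and so is the `98-99` value format; only the variable names and the category ID come from the text above.

```python
from urllib.parse import urlsplit, parse_qs


def query_variables(uri):
    """Return the variables after the '?' as a simple name -> value dictionary."""
    query = urlsplit(uri).query
    # parse_qs returns a list of values per name; keep the first of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical address carrying a category ID and a discount range.
print(query_variables("https://www.example.com/b/?node=468292&pct-off=98-99"))
```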
So, we can now build our own URI. Here's a worked example for you:
It is the birthday of Little Timmy, the son of our reserved Uncle Ernie. Obviously, since Ernie is such a reserved character we don't see him or Little Timmy that often, so we don't really want to spend that much money on the boy. We only want to look at toys and games that have a lot off; let's say between 98% and 99% off. (We're only going to this party because of "family politics" anyway. If we had our way, we'd stay at home and watch Heroes.)
So, what do we do? We take the URI from earlier:
then we add an ampersand to tell it there's another variable coming. Then we add the variable name. Finally, we set that equal to what we want. Putting that all together we get:
Try it. You'll see that Little Timmy is probably going to be doing a lot of colouring.
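The recipe above (take the URI, add an ampersand, add the variable name, set it equal to a value) can also be sketched in Python. This is only an illustration: `set_query_variable` is a hypothetical helper, the example.com address stands in for the real one, and the `98-99` value format is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def set_query_variable(uri, name, value):
    """Return uri with the query variable `name` set to `value`."""
    parts = urlsplit(uri)
    # Keep every existing variable except the one being set.
    pairs = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != name]
    pairs.append((name, value))
    # urlencode joins the pairs with '&' and escapes awkward characters.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


base = "https://www.example.com/b/?node=468292"  # hypothetical toys and games page
print(set_query_variable(base, "pct-off", "98-99"))   # add the discount variable
print(set_query_variable(base, "node", "362353011"))  # swap to another category
```

Building the address this way rather than gluing strings together means the ampersands and any special characters are handled for you.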
Now this works for anything. Amazon don't always display these pages (they link to some when they're trying to market something), but the pages are generated dynamically; all you're doing is querying the database. The information is always in the database, so you can always pull it out. Valentine's Day is coming up, so why not get your loved one something discounted? All you need to know is that the ID for jewellery is 193716031. Stick that into your URI and go to:
Technically, this sort of URI editing is a close cousin of SQL Injection rather than the real thing, since we are only changing values the site expects to receive. An SQL Injection is when a user manually edits what is sent to the database so that it runs a different command from the one intended. What we've done here is harmless, but an injection can be used malevolently.
Sometimes databases run commands containing information in a link. For example, with this page:
the database runs a command like this:
SELECT * FROM Articles WHERE ID='56'
It gets the 56 directly from the URI, by asking what q is. But say someone changed what q was equal to. Say they changed the address to:
www.imagedissectors.com/article.php?q=56' OR ID='50
Now q = "56' OR ID='50". The computer automatically puts this into the command and runs:
SELECT * FROM Articles WHERE ID='56' OR ID='50'
Incidentally, if you were hoping to hack this site, I should point out that this won't actually work. The code on this site checks for this sort of thing before running the command, but not all sites do. Here's another example.
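Before the second example, a minimal sketch of the first one may help. It uses Python's built-in `sqlite3` module with a throwaway in-memory database; the table contents and function names are invented for illustration, and the injected value is written as `56' OR ID='50` so that the resulting command is valid SQL. The second function shows one common way of "checking for this sort of thing": handing the value to the database as a parameter instead of pasting it into the command. I am not claiming this is how any particular site does it.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two made-up articles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Articles (ID TEXT, Title TEXT)")
    conn.executemany(
        "INSERT INTO Articles VALUES (?, ?)",
        [("50", "Unpublished draft"), ("56", "Published article")],
    )
    return conn


def fetch_unsafe(conn, q):
    # Vulnerable: q is pasted straight into the command text.
    sql = "SELECT * FROM Articles WHERE ID='" + q + "'"
    return conn.execute(sql).fetchall()


def fetch_safe(conn, q):
    # Parameterised: q is passed as data and can never become part of the command.
    return conn.execute("SELECT * FROM Articles WHERE ID=?", (q,)).fetchall()


conn = make_demo_db()
injected = "56' OR ID='50"
print(fetch_unsafe(conn, injected))  # both articles come back
print(fetch_safe(conn, injected))    # nothing matches that odd ID
```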
Say a site had this URI:
and put this into:
SELECT * FROM articles WHERE ID='2'
If someone edited the URI to say "users" rather than "articles":
The site would automatically run the command:
SELECT * FROM users WHERE ID='2'
And if the usernames and passwords were stored in a table called "users", the details of user 2 would now be displayed on the screen. By changing the ID as well, someone could work through every member of your website. Suddenly, just by editing the URI, someone has found out their names and passwords. Quite worrying if you're a bank.
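A table name can't be passed to the database as an ordinary parameter, so one common defence for this second case is to compare it against a fixed list of tables the page is allowed to show. The sketch below assumes a hypothetical `fetch_row` function and a made-up allowlist containing only "articles"; it is one way of doing it, not a description of any real site.

```python
import sqlite3

# Only tables named here may ever be queried from a URI (assumed list).
ALLOWED_TABLES = {"articles"}


def fetch_row(conn, table, row_id):
    """Fetch a row by ID, refusing any table that is not on the allowlist."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table!r}")
    # The table name is now known to be safe; the ID still goes in as a parameter.
    sql = f"SELECT * FROM {table} WHERE ID=?"
    return conn.execute(sql, (row_id,)).fetchall()
```

With this in place, a URI edited to say "users" raises an error instead of running a query against the users table.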
Of course, this is only a simple example, but there are plenty of sites that give you more complex guides if you get stuck trying to break into your bank and steal all your money.
As so often, XKCD has a witty example of this, where a mother gives her child an improbable name containing a semicolon just so he will cause an SQL injection when staff enter his name into their database.
This all sounds very theoretical, but it certainly isn't. Just last month, for example, a hacker breached RockYou! and extracted the usernames and passwords of 32 million users using an SQL Injection. In August, the United States Justice Department charged Albert Gonzalez with the theft of 130 million credit card numbers. In 2006 Russian hackers broke into another website and stole credit card information. Wikipedia's long but boring list contains even more cases of hackers editing URIs or other data inputs to extract information. The scale of theft that SQL Injections allow is overwhelming. And all just by editing the web address.
To be continued. Next time, we'll have a look at relative and absolute links, the danger of spaces and brackets, and suggest some reasons why that link someone emailed to you doesn't work.