Blog: Google CTF 2018 - JS Safe 2.0

Last week, Google launched their third annual Capture The Flag competition, filled with challenges ranging from web security to cryptography. This is the first CTF I've joined, and it was quite an interesting experience. While I'm nowhere near qualified to compete, I barely scraped by solving one problem, and it took me the entire two days of the event. Overall, I had fun untangling some obfuscated JavaScript, and learned quite a bit about a few intricacies of this crazy language.

The Safe

The challenge consists of only one file and the following prompt:

You stumbled upon someone's "JS Safe" on the web. It's a simple HTML file that can store secrets in the browser's localStorage. This means that you won't be able to extract any secret from it (the secrets are on the computer of the owner), but it looks like it was hand-crafted to work only with the password of the owner...

Attachment

(Feel free to try out the challenge before reading!)

The file is an HTML file that displays a rotating safe with a password input box. When a password is entered, the words "Access Granted" or "Access Denied" will appear below the safe, depending on if the correct password was entered.

The value of the input box is matched with a simple regex:

password = /^CTF{([0-9a-zA-Z_@!?-]+)}$/.exec(keyhole.value);
if (!password || !x(password[1])) return (document.body.className = 'denied');

This regex checks for a string in the form of CTF{ ... }, which validates the format of the flag. The regex appears on the CTF website for submitting a flag, so I assumed there's no getting around that.

RegExp.exec returns null if the string doesn't match the regex. Otherwise, it returns an array in this format:

Index Description
[0] The matched string (keyhole.value)
[1], ..., [n] The n'th parenthesis match (the string inside CTF{ ... })

Since there's only one parenthesized group in the regex, password[1] is the flag itself. So, in order to pass that check, our password needs be in the form CTF{ ... } and has to return a truthy value for x(password).

Deciphering x()

function x(х) {
  ord = Function.prototype.call.bind(''.charCodeAt);
  chr = String.fromCharCode;
  str = String;
  function h(s) {
    for (i = 0; i != s.length; i++) {
      a = ((typeof a == 'undefined' ? 1 : a) + ord(str(s[i]))) % 65521;
      b = ((typeof b == 'undefined' ? 0 : b) + a) % 65521;
    }
    return chr(b >> 8) + chr(b & 0xff) + chr(a >> 8) + chr(a & 0xff);
  }
  function c(a, b, c) {
    for (i = 0; i != a.length; i++)
      c = (c || '') + chr(ord(str(a[i])) ^ ord(str(b[i % b.length])));
    return c;
  }
  for (a = 0; a != 1000; a++) debugger;
  x = h(str(x));
  source = /Ӈ#7ùª9¨M¤ŸÀ.áÔ¥6¦¨¹.ÿÓÂ.։£JºÓ¹WþʖmãÖÚG¤…¢dÈ9&òªћ#³­1᧨/;
  source.toString = function() {
    return c(source, x);
  };
  try {
    console.log('debug', source);
    with (source) return eval('eval(c(source,x))');
  } catch (e) {}
}

There are a several interesting things to notice at first:

  • The debugger for loop
  • x = h(str(x))
  • The regex source has a .toString() override
  • The "debug" console.log
  • The with keyword
  • The eval inside an eval

Deciphering h(str(x))

My first plan was to run the safe with Chrome's debugger and adding some breakpoints. I'm interested in the value of h(str(x)) as I might get some clues as to what the input becomes.

The debugger line would clog up the script when I have the developer tools open, so I removed that line. Then, I added a breakpoint on the h(str(x)) line and ran the safe, with password CTF{foo}.

x = 'öª³';

I refreshed the page, and tried again with password CTF{bar}

x = 'öª³';

In fact, any password resulted in the same value for x. How is that possible? I checked the value of x before it got overridden:

x = '[Function: x]';

Subtlety #1: If a function and its argument have the same name, the argument will be used inside the function's scope.

The variable x was a reference to the function itself, not the argument! But why?

Subtlety #2: Javascript's source code is treated as a sequence of Unicode characters, so identifiers can contain Unicode characters.

Turns out, the argument х is really the Cyrllic letter х, not the English letter x. So, Javascript interprets them as completely different identifiers.

The easiest way to notice this is to find all instances of the character x (function name) in the source code. (Try searching for x on this page in your browser.) The argument х isn't selected! Another thing to notice is the argument isn't (directly) used in the function. So the value of h(str(x)) remains the same for any password.

When converting a function into a string, the string representation keeps the whitespace of the original function. This means to get the real value of h(str(x)), we need to run x() in its unmodified, one-liner state, including the debugger for loop.

I did it by manually skipping past all 1000 debuggers.

x = h(str(x)) = '‚↵š'

The Unicode value for each of those characters are [130, 30, 10, 154].

Anti-Debugging

Aside from the debugger for loop, the function seems pretty harmless against debuggers.

However, if the developer tools is open when a password is entered, the page freezes and starts to eat up a lot of memory, indicating a possible infinite loop.

This is due to another subtlety in the function.

Subtlety #3: Regex objects don't have the length property.

function c(a, b, c) {
  for (i = 0; i != a.length; i++)
    c = (c || '') + chr(ord(str(a[i])) ^ ord(str(b[i % b.length])));
  return c;
}
c(source, x);

So source.length = undefined. However, the for loop in c() breaks when i != a.length. But a.length = source.length = undefined, so the break is i != undefined. This results in an infinite loop whenever c() is called with the regex object source. Since source.toString() calls c(), source can never be casted in a string either. Unfortunately, this can happen at multiple places:

  • Chrome's debugger displays the current value of all identifiers when at a breakpoint. This is done by getting their string representation. So, breakpoints can't be used.
  • console.log() also casts to string, so the seemingly harmless debug statement will freeze the script too.

However, we should be able to change x() now since it was overridden by h(str(x)). A quick fix is to check if a.length exists in c(), and throw an error if it doesn't.

function c(a, b, c) {
  if (!a.length) throw 'to string';
  for (i = 0; i != a.length; i++)
    c = (c || '') + chr(ord(str(a[i])) ^ ord(str(b[i % b.length])));
  return c;
}

Finally, c() is called inside the nested eval. So it appears like the function never finishes at all. But with the developer tools closed, it terminates. How?

Deciphering "source"

Simply put, the with(expression) keyword takes the expression and makes every property of the expression accessible inside the code block. For example,

with (Math) {
  console.log(PI); // 3.141592653589793
  console.log(cos(0)); // 1
}

Subtlety #4: The RegExp.source property returns a string containing the source text of the regex.

with (source) return eval('eval(c(source,x))');

In this case, source.source = 'Ӈ#7ùª9¨M¤ŸÀ.áÔ¥6¦¨¹.ÿÓÂ.։£JºÓ¹WþʖmãÖÚG¤ ¢dÈ9&òªћ#³­1᧨'. So, the source inside the nested eval actually refers to the string of the regex. More importantly, this doesn't require casting the regex into a string. It avoids calling source.toString() and running into the infinite loop.

return eval('eval(c(source.source,x))');

Nested eval

By now, most of x() can be removed without affecting how it works:

function x(х) {
  ord = Function.prototype.call.bind(''.charCodeAt);
  chr = String.fromCharCode;
  str = String;
  function h(s) {
    for (i = 0; i != s.length; i++) {
      a = ((typeof a == 'undefined' ? 1 : a) + ord(str(s[i]))) % 65521;
      b = ((typeof b == 'undefined' ? 0 : b) + a) % 65521;
    }
    return chr(b >> 8) + chr(b & 0xff) + chr(a >> 8) + chr(a & 0xff);
  }
  function c(a, b, c) {
    if (!a.length) throw 'to string';
    for (i = 0; i != a.length; i++)
      c = (c || '') + chr(ord(str(a[i])) ^ ord(str(b[i % b.length])));
    return c;
  }
  x = '‚↵š';
  source = `Ӈ#7ùª9¨M¤ŸÀ.áÔ¥6¦¨¹.ÿÓÂ.։£JºÓ¹WþʖmãÖÚG¤
¢dÈ9&òªћ#³­1᧨`;
  try {
    return eval('eval(c(source,x))');
  } catch (e) {}
}

The final piece of the puzzle is understanding the nested eval statement. The value of c(source, x) is:

c(source, x) = "х==c('¢×&Ê´cʯ¬$¶³´}ÍÈ´T—©Ð8ͳÍ|Ԝ÷aÈÐÝ&›¨þJ',h(х))//᧢"

That string is valid Javascript, and can be evaluated:

х == c('¢×&Ê´cʯ¬$¶³´}ÍÈ´T—©Ð8ͳÍ|Ԝ÷aÈÐÝ&›¨þJ', h(х)); //᧢

If the equality is true, then the statement simplfies to

eval('eval(true)');
eval('true');
true;

Otherwise, it simplfies to false.

One last point to notice: both х's in the above statement are the Cryllic letter, so it's the original password passed into x()!

So the problem is to find an input where input == c('¢×&Ê´cʯ¬$¶³´}ÍÈ´T—©Ð8ͳÍ|Ԝ÷aÈÐÝ&›¨þJ', h(input)) is true.

Reversing c() and h()

Trivia: h() is a slightly obfuscated implementation of the Adler-32 checksum algorithm, and it returns a 32-bit checksum (normally b << 16 | a).

function h(s) {
  for (i = 0; i != s.length; i++) {
    a = ((typeof a == 'undefined' ? 1 : a) + ord(str(s[i]))) % 65521;
    b = ((typeof b == 'undefined' ? 0 : b) + a) % 65521;
  }
  return chr(b >> 8) + chr(b & 0xff) + chr(a >> 8) + chr(a & 0xff);
}

In our case, it returns a string of four characters. Each of these characters have Unicode values in the range [0, 0xff = 255].

In c(), the function only grabs the Unicode value of the second argument b (which is h(input)). So, we need to find four numbers a,b,c,d in the range [0, 0xff] such that

input ==
  c('¢×&Ê´cʯ¬$¶³´}ÍÈ´T—©Ð8ͳÍ|Ԝ÷aÈÐÝ&›¨þJ', chr(a) + chr(b) + chr(c) + chr(d));

Rather than trying out all 256^4 = 2^32 values, we can make some assumptions.

  1. We know that the password is made up of alphanumberic characters and some punctuation because of the initial regex that checked for CTF{ ... }.
/^CTF{( [0-9a-zA-Z_@!?-]+ )}$/

So, we only need to consider results of c() that only contain the same set of characters.

  1. c() xor's the Unicode value of characters in a with those in b, but we know which index of b will be used to find c[i]:
i a[i] b[i] c[i]
0 a[0]='¢' b[0] a[0]^b[0]
1 a[1]='×' b[1] a[1]^b[1]
2 a[2]='&' b[2] a[2]^b[2]
3 a[3]='' b[3] a[3]^b[3]
4 a[4]='Ê' b[0] a[4]^b[0]
5 a[5]='Ê' b[1] a[5]^b[1]
... ... ... ...

This means c() can be calculated independently in four parts: for each character in b. For example, given b[0], we can find the string c[0] + c[4] + c[8] + .... The characters in this string must also be in the set of the regex above. Hence, we can brute force each character of b separately, which needs 4 * 256 = 1024 checks.

I wrote a script that checks all characters of b and print those that return a string with valid characters.

ord = Function.prototype.call.bind(''.charCodeAt);
chr = String.fromCharCode;
str = String;

// Calculate c(a, b) but only the characters made from b[index]
function test(a, b, index) {
  c = '';
  for (i = index; i < a.length; i += 4)
    c = c + chr(ord(str(a[i])) ^ b[i % b.length]);
  return c;
}

// The valid set of characters
let chars = /^[0-9a-zA-Z_@!?-]+$/;

for (let index = 0; index < 4; index++) {
  for (let i = 0; i <= 0xff; i++) {
    let array = [0, 0, 0, 0];
    array[index] = i;

    let out = test('¢×&Ê´cʯ¬$¶³´}ÍÈ´T—©Ð8ͳÍ|Ԝ÷aÈÐÝ&›¨þJ', array, index);
    let valid = chars.exec(out);
    if (valid) {
      console.log(index, i, out);
    }
  }
}

We get the following output:

0 253 '_7RN5TNa-U'
1 149 'B!9!!EXbHk'
1 153 'N-5--ITnDg'
2 21 '3v1hA-it3_'
3 249 'x3O4n4-1b'

We can generate the original password by interleaving the characters from each index. Since there's two possible values for b[1], we have two choices for the password: _B3x7!v3R91ON!h45!AnTE-4NXi-abt1-H3bUk_ or _N3x7-v3R51ON-h45-AnTI-4NTi-ant1-D3bUg_.

Flag: CTF{_N3x7-v3R51ON-h45-AnTI-4NTi-ant1-D3bUg_}

Overall, there were several tricks used here to pull off this challenge:

  • Using Cryllic letters to confuse identifiers
  • Incorporating the source code during runtime
  • Multiple anti-debugger tricks

    • Calling the debugger repeatedly
    • Executing separate code when the debugger is open
  • Clever variable naming along with the with keyword
  • Reversing a checksum