Google CTF | Reverse a Cellular Automata

This is a writeup for one of the challenges from the Google CTF which I participated in a few weeks ago. My friend Chris and I were finally able to get this flag after a lot of hitting our heads together. The Reverse a Cellular Automata challenge (may not be up in the future) was a lot of fun.

This crypto category flag gave us the base64 encoded flag and the encryption key after one iteration of Wolfram’s rule 126 was ran on it. Our job was to move backward one step and get the actual key.

Wolfram Rule 126

This rule is a bit similar to the Game of Life. In this case, take an n bit number, for example 101110, and run through the following rules:

  • For every bit, if the value is the SAME as each of its two neighbors, its next value is 0
  • Otherwise, its value is 1
  • The least/most significant bits are neighbors with each other

So in the example above, if we’re dealing with 101110 and run rule126 on it, our next value would be: 111011.

It’s trivial to go forward one step, it’s a bit more difficult to go backward which is what this problem requires.

Reverse Rule 126

We didn’t prove this, but pragmatically speaking, given a number N there is at least one ancestor n' where rule126(n') == N. Let’s call the set of all values of n' where rule126(n') == N‘. It seems that, in many cases |N'| is rather high. That is, the number of potential values of n' ∈ N' is large.

The most obvious way to solve this problem would be to brute force it. But since the encryption key is 64 bits this would take about 545 years so we decided to go a different route. We needed to limit our search space.

In order to do so, we thought about the rule126 rules in reverse. For instance, what does it mean if a given bit in N is a 0? Well, it means that in the previous ancestor n' that bit was the same as each of its neighbors. In other words, if we saw a 0 in bit b_1 then in n' bits {b_0,b_1,b_2} were either all 0 or all 1. The opposite is true if we encounter a 1 in b_1. That is, bits {b_0,b_1,b_2} were NOT either all 0 or all 1.

As we talked this through, it became apparent that we could use a SAT solver to limit our search space.

SAT Solvers

SAT Solvers, or satisfiability solvers, take a number of constraints and determine if the constraints are satisfiable. That is, if there is some solution which will satisfy all of the constraints.

In order to use one, we needed to write our rules above as a series of constraints. Let’s say we have three bits: A, B, and C where B is some bit b_i and A and C are B‘s left and right neighbors respectively. If the value of B is 0 then we know that in n':

In other words, either all of those bits are on or they’re all off.

Similarly if the value of B is 1 then:

Unfortunately, SAT solvers do not take logical propositions like the ones above. They need to take constraints in something called Conjunctive Normal Form (CNF). Luckily, any valid logical proposition can be written in CNF and there are lots of tools to automatically rewrite propositions in CNF.

to_cnf

We were able to use a python library called sympy which has the ability to rewrite propositional logic into CNF. This was ideal for our use case, and we were able to quickly generate our logic above into equivalent logic in CNF. This was then fed into our sat solver and we were on our way. I won’t go into all of the rewrites, but see our final solution below to get a feel for how this worked.

Getting Ancestors

Running our solver gave us gave us 10,753 different possible values for n' which was honestly much larger than I expected. Not to worry though, we saved each valid n' as a binary key and then ran our decryption script with each key using the following code:

In other words, if the decryption had an error we directed it to /dev/null, otherwise we checked to see if CTF was in the output and we had our flag. The decryption code above was given to us in the question.

And we got it! Though it’s not really incredibly helpful, I thought it was interesting:

Solution Code

Links

From another CTF Player:

RxJS Lap Button (From Koutnik’s RxJS book)

In the first chapter of Randall Koutnik’s book Building Reactive Websites with RxJS he has a challenge where he asks the reader to add a lap button to the stopwatch project. This lap button essentially freezes the output while continuing to count behind the scenes. I couldn’t figure this out for a little while, and found no online resources to help me out. After some head banging I was able to figure out a solution:

I thought this was a pretty neat solution! Here’s the breakdown. lapClick$ is an Observable<boolean> which starts with false. Whenever it receives an emission, the scan will invert its current condition. That is, when the lap button is clicked once it becomes true then false etc.

Next, we use combineLatest to actually see into lapClick$ in our main function. We filter based on if the lapClick$ is true or not, if it’s not we continue with our normal logic. That’s it! I thought this was a fun solution but it took some thinking to actually arrive at the answer.

How to Secure Your WordPress Website Against Russian Hackers

I woke up on Tuesday to a strange email from my WordPress website:

That’s weird, I don’t remember signing up a new user onto my site. Turns out, as is tradition, WordPress had some kind of vulnerability which allowed my site to get pwned. I needed to investigate further.

When I first went to my site it was redirecting to some insecure 3rd party website (in my haste to fix it I lost the actual url, but it was something like getfreetraffic.com ). But since I couldn’t actually access the backend of WP I had no more information. I continued on to the access logs.

Around the time the emails were sent I saw some really strange log entries:

The 185 IP address is from the netherlands and the 192 IP appears to be from LA. It looks like someone was trying to take advantage of xmlrpc and perhaps the jetpack plugin to pwn my site.

Also worth of note was that almost immediately after this happened I saw a TON of GET requests from an ahrefs.com robot for very pornographic search terms. Most of which were hilarious but I probably shouldn’t post on my site regardless. If I had to guess, the attacker took over my site to forwarded all traffic to a site they control and serve ads. They then sent a bunch of traffic there via ahrefs to try and get my site to show up for long tail pornographic keywords. They then hoped that people would click through my site which would redirect them to the site they controlled and give them ad revenue. Very convoluted but hey, maybe it would work!

Now that I have a theory as to what happened, I was still not any closer to actually fixing the problem.

After having spent many years working on WordPress websites, I had a good idea of where to look for malicious redirections first. So I looked through all of the theme files in wp-content to see if anything looked odd. This proved unsuccessful. I grep‘d the entire directory for terms like traffic and redirect and free to no avail. For the moment it seemed like the problem wasn’t in the site files.

Next was the database. I looked in wp_options and almost immediately saw the problem: the attacker changed my site_url and home values to this getfreetraffic site. Bingo! Once I changed this back to my website things started to work again and I could access the backend of my website once again.

Hardening My Site

I’ve been meaning to secure my website for a long time now, and this was the perfect excuse to actually do so. So after spending some time to make sure that the attackers didn’t leave any further backdoors, I went off securing my site.

The first thing I did was to not make WordPress accessible to the world anymore. Unfortunately WordPress is not secure and there’s nothing I can do about it, I’ve now restricted it to a few IP addresses with a username/password. Using the simply static plugin, I made it so my public facing site is now static and needs to be rebuilt whenever I update the site. This required a lot of apache2 server config but now that it’s setup it’s dead simple to use. Just hit generate site whenever I post and we’re done. Now the only IPs that have access to WordPress are me, which should make it a lot harder to exploit any WP vulnerabilities in the future.

Oh, I also deleted the jetpack plugin and disabled xmlrpc site wide. A little overkill since the IPs are so restricted but I don’t trust these mechanisms anymore so it just felt right to do.

So, wat?

TL;DR it looks like someone named Devid in Russia was running an automated script looking for sites running a vulnerable version of WordPress and/or the Jetpack plugin. They found my site and redirected its (pretty much nonexistent) to get ad monies. I hardened my site so they can’t do this anymore. Hopefully, in the near future, I will stop seeing so many pornographic terms being searched for on my site. Sorry WordPress, you are not to be trusted.

A Simple ROP Exploit | poor_canary writeup

HXP 2018 has a “baby” challenge called poor_canary which was my first actual ROP exploit. If you want to follow along you’ll need to download and install the hxp 2018 vm image and install it locally. If you’re on a mac, you might need to port forward in order to get the image working properly. I have the following in my .zshrc:

That way I can just open up a new screen with my vm running and run hxp to have my vm serve the hxp site to localhost:8888 and if I want to forward one of the services I can run something like hxp 18113 and then nc localhost 18113 will work as expected. I’d like to give a huge shout out to the ENOFLAG team who’s writeup helped me comprehend this challenge. Now that that’s settled, let’s jump in.

Pwntools

I’d never used pwntools before, but it’s something that is completely necessary for this challenge. I set it up in a virtual environment and just source that anytime I want to use it.

Solution

Looking at the c code provided shows us that this is a buffer overflow exploit, but the title leads us to believe that the stack has a canary in it. Looking at the binary in binary ninja confirms this hunch.

__stack_chk_guard check the canary

We can also see that the function pointer to systemis saved after main so we can guess that this address should be somewhere in the binary.

Further analysis of the binary shows us that __libc_system is located at address 0x0016d90.

Now that we know this information, we need to start messing with payloads to send to the binary. Running the command pwn template --host 127.0.0.1 --port 18113 ./canary will generate code to connect to a remote host and send payloads to it. This is part of pwntools. Once we connect to the remote, we can send some code like this:

What does this actually do? Well looking at the c code we can see that we are allocated a 40 character buffer, but we can read in 0x60 characters. It’s also important to note that many canary’s start with 0x00 (why? because this makes it harder to get the canary because this acts like a null terminator). But, in our binary, if we just send 41 characters this will overwrite the null byte and send the remainder of the canary! Then we add back 0x00to get our canary and we’re ready to move forward.

Next comes finding the string /bin/sh, since this is what we’d ideally like to pass to system to spawn a shell. So let’s use ropper to see if there is anything in our binary.

Well that was easy; ropper rocks!

Now, where do we put this string in order to have it be run by system? Using godbold to help us we can look at the ARM assembly for system("/bin/sh");.

Great, so we push our command into register r0 and then system will use that as its argument. Now let’s find some rop gadgets that put things into r0. Keep in mind that ARM is a little weird and uses the concept of regliststo push and pop multiple registers at the same time.

pop {r0, r4, pc}looks ideal. This will take the next three words off the stack and save them into r0, r4, and pc respectively. And with this we can construct our payload! We have 40 bytes of garbage to fill up the buffer, the canary, 12 bytes of garbage (this I just took from the other writeup, if someone could explain exactly how to get this 12 byte offset for the return address that would be really helpful), our ROP gadget, the address of /bin/sh, 4 bytes of garbage (we don’t care what gets saved into r4), and finally the address of system.

To fully lay it out, this will overwrite the return address to be our ROP gadget, which will then pop the address of /bin/sh into r0 and change the PC to the system function. This is essentially calling system("/bin/sh");. Our full exploit script looks like so:

And this gives us access! When we run:

Trie as you Might

A trie is a data structure that makes it easy to search for words or prefixes in a corpus. The structure of the trie reduces the complexity of searching for a word or a prefix down from O(n) where n is the number of words in the dictionary to O(m)where m is the length of the query [1]. Pretty sweet, huh?

Visualization of a Trie, from wikipedia.

So when I ran across a leetcode question which needed a trie, I realized that I’d never actually written one before. Hence this blog post. Before I get into my implementation I want to give a huge shout out to Shubhadeep Roychowdhury who wrote an awesome blog post on the trie structure. I based a lot of my code on his implementation of the trie.

With that being said, I built a few additions to Roychowdhury’s trie. First of all there is a Trie class itself which contains TrieNode‘s. I also built out the __contains__ magic method so you can search within your trie using the in keyword. So here we go:

Side note, see those type hints?! Apparently in python 3.5 they added them as ways to help indicate the types that are expected (and returned) from functions. Python is still totally dynamically typed, but they serve as hints to the reader.

Anyway, the code is pretty self explanatory but if you have any questions please leave them in a comment below. Also, if you use this code note that it is not battle tested, so act accordingly.

Topological Sort

Leetcode had another good question this morning which involved using topological sort, which I hadn’t heard of until today. According to the wiki: “a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering”. In other words, given a bunch of vertices and directed edges, a topological sort finds an ordering where parent nodes always occur before child nodes in the graph. There may be more than one correct answer to this type of problem.

For example, the graph:

could have multiple answers. According to wikipedia:

  • 5, 7, 3, 11, 8, 2, 9, 10 (visual left-to-right, top-to-bottom)
  • 3, 5, 7, 8, 11, 2, 9, 10 (smallest-numbered available vertex first)
  • 5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
  • 7, 5, 11, 3, 10, 8, 9, 2 (largest-numbered available vertex first)
  • 5, 7, 11, 2, 3, 8, 9, 10 (attempting top-to-bottom, left-to-right)
  • 3, 7, 8, 5, 11, 10, 2, 9 (arbitrary)

are all valid topological sorts.

Now that we know a bit about topological sorting, let’s move on to the question.

The Question

The question is asking about prerequisites for courses. It asks if there is a valid ordering to take the classes based upon the list of prereqs. In other words, is there a valid way to topologically sort the given nodes based on the edges of prerequisites.

A Solution

I adapted the topological sort from Wikipedia in my solution, I also built a simple Node class which has a status to help with the sort. This is an accepted solution:

The main thing that’s different here is that my visit function will return False if it ever encounters a node marked T in any of its recursive calls. In other words, if there is a cycle the entire function will return False and my canFinish function will also return False. If we can successfully traverse our nodes, that means that there is a valid topological sort and we can return True. Note how easily this could be changed to instead return a valid topological sort.

Complexity

I build the list of nodes, which will happen in O(n) time where n is the number of courses. I also have to add all of the edges, which will happen in O(e) time where e is the number of prerequisite edges. I then perform the topological sort which is linear with regard to n. I can’t think of a valid graph where e > n, but an invalid graph could contain more prerequisite edges than the number of courses. Therefore, I suggest that the time complexity is O(max(n, e)).

For space, I store n nodes and e edges. The edges are not necessarily dependent on the number of nodes. I suggest the space complexity is O(n+e).

Leetcode | Compare Version Numbers

I decided to do an easy one today, but it still had a few good tricks! Here is the prompt:

Ok, seems straightforward enough. Let’s think this through. How are we comparing version numbers? Well, the left-most number has the most priority, one to the right the second most priority, etc… we should just be able to compare these version numbers in order from left to right. Right?

Well, one caveat we need to consider is what happens if one version number is longer than another. We could check the lengths and return the longer one… but there is an edge case there that doesn’t work out (consider 1.0 and 1). So what else can we do?

Simply, just make them the same length! Zero-pad the shorter entry on the right so that they’re the same length. Then we can simply compare. Take a look:

I could clean up my padd function a bit, but it gets the job done. And that’s it!

Complexity Analysis

Let’s see, we may have to pad one of the numbers with zeros, which will take O(d) time where d is the difference in length between the two strings. Then we have to go through the length of the split version numbers once. This is O(n). This term will dominate the zero padding so I think it’s safe to ignore it. Therefore the time complexity is O(n) where n is the size of the longest version number after splitting.

For space, we save two split version numbers which is by definition O(n) as well. Therefore the space complexity is O(n) as well.

I think if we didn’t split into arrays and instead just did the comparisons on the original strings we could get this to constant space, but this speedup doesn’t really seem worth it to me in this problem.

Leetcode | Word Ladder II

In preparation for fulltime interviews in the coming months I’ve been doing a lot of leetcode questions. I have many feelings on whether or not this is the best way to determine someone’s coding abilities, but that’s for a different post. I wanted to share my struggles with one particular question: word ladder ii. I’m not going to paste the prompt here, so if you’re reading this maybe read the prompt first.

Why did I find this one particularly challenging? I’m not sure, exactly. I know all of the strategies involved, I’ve done similar questions before, I know all of the data structures… I think just the size of the problem perhaps got to me. Anyway, I’m going to go over a few failed attempts of mine (they both worked, just timed out on some test cases) followed by a working solution. I’ll conclude with some final thoughts.

Solution 1

My first mistake was I just started coding. This usually ends poorly for me with these types of questions. I tried to come up with a strategy but since I was timing myself, I figured the solution would unfold as I went. This is what my first code looked like:

The basic strategy here was to do a DFS which generated possibilities on the fly (potential next words) and stopped once we found the endWord. Once I did, I saved the min_so_far to save how deep in the graph we are. This way, we never go deeper than that and can cut off searches early. Unfortunately, this had many flaws. Some include:

  • DFS instead of BFS. DFS means that we could end up searching deeper than we need to. More on this later.
  • Not using good data structures. Everything is saved as an array here. For searching repeatedly or checking for existence dictionaries and sets would work better.
  • Overly recursive. Sure, the recursive solution may seem sexy but it’s rarely the fastest and in this case was the final nail in the coffin of this solution.

To be clear, this solution worked it was just very slow.

Solution 2

This solution DID perform much better, but unfortunately it was just adding enhancements to my already poor code. Here is what it looked like:

OhhHHhh! Look how shiny and modular everything is! He even added a Node class. Unfortunately, at this point I didn’t realize that the problem was DFS and so even though this performed way better, it still wasn’t good enough to pass the tests on leetcode. TBH, in an interview this would probably fly because the code is sound. But, again, missing the benefits of BFS here is a silly mistake that wouldn’t have been good in an interview.

Solution 3

At this point I was depressed, sad, and stuck. So, I looked at the discussion section and found a great python solution that I adapted to get things working finally. Here is the code:

The author had a couple of good strategies which I will consider for future problems. Namely:

  • Instead of calculating hamming distance with every word in the wordList, just iterate through every letter. This caps the number of combinations you try at each level and works well when your words are small and your wordLists are huge.
  • BFS first to generate the graph and only then DFS to search the graph. BFS for graph generation means that we will ALWAYS find the shortest distance and will never search deeper than that. There might be other solutions ending at that layer, but we will never search below that.
  • set()‘s and {}‘s save the day here. Lookup and existence checking are much faster than arrays for these data structures.

Now running this code on even the largest dataset in the test scenarios yields:

I’d call that a victory!

Complexity Analysis

#TODO

Customize Field Errors with Rails 5 & Bootstrap 4

Building a new app for one of my clients I had the need to customize Rails’ internal form error fields. By default Rails will add the field_with_errors class to an invalid input and call it a day. Unfortunately, if you’re using Bootstrap 4 (I’m using Beta) then you’re going to need to do a lot of customization in order for this to look right.

But, I hate trying to fight Bootstrap, so I set about trying to make Rails and Bootstrap play nice. I adapted this solution from rubyplus.org to work with the new Bootstrap 4 layouts. The gist is available on my github: Customize Field Errors with Rails 5 and Bootstrap.  It’s pretty simple, but here’s what you need to do.

Like I said, pretty simple but it makes things look a lot more snazzy.

What do you think? Comment here or on the gist to let me know what you think. Enjoy!

Solution: HackerRank Sherlock and Moving Tiles in Ruby

This one was pretty fun! To be honest, I found it a little tricky at first but once I got the hang of it the math was pretty easy. The key is understanding that the velocity of each square is how much they each move on the line y = x every interval. NOT how much to increment each (x,y) coordinate every interval as I naively thought.

So, first of all, let’s check out the question prompt and get a solid grasp of the question. Then let’s do some math to solve for t, which is the amount of time needed for square q to be the proper area.

     \begin{document}t = time \\ v_1 = velocity \ of \ \square_1 \\ v_2 = velocity \ of \ \square_2 \\ l_q = length \ of \ side \ \square_q \\ x = side \ of \ the \ right \ \triangle \ with \ bottom-left \ corner \ of \ \square_1 \ and \ \square_2 \ as \ vertices \\ \\ \begin{equation*} Diagonal \ between \ \square_1 \ and \ \square_2 \\ &= t \cdot v_2 - t \cdot v_1 \\ &= t(v_2 - v_1) \vspace{5mm} \\ \end{equation*} Pythagorean \ Theorem \ gives \ us: \\ \begin{equation*} [t(v_2 - v_1)]^2 = 2 \cdot x^2 \\ \end{equation*} \begin{equation*} x &= \frac {t \cdot (v_2 - v_1)} {\sqrt{2}} \end{equation} \vspace{5mm} To find l_q: \\ \begin{equation*} l_q = l - x = l - \frac {t \cdot (v_2 - v_1)} {\sqrt{2}} \\ l_q - l = - \frac {t \cdot (v_2 - v_1)} {\sqrt{2}} \\ l - l_q = \frac {t \cdot (v_2 - v_1)} {\sqrt{2}} \\ \sqrt{2} \cdot (l - l_q) &= t \cdot (v_2 - v_1) \\ t &= \frac {\sqrt{2} \cdot (l - l_q)} {(v_2 - v_1)} \end{equation} \end{document}

Apologies for the LaTeX with the poor formatting, I’m not entirely used to it yet. Perhaps with a few more of these problems I will be!

Anyway, when we solve for time we’re basically looking for when one side of square q will be the square root of the sought-after area. In order to implement this in Ruby I built a simple PORO that just implements the equation from above in Ruby.

Remember, though, that the velocity of square 1 could be larger than the velocity of square 2. The equation above was built with the velocity of square 2 being greater than the velocity of square 1. There are a couple ways to account for this, but I simply just took the absolute value of the result to achieve the correct answer.

This worked for every test case. Were you able to solve this problem? What strategies did you use to do so?