Topological Sort

Leetcode had another good question this morning which involved using topological sort, which I hadn’t heard of until today. According to the wiki: “a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering”. In other words, given a bunch of vertices and directed edges, a topological sort finds an ordering where parent nodes always occur before child nodes in the graph. There may be more than one correct answer to this type of problem.

For example, the graph:

could have multiple answers. According to wikipedia:

  • 5, 7, 3, 11, 8, 2, 9, 10 (visual left-to-right, top-to-bottom)
  • 3, 5, 7, 8, 11, 2, 9, 10 (smallest-numbered available vertex first)
  • 5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
  • 7, 5, 11, 3, 10, 8, 9, 2 (largest-numbered available vertex first)
  • 5, 7, 11, 2, 3, 8, 9, 10 (attempting top-to-bottom, left-to-right)
  • 3, 7, 8, 5, 11, 10, 2, 9 (arbitrary)

are all valid topological sorts.

Now that we know a bit about topological sorting, let’s move on to the question.

The Question

The question is asking about prerequisites for courses. It asks if there is a valid ordering to take the classes based upon the list of prereqs. In other words, is there a valid way to topologically sort the given nodes based on the edges of prerequisites.

A Solution

I adapted the topological sort from Wikipedia in my solution, I also built a simple Node class which has a status to help with the sort. This is an accepted solution:

The main thing that’s different here is that my visit function will return False if it ever encounters a node marked T in any of its recursive calls. In other words, if there is a cycle the entire function will return False and my canFinish function will also return False. If we can successfully traverse our nodes, that means that there is a valid topological sort and we can return True. Note how easily this could be changed to instead return a valid topological sort.

Complexity

I build the list of nodes, which will happen in O(n) time where n is the number of courses. I also have to add all of the edges, which will happen in O(e) time where e is the number of prerequisite edges. I then perform the topological sort which is linear with regard to n. I can’t think of a valid graph where e > n, but an invalid graph could contain more prerequisite edges than the number of courses. Therefore, I suggest that the time complexity is O(max(n, e)).

For space, I store n nodes and e edges. The edges are not necessarily dependent on the number of nodes. I suggest the space complexity is O(n+e).

Leetcode | Compare Version Numbers

I decided to do an easy one today, but it still had a few good tricks! Here is the prompt:

Ok, seems straightforward enough. Let’s think this through. How are we comparing version numbers? Well, the left-most number has the most priority, one to the right the second most priority, etc… we should just be able to compare these version numbers in order from left to right. Right?

Well, one caveat we need to consider is what happens if one version number is longer than another. We could check the lengths and return the longer one… but there is an edge case there that doesn’t work out (consider 1.0 and 1). So what else can we do?

Simply, just make them the same length! Zero-pad the shorter entry on the right so that they’re the same length. Then we can simply compare. Take a look:

I could clean up my padd function a bit, but it gets the job done. And that’s it!

Complexity Analysis

Let’s see, we may have to pad one of the numbers with zeros, which will take O(d) time where d is the difference in length between the two strings. Then we have to go through the length of the split version numbers once. This is O(n). This term will dominate the zero padding so I think it’s safe to ignore it. Therefore the time complexity is O(n) where n is the size of the longest version number after splitting.

For space, we save two split version numbers which is by definition O(n) as well. Therefore the space complexity is O(n) as well.

I think if we didn’t split into arrays and instead just did the comparisons on the original strings we could get this to constant space, but this speedup doesn’t really seem worth it to me in this problem.

Leetcode | Word Ladder II

In preparation for fulltime interviews in the coming months I’ve been doing a lot of leetcode questions. I have many feelings on whether or not this is the best way to determine someone’s coding abilities, but that’s for a different post. I wanted to share my struggles with one particular question: word ladder ii. I’m not going to paste the prompt here, so if you’re reading this maybe read the prompt first.

Why did I find this one particularly challenging? I’m not sure, exactly. I know all of the strategies involved, I’ve done similar questions before, I know all of the data structures… I think just the size of the problem perhaps got to me. Anyway, I’m going to go over a few failed attempts of mine (they both worked, just timed out on some test cases) followed by a working solution. I’ll conclude with some final thoughts.

Solution 1

My first mistake was I just started coding. This usually ends poorly for me with these types of questions. I tried to come up with a strategy but since I was timing myself, I figured the solution would unfold as I went. This is what my first code looked like:

The basic strategy here was to do a DFS which generated possibilities on the fly (potential next words) and stopped once we found the endWord. Once I did, I saved the min_so_far to save how deep in the graph we are. This way, we never go deeper than that and can cut off searches early. Unfortunately, this had many flaws. Some include:

  • DFS instead of BFS. DFS means that we could end up searching deeper than we need to. More on this later.
  • Not using good data structures. Everything is saved as an array here. For searching repeatedly or checking for existence dictionaries and sets would work better.
  • Overly recursive. Sure, the recursive solution may seem sexy but it’s rarely the fastest and in this case was the final nail in the coffin of this solution.

To be clear, this solution worked it was just very slow.

Solution 2

This solution DID perform much better, but unfortunately it was just adding enhancements to my already poor code. Here is what it looked like:

OhhHHhh! Look how shiny and modular everything is! He even added a Node class. Unfortunately, at this point I didn’t realize that the problem was DFS and so even though this performed way better, it still wasn’t good enough to pass the tests on leetcode. TBH, in an interview this would probably fly because the code is sound. But, again, missing the benefits of BFS here is a silly mistake that wouldn’t have been good in an interview.

Solution 3

At this point I was depressed, sad, and stuck. So, I looked at the discussion section and found a great python solution that I adapted to get things working finally. Here is the code:

The author had a couple of good strategies which I will consider for future problems. Namely:

  • Instead of calculating hamming distance with every word in the wordList, just iterate through every letter. This caps the number of combinations you try at each level and works well when your words are small and your wordLists are huge.
  • BFS first to generate the graph and only then DFS to search the graph. BFS for graph generation means that we will ALWAYS find the shortest distance and will never search deeper than that. There might be other solutions ending at that layer, but we will never search below that.
  • set()‘s and {}‘s save the day here. Lookup and existence checking are much faster than arrays for these data structures.

Now running this code on even the largest dataset in the test scenarios yields:

I’d call that a victory!

Complexity Analysis

#TODO