Topological Sort
Leetcode had another good question this morning which involved using topological sort, which I hadn’t heard of until today. According to the wiki: “a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering”. In other words, given a bunch of vertices and directed edges, a topological sort finds an ordering where parent nodes always occur before child nodes in the graph. There may be more than one correct answer to this type of problem.
For example, the graph:
could have multiple answers. According to wikipedia:
- 5, 7, 3, 11, 8, 2, 9, 10 (visual left-to-right, top-to-bottom)
- 3, 5, 7, 8, 11, 2, 9, 10 (smallest-numbered available vertex first)
- 5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
- 7, 5, 11, 3, 10, 8, 9, 2 (largest-numbered available vertex first)
- 5, 7, 11, 2, 3, 8, 9, 10 (attempting top-to-bottom, left-to-right)
- 3, 7, 8, 5, 11, 10, 2, 9 (arbitrary)
are all valid topological sorts.
Now that we know a bit about topological sorting, let’s move on to the question.
The Question
There are a total of n courses you have to take, labeled from 0 to n-1. Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: [0,1] Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all courses? Example 1: Input: 2, [[1,0]] Output: true Explanation: There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible. Example 2: Input: 2, [[1,0],[0,1]] Output: false Explanation: There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.
The question is asking about prerequisites for courses. It asks if there is a valid ordering to take the classes based upon the list of prereqs. In other words, is there a valid way to topologically sort the given nodes based on the edges of prerequisites.
A Solution
I adapted the topological sort from Wikipedia in my solution, I also built a simple Node
class which has a status
to help with the sort. This is an accepted solution:
class Node: def __init__(self, val): self.val = val self.edges = [] self.status = 'U' # U = untouched, T = temporary, P = permanent class Solution: def __init__(self): self.nodes = {} self.sorted_nodes = [] def visit(self, node): if node.status == 'P': return True if node.status == 'T': return False node.status = 'T' for e in node.edges: if not self.visit(self.nodes[e]): return False node.status = 'P' self.sorted_nodes.append(node) return True def canFinish(self, numCourses, prerequisites): # First build all the nodes self.nodes = {k: Node(k) for k in range(numCourses)} # Then build all of the edges for u, v in prerequisites: self.nodes[u].edges.append(v) # Then perform topological sort for key in self.nodes.keys(): if not self.visit(self.nodes[key]): return False return True
The main thing that’s different here is that my visit
function will return False
if it ever encounters a node marked T
in any of its recursive calls. In other words, if there is a cycle the entire function will return False
and my canFinish
function will also return False
. If we can successfully traverse our nodes, that means that there is a valid topological sort and we can return True
. Note how easily this could be changed to instead return a valid topological sort.
Complexity
I build the list of nodes, which will happen in O(n)
time where n
is the number of courses. I also have to add all of the edges, which will happen in O(e)
time where e
is the number of prerequisite edges. I then perform the topological sort which is linear with regard to n
. I can’t think of a valid graph where e > n
, but an invalid graph could contain more prerequisite edges than the number of courses. Therefore, I suggest that the time complexity is O(max(n, e))
.
For space, I store n
nodes and e
edges. The edges are not necessarily dependent on the number of nodes. I suggest the space complexity is O(n+e)
.