In the previous post, we discussed some uninformed search algorithms. The reason they performed so inefficiently is, well unsurprisingly, because they were uninformed. They used no knowledge about features of the search space. It’s frustrating and sometimes hilarious to see your agent running around the entire search space, always missing a goal node nearby. For most problems, we always have some indication of which nodes are better than others; which nodes are closer to a solution and which are not, and it would be nice if we could inject this knowledge in our agent so that it can direct its search towards more promising paths and find a solution more efficiently while exploring as little of the search space as possible.
The key idea behind informed search algorithms is that of heuristics. So far, our search algorithms only required a goal test. Given a node, they could only find out whether it is a goal or not. A heuristic provides more information; given a node, it provides an estimate of its distance from a goal. Unlike a search algorithm which works for any search problem, a heuristic is designed for a particular search problem. For the routing problem (finding a path from one city to another), we could use Euclidean distance as heuristic. For the Pacman problem discussed in the previous post, Manhattan distance could be used as a heuristic. We’ll return to this idea of heuristics to inject domain specific knowledge in search algorithms and how to design them for a problem at hand but let’s first see some algorithms which can use these heuristics.
Greedy Search chooses the node, which appears to be closest to a goal, for expansion. It is very similar to UCS; the only difference being that it orders nodes by their heuristic estimate (estimated cost to a goal), rather than their path costs from the start state.
Usually, greedy search will move straight towards a non-optimal goal. In the worst case, however, it would perform like a badly behaved DFS, exploring everything except the solution nodes first.
Properties of Greedy Search:
- Completeness: Only if we prevent cycles, same as DFS
- Optimality: No
- Time Complexity: O(bm), where b is the branching factor and m is the maximum depth.
- Space complexity: O(bm) for the tree search version and O(bm) for the graph search version
A* combines the ideas behind both greedy search and UCS. It uses the sum of path cost from start state and the heuristic estimate to a goal state, to order nodes for expansion, i.e., A* orders nodes by g(n)+f(n), where g(n) is the cumulative path cost (backward cost) from start state to state n and f(n) is the heuristic estimate (forward cost) from state n to a goal node.
A* has many interesting properties and deserves a post of its own. It just so turns out that I’d earlier written a detailed post on A*.
Properties of A* Search:
- Completeness: Yes
- Optimality: Yes, if we use an admissible heuristic
- Time Complexity: Highly dependent on the heuristic
- Space complexity: Highly dependent on the heuristic
Space and time complexities of A* are highly dependent on the heuristic used. If we use a null heuristic (which basically returns 0 for every node), A* reduces to UCS. If we use a better heuristic, we can reduce the explored search space. In general, A* performs much better than UCS if we use a good heuristic.
In the next post, we’ll discuss more about heuristics, their desirable properties and how we can build a heuristic for a search problem at hand.