Thursday, August 30, 2012

A-star Search

Hello Folks,
Today I won't write code but instead of I'll speak about a famous algorithm to find optimal paths.

A* search

It evaluates nodes by combining g(n) and h(n), where g(n) is the cost to reach the current node and h(n) is the heuristic cost from the note to the goal.

So, by combining both we have f(n) that is the estimated cost of the cheapest solution.
This algorithm is both complete and optimal whether satisfies certain heuristic conditions.
In fact, the algorithm is identical to Uniform-Cost-Search, also known by the name Dijkstra, except that uses g(n) + h(n) instead of g(n).


Conditions for optimality

The main condition we require for optimality is that h(n) has to be an admissible heuristic. An admissible heuristic is one that never overestimates the cost to reach the goal.

In other words, g(n) is the minimal accumulated cost from the initial state to the current node and h(n) is heuristic cost from the current node to the goal. Thus, f(n) that is g(n) + h(n) is the cost from the initial state to the goal.
So, admissibility here means that h(n) has to be calculated heuristicly and this means that won't be the real cost.

So, if h(n) is bigger that h*(n), which is the real cost or the perfect heuristic, then f(n) = g(n) + h(n) would be bigger than the shortest path and due to wouldn't be optimal.

An example of an admissible heuristic is the straight-line distance h_SLD. Whether we want to calculate the shortest path between two cities h_SLD will be an excellent heuristic to be used because between two points always the shortest path is a straight-line and don't exist a better solution. Thus, this heuristic will be admissible because would never calculate a further path than the optimal solution.


A*: Additional Properties

A* store in memory all nodes visited so his complexity is exponential both time and space.
Thus, A* is not practical for many large-scale problems. There are, algorithms, that overcome the space problem without sacrificing optimality or completeness, at a small cost in execution time.
For instance, IDA*.

A* is optimal in another sense: no other algorithm expands less nodes than A* with same heuristic function but this doesn't mean that A* is always the fastest.
Another property is that better heuristic functions A* expands less nodes.

0 < h_1 < h_2 < h* 

With h_2 will expand less nodes than with h_1.


Example of heuristics

I won't explain any of it but just with the name you will be able to google it.

  • Manhattan distance in N-puzzle
  • Euclidean Distance in Routing Finding
  • Spanning Tree in Travelling Salesman Problem
  • Shortest Path in Job Shop Scheduling


Algorithm


Illustration of A* search for finding path from a start node to a goal node.

Extracted from wikipedia.










Example:


In order to solve the shortest path in the graph using A* I will use two list, one for the current nodes ( search frontier ) and another for the visited nodes ( in order to check loops ).

Opened

1                                  2                                3                               and so on
f(a) = 1.5 + 4 = 5.5       f(b) = 3.5 + 2 = 5.5      f(c) = 6.5 + 2 = 8.5
f(d) = 2 + 4.5 = 6.5       f(d) = 2 + 4.5 = 6.5      f(d) = 2 + 4.5 = 6.5


Closed
1                                  2                                3
                                    f(a)                            f(a)
                                                                     f(b)