Phase 3, Module 4: AI Search Algorithms
These algorithms are fundamental to classical AI and problem-solving. They are methods for finding a path from a starting point to a goal. While they differ from the statistical learning we've been discussing, they are a key part of the syllabus.
The Core Idea: Imagine you are in a maze. Search algorithms are different strategies for exploring the maze to find the exit.
Key Concepts:
State Space: The set of all possible positions in the problem (every possible junction in the maze).
Node: A specific state (your current location in the maze).
Edge: A connection between two nodes, representing a possible action (walking down a corridor to the next junction). Edges can have a cost or weight (the length of the corridor).
Frontier (or Open List): A list of nodes you have discovered but haven't visited yet.
Explored Set (or Closed List): A list of nodes you have already visited.
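These concepts translate directly into code. As a minimal Python sketch (the junction names here are hypothetical, not from any particular maze), a state space can be stored as an adjacency dictionary: each node maps to its neighbors and the cost (weight) of the connecting edge.

```python
# Hypothetical maze: nodes are junctions, values are {neighbor: corridor length}.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "D": 5},
    "C": {"A": 4, "E": 2},
    "D": {"B": 5},
    "E": {"C": 2},
}

print(sorted(graph["A"]))   # ['B', 'C'] -- the neighbors reachable from A
print(graph["B"]["D"])      # 5 -- cost of the edge (corridor) from B to D
```

All the algorithms below only ever need two operations on this structure: list a node's neighbors, and read off an edge cost.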
1. Uninformed Search: "The Blind Searchers"
These algorithms have no extra information about the maze. They don't know if one path is "more promising" than another. They just systematically explore.
Breadth-First Search (BFS)
Strategy: Explore the maze level by level.
Analogy: Ripples in a Pond
You drop a stone (your starting point) in a pond. BFS is like the first ripple, exploring all nodes at distance 1. Then the second ripple explores all nodes at distance 2, and so on. It checks all its neighbors before moving on to its neighbors' neighbors.
How it Works: It uses a Queue (First-In, First-Out) data structure for its frontier.
Add the start node to the queue.
Take the front node out of the queue.
Check if it's the goal. If yes, you're done!
If not, find all its unvisited neighbors and add them to the back of the queue.
Repeat until the queue is empty or the goal is found.
Guarantees:
Optimal? Yes, but only if all edge costs are the same (e.g., every corridor is the same length). It will find the path with the fewest steps.
Complete? Yes. If a solution exists, BFS will find it.
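The steps above can be sketched in Python. This is a minimal illustration (the `graph` and `bfs` names are ours), using `collections.deque` as the First-In, First-Out queue:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: returns the path with the fewest edges,
    or None if the goal is unreachable."""
    frontier = deque([start])          # FIFO queue: the frontier
    parents = {start: None}            # doubles as the explored set
    while frontier:
        node = frontier.popleft()      # take the front node out of the queue
        if node == goal:               # goal test
            path = []
            while node is not None:    # walk parent links back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:    # unvisited neighbors only
                parents[neighbor] = node
                frontier.append(neighbor)  # add to the back of the queue
    return None
```

On an unweighted graph this returns the path with the fewest edges, matching the guarantee above.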
Uniform Cost Search (UCS) / Dijkstra's Algorithm
Strategy: Explore the path with the lowest total cost so far.
Analogy: The Cautious Explorer with a GPS
You are at a junction. You have three paths: one is 1km long, one is 5km long, and one is 10km long. You don't just explore them in order; you always choose to explore the end of the cheapest path first, regardless of how many "steps" it takes.
How it Works: It uses a Priority Queue for its frontier. The priority is the cumulative cost from the start node.
Add the start node to the priority queue with a cost of 0.
Take the node with the lowest cost out of the priority queue.
Check if it's the goal. If yes, you're done! You've found the cheapest path.
If not, find all its neighbors. For each neighbor, calculate the new path cost (current node's cost + edge cost). If this path is cheaper than any previous path to that neighbor, add/update it in the priority queue.
Repeat.
Guarantees:
Optimal? Yes. It is guaranteed to find the path with the lowest total cost.
Complete? Yes.
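UCS follows the same skeleton as BFS, but the frontier becomes a priority queue keyed on the cumulative cost g. A minimal sketch (function and variable names are ours), using Python's `heapq`:

```python
import heapq

def ucs(graph, start, goal):
    """Uniform Cost Search: always expand the frontier node with the lowest
    cumulative cost g. graph[u] maps neighbor -> edge cost.
    Returns (cost, path), or None if the goal is unreachable."""
    frontier = [(0, start, [start])]             # priority queue keyed on g
    best_g = {start: 0}                          # cheapest known cost per node
    while frontier:
        g, node, path = heapq.heappop(frontier)  # lowest-cost node first
        if node == goal:
            return g, path                       # cheapest route to the goal
        if g > best_g.get(node, float("inf")):   # stale entry, already beaten
            continue
        for neighbor, edge_cost in graph[node].items():
            new_g = g + edge_cost
            if new_g < best_g.get(neighbor, float("inf")):  # cheaper path found
                best_g[neighbor] = new_g
                heapq.heappush(frontier, (new_g, neighbor, path + [neighbor]))
    return None
```

The `best_g` check implements the "add/update it in the priority queue" step: instead of editing queue entries in place, we push a new, cheaper entry and skip any stale ones when they surface.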
BFS vs. UCS: The Key Difference
BFS finds the shortest path in terms of number of edges.
UCS finds the shortest path in terms of total path cost. If all edge costs are 1, UCS behaves exactly like BFS.
2. Informed Search: "The Smart Searcher"
This algorithm has extra information—a "hint" about how close it might be to the goal.
A* Search
Strategy: Combine the cautiousness of UCS with an optimistic guess.
Analogy: The Smart Tourist in a City
You want to get to the Eiffel Tower (Goal). You are at a crossroads (Node).
Like the Cautious Explorer (UCS), you know the exact distance you have already walked from your hotel (Start). Let's call this g(n).
You also have a map. You can draw a straight line from your current location to the Eiffel Tower. This is your "as the crow flies" distance. It's an optimistic guess, because you can't walk through buildings. This guess is called a heuristic, h(n).
To decide which path to take next, you combine these two values: f(n) = g(n) + h(n). You always choose to explore the node with the lowest f(n) score. This balances exploring paths that are already cheap (g(n)) with paths that seem to be getting you closer to the goal (h(n)).
How it Works: It's exactly the same as UCS (using a Priority Queue), but the priority is f(n) instead of just g(n).
The Heuristic h(n):
For A* to be optimal, the heuristic must be admissible, meaning it never overestimates the true cost. "As the crow flies" distance is a perfect example of an admissible heuristic.
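As an illustration (with made-up coordinates), straight-line distance is easy to compute when each node has a known position: `math.dist` gives the Euclidean distance, which can never exceed any real walking distance between the same two points.

```python
import math

# Hypothetical (x, y) coordinates for a node and the goal.
coords = {"crossroads": (0.0, 0.0), "eiffel_tower": (3.0, 4.0)}

def h(node, goal="eiffel_tower"):
    """Admissible heuristic: straight-line (Euclidean) distance to the goal."""
    return math.dist(coords[node], coords[goal])

print(h("crossroads"))  # 5.0 -- optimistic: real streets are at least this long
```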
Guarantees:
Optimal? Yes, if the heuristic is admissible.
Complete? Yes.
Efficiency: With a good heuristic it is far more efficient than UCS, because the heuristic guides the search towards the goal, so it explores a much smaller portion of the maze. (With h(n) = 0 everywhere, A* degenerates into plain UCS.)
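Since A* is UCS with the priority changed from g(n) to f(n) = g(n) + h(n), the implementation needs only a small edit. A sketch under the same assumptions as before (names are ours; `h` is a dictionary of heuristic estimates):

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: identical to UCS except the priority is f(n) = g(n) + h(n).
    graph[u] maps neighbor -> edge cost; h maps node -> heuristic estimate.
    Returns (cost, path), or None if the goal is unreachable."""
    frontier = [(h[start], 0, start, [start])]      # (f, g, node, path)
    best_g = {start: 0}                             # cheapest g found so far
    while frontier:
        f, g, node, path = heapq.heappop(frontier)  # lowest f first
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):      # stale queue entry
            continue
        for neighbor, edge_cost in graph[node].items():
            new_g = g + edge_cost
            if new_g < best_g.get(neighbor, float("inf")):  # cheaper path found
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h[neighbor], new_g, neighbor, path + [neighbor]),
                )
    return None
```

The only differences from the UCS sketch are the extra `h` argument and the `new_g + h[neighbor]` priority.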
A Common Question: Why Is g(n) Not a Constant?
g(n) is a known value at any given moment, but it is NOT a fixed constant. It is a variable, because the best path found so far to a node n can change as the algorithm explores.
Let's use a real map with simple math to demonstrate this.
The Scenario: Finding the Cheapest Path from 'Start' to 'Goal'
Imagine this simple map. The numbers on the edges are the travel costs (e.g., fuel, time).
B --5-- D
/| |\
1 | | 1
/ | | \
Start-1-A --9-- Goal
\ | | /
4 | | 1
\| |/
C --2-- E
The heuristic h(n) (our "as the crow flies" guess to the Goal) is given:
h(Start) = 10
h(A) = 9
h(B) = 7
h(C) = 8
h(D) = 1
h(E) = 1
h(Goal) = 0
We will use a Priority Queue (our Frontier) to keep track of nodes to visit. We will store them as (f(n), node_name), and always pull the one with the lowest f value. We also need a way to store the current best known cost g(n) for each node.
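The map and heuristic above can be written down directly as Python dictionaries. This sketch also shows the priority-queue behaviour (`heapq` keeps the lowest f at the front) that the step-by-step trace relies on:

```python
import heapq

# The example map: node -> {neighbor: edge cost}, read off the diagram above.
graph = {
    "Start": {"A": 1, "B": 1, "C": 4},
    "A": {"Start": 1, "Goal": 9},
    "B": {"Start": 1, "D": 5},
    "C": {"Start": 4, "E": 2},
    "D": {"B": 5, "Goal": 1},
    "E": {"C": 2, "Goal": 1},
    "Goal": {"A": 9, "D": 1, "E": 1},
}

# The given heuristic estimates h(n).
h = {"Start": 10, "A": 9, "B": 7, "C": 8, "D": 1, "E": 1, "Goal": 0}

# The Frontier is a priority queue of (f, node) pairs.
frontier = []
heapq.heappush(frontier, (0 + h["Start"], "Start"))
print(heapq.heappop(frontier))  # (10, 'Start') -- lowest f comes out first
```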
Step-by-Step A* Search
Step 1:
Start at Start.
g(Start) = 0.
f(Start) = g(Start) + h(Start) = 0 + 10 = 10.
Frontier: [(10, Start)]
g-costs known: {Start: 0}
Step 2:
Pop Start from the Frontier.
Explore its neighbors: A, B, C.
Path to A: g(A) = g(Start) + cost(Start, A) = 0 + 1 = 1.
f(A) = g(A) + h(A) = 1 + 9 = 10.
Path to B: g(B) = g(Start) + cost(Start, B) = 0 + 1 = 1.
f(B) = g(B) + h(B) = 1 + 7 = 8.
Path to C: g(C) = g(Start) + cost(Start, C) = 0 + 4 = 4.
f(C) = g(C) + h(C) = 4 + 8 = 12.
Frontier: [(8, B), (10, A), (12, C)] (B is first because it has the lowest f-score)
g-costs known: {Start: 0, A: 1, B: 1, C: 4}
Step 3:
Pop B from the Frontier.
Explore its neighbors. On the map as drawn, B connects back to Start and on to D; to illustrate how g(n) can change, we also assume an extra edge B --5-- A.
Path to D: g(D) = g(B) + cost(B, D) = 1 + 5 = 6.
f(D) = g(D) + h(D) = 6 + 1 = 7.
Path to A: here is the crucial part! We have found a second path to A. Let's calculate its cost.
New g(A) via B = g(B) + cost(B, A) = 1 + 5 = 6.
Our old g(A) was 1.
Since 6 > 1, this new path is worse. We ignore it and do not update g(A).
Frontier: [(7, D), (10, A), (12, C)] (D is now first)
g-costs known: {Start: 0, A: 1, B: 1, C: 4, D: 6}
Step 4:
Pop D from the Frontier.
Explore its neighbors: B, Goal.
Path to B: the path back to B via D would cost g(D) + cost(D, B) = 6 + 5 = 11, far worse than its known cost of 1, so we ignore it.
Path to Goal: g(Goal) = g(D) + cost(D, Goal) = 6 + 1 = 7.
f(Goal) = g(Goal) + h(Goal) = 7 + 0 = 7.
Frontier: [(7, Goal), (10, A), (12, C)]
Step 5:
Pop Goal from the Frontier.
We have reached the goal! The algorithm terminates.
The optimal path found is Start -> B -> D -> Goal with a total cost (g(Goal)) of 7.
Why g(n) is a Variable Function
Let's imagine a slightly different scenario. What if the edge Start --1-- A was instead Start --8-- A?
In Step 2, the initial g(A) would have been 8.
f(A) = 8 + 9 = 17.
Then, in Step 3, when we explore from B, we would calculate the path to A via B:
New g(A) via B = g(B) + cost(B, A) = 1 + 5 = 6.
Our old g(A) was 8.
Since 6 < 8, we have found a cheaper path to A!
We would update our known best cost: g(A) is now 6.
We would update A in our Frontier with its new, better f-score: f(A) = 6 + 9 = 15.
Conclusion:
g(n) is a function, not a constant, because it represents the cost of the best path found so far from the start to node n. As the A* algorithm explores the graph, it can discover new, shorter paths to nodes it has already seen. When this happens, it updates g(n) to this new, lower value. This ability to update path costs is what guarantees that A* will find the true optimal path.
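The alternative scenario above can be checked by running the relaxation loop on the modified map (Start --8-- A, plus the extra B --5-- A edge assumed earlier) and recording every value g(A) takes. This is an illustrative sketch, not library code:

```python
import heapq

# Modified map: Start--8--A, plus the assumed extra edge B--5--A.
graph = {
    "Start": {"A": 8, "B": 1, "C": 4},
    "A": {"Start": 8, "B": 5, "Goal": 9},
    "B": {"Start": 1, "A": 5, "D": 5},
    "C": {"Start": 4, "E": 2},
    "D": {"B": 5, "Goal": 1},
    "E": {"C": 2, "Goal": 1},
    "Goal": {"A": 9, "D": 1, "E": 1},
}
h = {"Start": 10, "A": 9, "B": 7, "C": 8, "D": 1, "E": 1, "Goal": 0}

g_history = {"A": []}                    # record every value g(A) takes
best_g = {"Start": 0}
frontier = [(h["Start"], 0, "Start")]    # (f, g, node)
while frontier:
    f, g, node = heapq.heappop(frontier)
    if node == "Goal":
        break
    if g > best_g.get(node, float("inf")):        # stale queue entry
        continue
    for nb, cost in graph[node].items():
        new_g = g + cost
        if new_g < best_g.get(nb, float("inf")):  # cheaper path: update g(nb)
            best_g[nb] = new_g
            if nb in g_history:
                g_history[nb].append(new_g)
            heapq.heappush(frontier, (new_g + h[nb], new_g, nb))

print(g_history["A"])  # [8, 6] -- g(A) starts at 8 via Start, improves to 6 via B
print(best_g["Goal"])  # 7 -- the optimal cost is unchanged
```

The history [8, 6] is exactly the update described above: g(n) is the best cost found so far, not a constant.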
Video Transcript
AI search algorithms are fundamental methods for finding paths from a starting point to a goal. Think of them as different strategies for exploring a maze to find the exit. The key concepts include state space, which represents all possible positions in the problem, nodes as specific states or locations, edges as connections between nodes, the frontier containing discovered but unvisited nodes, and the explored set of already visited nodes.
Breadth-First Search is an uninformed search algorithm that explores the maze level by level, like ripples expanding in a pond. It uses a First-In-First-Out queue data structure to maintain the frontier of nodes to explore. BFS systematically visits all nodes at distance one from the start, then all nodes at distance two, and so on. This guarantees finding the optimal path with the fewest steps, making it complete and optimal for unweighted graphs.
Uniform Cost Search extends breadth-first search by considering edge costs. Instead of exploring level by level, UCS always selects the frontier node with the lowest cumulative path cost. It uses a priority queue where nodes are ordered by their total cost from the start. This cautious approach guarantees finding the path with minimum total cost, making it both complete and optimal for weighted graphs. UCS is essentially Dijkstra's algorithm applied to search problems.
A-star search is an informed search algorithm that combines the cost-awareness of Uniform Cost Search with heuristic guidance toward the goal. The key formula is f of n equals g of n plus h of n, where g of n is the actual cost from start, h of n is the heuristic estimate to the goal, and f of n is the total estimated cost. For A-star to be optimal, the heuristic must be admissible, meaning it never overestimates the true cost. A classic example is straight-line distance, which is always less than or equal to the actual path distance.
Let's trace through A-star step by step using our example graph. We start at the Start node with f equals zero plus ten equals ten. We explore its neighbors: A with f equals one plus nine equals ten, B with f equals one plus seven equals eight, and C with f equals four plus eight equals twelve. B has the lowest f-value, so we explore it next. From B, we can reach D with cost six, giving f equals six plus one equals seven. D now has the lowest f-value in our priority queue. From D, we reach the Goal with total cost seven. Notice how g of n values can update when we discover better paths, which is crucial for A-star's optimality guarantee.