Putting It Together: The Navigation Stack

Why one plan is never enough

Imagine asking a friend to walk across a busy office to fetch you a coffee. Before they take a step, they picture the whole route: out the door, down the hall, around the printer, into the kitchen. But they don't actually walk that picture stride for stride. As they go, a colleague steps out of a doorway, a chair is pushed back, a spill appears — and they adjust in real time without ever forgetting where they are ultimately headed. That is exactly the trick a mobile robot has to pull off, and it is the heart of this whole chapter.

Everything you have met so far — turning the world into a configuration space, separating free space from obstacle space, running collision checks, and choosing between Dijkstra, A*, a roadmap, or an RRT — was building toward a robot that can actually move through a messy, changing room. The catch is that no single planner does both jobs well. A far-sighted route over a big map is slow to compute; a split-second reaction to a moving chair has no time to think about the whole building.

So real robots cheat in the smartest possible way: they run two planners at once, at two different speeds, and glue them together with a constantly refreshed map of danger. That assembly is what engineers call the navigation stack, and learning to read it is the payoff for everything in this chapter.

The navigation stack: a layered pipeline

The navigation stack is the standard, layered pipeline that turns a goal like "go to the kitchen" into wheel commands. Think of it as a relay team. One runner holds the map and knows where the robot is. The next decides on a route. The next continually shapes that route into safe, achievable motion. The last hands raw velocity commands to the motors. Each runner is narrow and replaceable, which is why this layered design has become the default across the field.

More often than not, these layers live inside the Robot Operating System (ROS), the open-source plumbing that lets independent programs talk to each other. ROS gives each layer a standard way to pass messages — the map here, a route there, velocity commands out the end — so a team can swap in a better local planner without rewriting the rest. It is less a single program than a marketplace of cooperating parts.

One vocabulary note before we go deeper. Up to now the planners produced a path — a pure shape in space. The navigation stack ultimately needs a trajectory: that same shape with timing and speed attached, so the wheels know not just where to go but how fast. The control layer is where a path finally becomes motion.
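To make the path versus trajectory distinction concrete, here is a minimal sketch that attaches timestamps to a list of waypoints. The function name `path_to_trajectory` and the constant-speed assumption are illustrative choices, not something the stack prescribes; a real controller would also respect acceleration limits and slow down for turns.

```python
import math

def path_to_trajectory(path, speed):
    """Attach an arrival time to each (x, y) waypoint, assuming constant speed in m/s."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not path:
        return []
    trajectory = []
    t = 0.0
    prev = path[0]
    for point in path:
        t += math.dist(prev, point) / speed  # time to cover this straight segment
        trajectory.append((t, point[0], point[1]))
        prev = point
    return trajectory

# A pure shape: 3 m along x, then 4 m along y.
path = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
traj = path_to_trajectory(path, speed=0.5)
# traj == [(0.0, 0.0, 0.0), (6.0, 3.0, 0.0), (14.0, 3.0, 4.0)]
```

The path says only where to go; the trajectory adds that the robot should reach the corner at 6 seconds and the end at 14 seconds.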

Two planners, two speeds

The beating heart of the stack is the split between a global planner and a local planner. They divide the same problem along the axis of time and detail. The global planner is the strategist; the local planner is the tactician. Neither could do the other's job.

The global planner computes a full route from start to goal across the whole known map. It runs a graph search — usually A* or Dijkstra over the map's grid cells — and produces a long, far-sighted path. It is allowed to be slow, because it only needs to re-run when the goal changes or the route is badly blocked. Crucially, it only knows about obstacles already drawn on the map; the person who just walked in front of the robot is invisible to it.
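As a rough illustration of the global planner's job, the sketch below runs A* over a small occupancy grid. It assumes a 4-connected grid where 0 means free and 1 means obstacle, unit step costs, and a Manhattan-distance heuristic; the function name `a_star` and the grid layout are made up for this example.

```python
import heapq

def a_star(grid, start, goal):
    """4-connected A* on a grid of 0 (free) / 1 (obstacle). Returns a list of (row, col) or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance never overestimates on a 4-connected unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    g_cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parent links back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = new_g
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None  # goal unreachable on the known map

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],  # a wall with a single gap on the right
    [0, 0, 0, 0],
]
route = a_star(grid, (0, 0), (2, 0))
```

Note that the search sees only what is drawn in `grid`. Anything that appears after the map was built is invisible to it until the map is updated and the search is run again.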

The local planner is the opposite. Many times a second, it looks only at a small window around the robot, sees whatever the sensors just reported, and re-plans a short segment that hugs the global route while dodging anything new. Because it lives close to the wheels, it must respect how the robot actually moves: a car-like base cannot slide sideways, so its segment has to be kinematically feasible. This is the holonomic versus nonholonomic distinction biting in practice: a constraint the far-sighted global planner can mostly ignore but the local planner never can.
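One simple way to picture a local planner is as a rollout-and-score loop: simulate a handful of candidate velocity commands through a motion model, throw away the ones that come too close to an obstacle, and keep the one that ends nearest a target point on the global route. The sketch below assumes a unicycle model (forward speed `v`, turn rate `w`, no sideways motion); the names `rollout` and `pick_command`, the sampled velocities, and the clearance value are all hypothetical, and real local planners use richer scoring.

```python
import math

def rollout(x, y, theta, v, w, horizon=1.0, dt=0.1):
    """Integrate a unicycle model: the robot drives forward and turns, never slides sideways."""
    points = []
    for _ in range(int(round(horizon / dt))):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        points.append((x, y))
    return points

def pick_command(pose, target, obstacles, clearance=0.25):
    """Return the (v, w) whose arc ends closest to target while staying clear of obstacles."""
    best, best_dist = (0.0, 0.0), float("inf")  # fall back to stopping if every arc collides
    for v in (0.2, 0.4):                         # illustrative forward speeds, m/s
        for w in (-1.0, -0.5, 0.0, 0.5, 1.0):    # illustrative turn rates, rad/s
            arc = rollout(*pose, v, w)
            if any(math.dist(p, o) < clearance for p in arc for o in obstacles):
                continue  # this arc passes too close to something the sensors saw
            d = math.dist(arc[-1], target)
            if d < best_dist:
                best, best_dist = (v, w), d
    return best

pose = (0.0, 0.0, 0.0)   # x, y, heading
target = (1.0, 0.0)      # a point a little way along the global route
open_floor = pick_command(pose, target, obstacles=[])
blocked = pick_command(pose, target, obstacles=[(0.6, 0.0)])
```

On open floor the fastest straight arc wins. With an obstacle sitting on the route, that arc is rejected and the planner has to settle for a different command. Every candidate is feasible by construction, because each one is generated by the motion model rather than drawn freely in space.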

The costmap: a living picture of danger

What do both planners actually search over? Not the raw map, but a costmap — a grid where every cell carries a number saying how dangerous or expensive it is to occupy. Empty floor costs nothing; a wall is impassable (effectively infinite cost); the band of cells just around the wall costs a lot but not infinitely. A planner then looks for the cheapest route through this terrain of numbers, naturally preferring the open middle of a corridor over scraping along the edge.
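A tiny worked example may help show why a cost-minimizing planner drifts toward the middle of a corridor. The cell values below (0 for open floor, 50 near a wall, infinity for the wall itself) and the helper `route_cost` are invented for illustration only.

```python
LETHAL = float("inf")

# A corridor five cells long: walls top and bottom, a costly band beside each wall.
costmap = [
    [LETHAL] * 5,
    [50] * 5,
    [0] * 5,
    [50] * 5,
    [LETHAL] * 5,
]

def route_cost(costmap, cells, step_cost=1):
    """Sum a fixed per-step cost plus the costmap value of every cell on the route."""
    return sum(step_cost + costmap[r][c] for r, c in cells)

middle_route = [(2, c) for c in range(5)]
edge_route = [(1, c) for c in range(5)]

middle_cost = route_cost(costmap, middle_route)  # 5 steps over free floor -> 5
edge_cost = route_cost(costmap, edge_route)      # 5 steps beside the wall -> 255
```

Both routes have the same length, but the edge route is far more expensive, so a search such as A* or Dijkstra would choose the middle without any special rule telling it to.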

Two ideas make the costmap genuinely alive. First, inflation: every obstacle is smeared outward by a halo of rising cost. This is the costmap's way of remembering that the robot is not a point — it has a body with width. By baking the robot's size into the map as a margin, the planners can keep treating the robot as a dot and still never clip a corner. It is the practical cousin of growing obstacles in configuration space.
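Inflation can be sketched in a few lines. The version below assumes a linear falloff from a lethal cost of 100 at the obstacle cell down toward zero at the edge of the inflation radius, measured in cells. The function name `inflate`, the constant `LETHAL`, and the linear shape are assumptions for this example; real costmaps often use other decay curves and tie the radius to the robot's footprint.

```python
import math

LETHAL = 100  # illustrative value for "definitely in collision"

def inflate(occupied, rows, cols, radius):
    """Build a costmap whose cost falls off linearly with distance from the nearest obstacle."""
    costmap = [[0] * cols for _ in range(rows)]
    reach = int(math.ceil(radius))
    for orow, ocol in occupied:
        # Only visit the square of cells that could lie within the radius.
        for r in range(max(0, orow - reach), min(rows, orow + reach + 1)):
            for c in range(max(0, ocol - reach), min(cols, ocol + reach + 1)):
                d = math.hypot(r - orow, c - ocol)
                if d > radius:
                    continue
                cost = round(LETHAL * (1 - d / (radius + 1)))
                # Keep the highest cost when halos from several obstacles overlap.
                costmap[r][c] = max(costmap[r][c], cost)
    return costmap

inflated = inflate({(2, 2)}, rows=5, cols=5, radius=2)
# Row 2 reads [33, 67, 100, 67, 33]; the corners stay at 0.
```

With the halo in place, a planner that treats the robot as a single cell still keeps its distance, because cells near the obstacle are expensive even though they are not themselves occupied.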

Second, fading: readings get old. A person who blocked a doorway a moment ago has walked on, so a sensor mark that nothing has confirmed lately should be cleared, letting the cell go cheap again. A good costmap continually adds fresh obstacles where the sensors see them and clears stale ones where the sensors now see empty space. There are usually two of these maps: a large one covering the whole known map for the global planner, and a small rolling one centered on the robot for the local planner. Both are updated from the same live sensor stream.
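The marking and clearing cycle can be sketched with a dictionary that remembers when each cell was last seen occupied. The class `FadingCostmap`, its `max_age` parameter, and the time-based expiry rule are illustrative assumptions; the point is only that a cell is cleared either when the sensors see it empty or when nothing has confirmed the obstacle for a while.

```python
class FadingCostmap:
    """Tracks obstacle cells and forgets them when they are seen free or go unconfirmed."""

    def __init__(self, max_age=2.0):
        self.max_age = max_age  # seconds an unconfirmed mark may survive
        self.last_seen = {}     # cell -> time of the most recent obstacle hit

    def update(self, now, hits, seen_free):
        # Mark: every cell where the sensor reported an obstacle gets a fresh timestamp.
        for cell in hits:
            self.last_seen[cell] = now
        # Clear: cells the sensor saw as empty are released immediately.
        for cell in seen_free:
            if cell not in hits:
                self.last_seen.pop(cell, None)
        # Fade: drop marks that nothing has confirmed within max_age.
        stale = [c for c, t in self.last_seen.items() if now - t > self.max_age]
        for cell in stale:
            del self.last_seen[cell]

    def is_occupied(self, cell):
        return cell in self.last_seen

cm = FadingCostmap(max_age=2.0)
cm.update(now=0.0, hits={(1, 1), (2, 2)}, seen_free=set())
cm.update(now=1.0, hits=set(), seen_free={(1, 1)})  # the person at (1, 1) walked on
cleared_by_sensor = not cm.is_occupied((1, 1))
still_marked = cm.is_occupied((2, 2))
cm.update(now=3.5, hits=set(), seen_free=set())     # (2, 2) has gone unconfirmed too long
faded = not cm.is_occupied((2, 2))
```

Choosing `max_age` is a trade-off: too short and real obstacles vanish between sensor sweeps, too long and the robot keeps avoiding things that have already left.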

GLOBAL LOOP  (slow, e.g. on new goal):
  global_path = A_star(costmap_global, start, goal)

LOCAL LOOP   (fast, ~10-20x per second):
  costmap_local = update(sensor_scan)      # add new, clear stale
  segment = best_feasible_arc_toward(global_path, costmap_local)
  cmd_velocity = to_trajectory(segment)    # path + speed -> motion
  send(cmd_velocity)

Pseudo-code: a far-sighted global loop and a fast reactive local loop, each searching its own costmap fed by the same live sensor data.

Where it breaks, and what comes next

This architecture is sturdy, but it fails in recognizable ways, and knowing them is half of using it well.

Local minima. A reactive local planner can steer the robot into a dead-end pocket (say, a U-shaped clutter of chairs) where every direction looks worse than standing still, so it freezes. This is the same trap potential-field methods are famous for. The usual cure is to let the global planner re-route, or to trigger a recovery behavior like backing up and rotating in place.

Costmap lag. If sensor data is slow or the map updates late, the robot acts on a picture of the world that is already wrong: braking for a ghost obstacle that has moved on or, worse, missing a real one. Tuning how quickly cells are marked and how quickly stale marks fade is a constant balancing act between twitchy and reckless.

Bad localization. The whole stack assumes the robot knows where it is on the map. If localization drifts, the costmap lines up against the wrong walls and a perfectly good plan drives the robot into a real one.
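The cure for a frozen robot described above can be sketched as a small stuck detector plus an escalation ladder. The class `StuckDetector`, the function `next_action`, the thresholds, and the order of the ladder are all assumptions made for illustration; actual stacks configure their recovery behaviors in their own ways.

```python
import math
from collections import deque

class StuckDetector:
    """Flags the robot as stuck when it has barely moved over the last few pose samples."""

    def __init__(self, window=20, min_progress=0.05):
        self.poses = deque(maxlen=window)  # most recent (x, y) samples
        self.min_progress = min_progress   # metres of net motion expected per window

    def update(self, x, y):
        self.poses.append((x, y))
        if len(self.poses) < self.poses.maxlen:
            return False  # not enough history yet to judge
        return math.dist(self.poses[0], self.poses[-1]) < self.min_progress

def next_action(stuck, recoveries_tried):
    """Escalate from re-routing to physical recovery moves, then give up."""
    if not stuck:
        return "follow_local_plan"
    ladder = ["replan_global", "back_up", "rotate_in_place"]
    if recoveries_tried < len(ladder):
        return ladder[recoveries_tried]
    return "abort"

detector = StuckDetector(window=3)
readings = [detector.update(x, 0.0) for x in (0.00, 0.01, 0.02)]
stuck = readings[-1]
action = next_action(stuck, recoveries_tried=0)
```

Here the robot has crept only 2 cm over the window, so it is judged stuck and the first response is to ask the global planner for a new route before trying anything physical.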

The frontier is to soften the hard line between these layers. Instead of hand-tuned planners and a hand-designed costmap, learning-based planners train a single model to map raw sensor input straight to motion, absorbing both the route-finding and the dodging from experience or demonstration. They promise smoother, more human-looking navigation in crowds — at the cost of being harder to verify and trust. For now, the layered global-plus-local stack remains the reliable workhorse, and learned approaches most often graft onto it rather than replace it.