Touching and Seeing Distance: Force, Touch, LiDAR, Depth, and Sonar

The Sense of Touch: Force and Contact

Earlier in this track we met the robot's inner ear — the sensors that report its own joints and motion. Those are proprioceptive: they look inward. This guide is about the opposite, the exteroceptive sensors that look outward at the world. The most intimate of these is touch. When a robot's hand presses on something, how hard is it pressing? Is it about to crush the egg, or is the egg slipping away?

The big one lives at the wrist: the force/torque sensor. Picture a small puck bolted between the robot's arm and its hand. It measures six numbers at once — push or pull along three directions (X, Y, Z) and twist around those same three axes. Together those six numbers describe every way the outside world can shove the hand. If the robot pushes a peg into a hole and the peg jams, the wrist feels the sideways force and the controller can wiggle to find the opening — the way you'd feel a key catching in a lock.
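To make the six numbers concrete, here is a minimal sketch of how a controller might notice a jammed peg. The `Wrench` class, the choice of Z as the insertion axis, and the 5 N threshold are all illustrative assumptions, not values from any particular sensor or robot.

```python
import math
from dataclasses import dataclass


@dataclass
class Wrench:
    """One six-axis reading: forces in newtons, torques in newton-meters."""
    fx: float
    fy: float
    fz: float
    tx: float
    ty: float
    tz: float


def lateral_force(w: Wrench) -> float:
    """Size of the sideways push, assuming Z is the insertion axis."""
    return math.hypot(w.fx, w.fy)


def is_jammed(w: Wrench, lateral_limit_n: float = 5.0) -> bool:
    """Flag a jam when the sideways force exceeds a (hypothetical) limit."""
    return lateral_force(w) > lateral_limit_n


# A peg pressing down with 10 N while catching sideways on the hole's rim.
reading = Wrench(fx=6.0, fy=8.0, fz=-10.0, tx=0.0, ty=0.0, tz=0.1)
print(lateral_force(reading), is_jammed(reading))
```

A real controller would also filter the signal and subtract the weight of the hand itself before comparing against a threshold.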

Closer to the skin sits the tactile sensor — a grid of tiny pressure points spread across a fingertip or palm, like a low-resolution touchscreen for the robot's skin. Where the wrist sensor reports one overall force, a tactile pad reports a little map: which part of the finger is touching, how the pressure is shaped, and whether that pressure is sliding (a sign the object is about to slip). This is the difference between knowing your hand is being pushed and knowing exactly where a coin sits on your palm.
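The "little map" idea can be sketched in a few lines. The example below assumes the pad arrives as a plain grid of pressure values (a list of rows); it finds where the contact is centered and how far that center moved between two frames. Treating a moving center as a slip cue is a deliberately crude assumption for illustration.

```python
def contact_center(pressure):
    """Return (total_pressure, (row, col)) for a grid, or None if untouched."""
    total = sum(sum(row) for row in pressure)
    if total == 0:
        return None
    row_c = sum(r * sum(row) for r, row in enumerate(pressure)) / total
    col_c = sum(c * p for row in pressure for c, p in enumerate(row)) / total
    return total, (row_c, col_c)


def center_shift(previous, current):
    """Distance (in cells) the contact center moved between two frames."""
    a, b = contact_center(previous), contact_center(current)
    if a is None or b is None:
        return None  # no contact in at least one frame
    (r0, c0), (r1, c1) = a[1], b[1]
    return ((r1 - r0) ** 2 + (c1 - c0) ** 2) ** 0.5


frame_a = [[0, 0, 0],
           [0, 2, 2],
           [0, 0, 0]]
frame_b = [[0, 0, 0],
           [0, 0, 4],
           [0, 0, 0]]
print(contact_center(frame_a))
print(center_shift(frame_a, frame_b))
```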

Seeing Distance with Light

Touch only tells you about things you are already pressing against. To plan ahead, a robot needs to measure distance from afar, and the fastest messenger we have is light. The classic light-based ranger is LiDAR (Light Detection and Ranging). It fires a brief laser pulse, waits for the reflection to bounce back, and times the round trip. Light travels about 30 centimeters every nanosecond, so the timer must be very precise: because the pulse goes out and back, a timing error of one nanosecond shifts the measured distance by about 15 centimeters. But the idea is just an echo made of light: closer surfaces echo sooner.
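The echo arithmetic is short enough to write out. This sketch assumes the sensor hands us the round-trip time in seconds; the function name is made up for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def lidar_range_m(round_trip_s: float) -> float:
    """Distance to the surface: the pulse travels out and back, so halve it."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# An echo that returns after 66.7 nanoseconds came from about 10 m away.
print(lidar_range_m(66.7e-9))

# One nanosecond of timing error is worth about 15 cm of range.
print(lidar_range_m(1e-9))
```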

One pulse measures one direction. To map a whole room, the LiDAR sweeps its beam, spinning around or flicking across the scene, and takes anywhere from thousands to over a million measurements per second depending on the model. The result is a point cloud: a swarm of 3D dots, each one a spot where the laser hit a real surface. Stand a spinning LiDAR in a hallway and the point cloud traces out the walls, the doorways, a person walking by. It is a faithful skeleton of the geometry, with no color and no labels, just shape.
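Turning a sweep into dots is a change of coordinates: each measurement is a distance at a known beam angle. The sketch below handles a flat, single-plane scan (so the dots are 2D) and assumes the scan arrives as a list of ranges with a fixed angular step, with `None` marking beams that got no echo. Those input conventions are assumptions for illustration.

```python
import math


def scan_to_points(ranges_m, angle_start_rad, angle_step_rad):
    """Convert one planar sweep of ranges into (x, y) points in the sensor frame."""
    points = []
    for i, r in enumerate(ranges_m):
        if r is None or not math.isfinite(r):
            continue  # no echo came back for this beam
        angle = angle_start_rad + i * angle_step_rad
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


# Three beams a quarter turn apart; the middle one returned nothing.
cloud = scan_to_points([1.0, None, 2.0], 0.0, math.pi / 2)
print(cloud)
```

A 3D LiDAR adds an elevation angle per beam, but the idea is the same.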

A camera gives the opposite: lots of color, but normally no distance. The RGB-D / depth camera fixes that. "RGB" is the ordinary color image; the "D" is a depth value attached to every single pixel, so each point in the picture also says how far away it is. Some depth cameras project an invisible infrared dot pattern and watch how it warps on nearby surfaces; others time their own light like a tiny LiDAR per pixel; others use two lenses and triangulate, the same trick as stereo vision and the same trick your two eyes use. The payoff is a colored point cloud: shape and appearance together, perfect for spotting and grasping objects on a table.
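Two bits of geometry sit behind that paragraph, sketched here under the standard pinhole camera model. The `Intrinsics` values (focal lengths and image center, in pixels) are made-up numbers; a real camera reports its own calibration.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical example values)."""
    fx: float
    fy: float
    cx: float
    cy: float


def pixel_to_point(u, v, depth_m, k):
    """Turn a pixel (u, v) and its depth into a 3D point in the camera frame."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def stereo_depth_m(focal_px, baseline_m, disparity_px):
    """Triangulated depth: nearby things shift more between the two lenses."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(pixel_to_point(320, 240, 1.5, k))   # image center: straight ahead
print(stereo_depth_m(600.0, 0.1, 30.0))   # lenses 10 cm apart
```

Running `pixel_to_point` over every pixel, and keeping each pixel's color alongside, gives the colored point cloud.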

Sound for Distance, Satellites for Position

Light is not the only echo a robot can use. The humble ultrasonic (sonar) sensor chirps a sound too high for us to hear, then listens for the echo and times it — exactly how a bat finds a moth, or how a car's parking sensors beep faster as the wall gets closer. Sound is roughly a million times slower than light, which is actually a gift here: the round trip takes milliseconds instead of nanoseconds, so the electronics can be cheap and simple. A sonar costs a few coins, sips power, and reliably answers "is something within a couple of meters?" — it just can't tell you a precise shape, only a rough distance to whatever is in front.
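A quick comparison shows why slow sound is easier to time. The sketch assumes sound travels at about 343 m/s (dry air near 20 °C); the real speed drifts with temperature, so treat the constant as approximate.

```python
SPEED_OF_SOUND_M_PER_S = 343.0          # assumed: dry air near 20 °C
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def round_trip_s(distance_m: float, speed_m_per_s: float) -> float:
    """Time for an echo to reach a surface and come back."""
    return 2.0 * distance_m / speed_m_per_s


def sonar_range_m(echo_s: float) -> float:
    """Distance from a timed ultrasonic echo."""
    return SPEED_OF_SOUND_M_PER_S * echo_s / 2.0


# The same wall, 2 m away, timed with sound and with light.
print(round_trip_s(2.0, SPEED_OF_SOUND_M_PER_S))   # about 12 milliseconds
print(round_trip_s(2.0, SPEED_OF_LIGHT_M_PER_S))   # about 13 nanoseconds
print(sonar_range_m(0.010))                        # a 10 ms echo
```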

Touch, light, and sound all tell a robot about its immediate surroundings: what it is pressing on, and how far away things are. But an outdoor robot has a different question: "where on Earth am I?" That is the job of GNSS / GPS. Each satellite in a constellation overhead broadcasts the time and its own position. The receiver hears several at once (at least four for a full 3D fix) and, from the tiny differences in how long each signal took to arrive, solves for its own spot on the globe, much like working out where you stand from how long several distant church bells take to reach you.
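The church-bell idea can be sketched on a flat map. The example below is a simplification, not how a GNSS receiver actually works: it assumes the distances to three fixed beacons are already known and solves for a 2D position. A real receiver works in 3D and must also solve for the error in its own clock, which is why it needs at least four satellites.

```python
import math


def trilaterate_2d(beacons, ranges):
    """Find (x, y) from three beacon positions and the distance to each.

    Subtracting the first circle equation from the other two cancels the
    squared unknowns and leaves two linear equations in x and y.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons lie on one line; position is ambiguous")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
truth = (3.0, 4.0)
ranges = [math.dist(truth, b) for b in beacons]
print(trilaterate_2d(beacons, ranges))
```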

Plain GPS pins you to within a few meters — fine for a delivery rover deciding which street it's on, useless for parking between two lines. With extra corrections (a technique called RTK), GNSS can sharpen to a few centimeters, which is why self-driving cars and field-mapping farm robots lean on it. The catch is that the satellite signals are faint and need a clear view of the sky, so GNSS goes blind indoors, in tunnels, and among tall buildings — exactly where you'd want LiDAR and cameras to take over.

Choosing the Right Sense for the Job

No outward-looking sensor is good at everything; each has a blind spot built into its physics. Knowing those blind spots is half of designing a robot. Glass is a famous trap: a LiDAR's laser passes straight through a window or bounces off a mirror, so the robot "sees" the room beyond instead of the pane in front of it. Fog, rain, and dust scatter laser light too, smearing the point cloud. And bright sunlight can drown out a depth camera's faint infrared pattern, which is why cameras that dazzle indoors often go half-blind on a patio at noon.

Indoors, need shape and color up close (e.g. picking objects off a table): reach for an RGB-D depth camera.
Outdoors or large spaces, need long range and precise geometry (e.g. a car or a survey robot): reach for LiDAR.
Just need a cheap "is anything close?" bumper sense (e.g. a vacuum robot avoiding walls): a few ultrasonic sonars do the job.
Need to grasp gently or fit parts together by feel (e.g. inserting a plug): use a wrist force/torque sensor, and tactile pads for fine contact.
Need to know where on Earth you are, with open sky (e.g. a delivery rover or farm tractor): use GNSS/GPS, with RTK if you need centimeters.
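The rules of thumb above can be written down as a small lookup. The key names here are invented labels for each situation, and the table only restates the guidance in this section; it is not a substitute for checking a real sensor's datasheet.

```python
# Hypothetical need labels mapped to the sensor suggested in this guide.
SENSOR_FOR_NEED = {
    "shape_and_color_up_close": "RGB-D depth camera",
    "long_range_geometry": "LiDAR",
    "cheap_proximity": "ultrasonic sonar",
    "gentle_grasp_or_assembly": "wrist force/torque sensor plus tactile pads",
    "global_position_open_sky": "GNSS/GPS (with RTK for centimeters)",
}


def recommend(needs):
    """Return one suggested sensor per need, in the order the needs were given."""
    return [SENSOR_FOR_NEED[need] for need in needs]


# A delivery rover that also has to avoid bumping into things.
print(recommend(["global_position_open_sky", "cheap_proximity"]))
```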

The deeper lesson is that you rarely choose just one. Because every sensor fails in its own way, robots combine them so that one sensor covers another's blind spot — sonar catching the glass door the LiDAR missed, the camera giving color the LiDAR lacks, the wheel encoders carrying on when GPS drops out in a tunnel. Blending these streams into one trustworthy picture is its own craft, called sensor fusion, and it is where the next guides in this track are headed.
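As a small taste of that craft, here is one common textbook way to blend several distance readings: trust each sensor in proportion to how precise it is (inverse-variance weighting) and skip any sensor that returned nothing. The noise figures in the example are invented, and this is only a sketch of the idea, not a full fusion system.

```python
def fuse_ranges(readings):
    """Blend (range_m, std_dev_m) pairs; a range of None means a dropout.

    Returns (fused_range_m, fused_std_dev_m), or None if every sensor
    dropped out.
    """
    usable = [(r, s) for r, s in readings if r is not None]
    if not usable:
        return None
    weights = [1.0 / (s * s) for _, s in usable]
    total = sum(weights)
    fused = sum(w * r for w, (r, _) in zip(weights, usable)) / total
    return fused, (1.0 / total) ** 0.5


# LiDAR missed the glass door (no return); the sonar still saw it.
print(fuse_ranges([(None, 0.02), (2.0, 0.10)]))

# Both sensors agree roughly; the more precise one dominates.
print(fuse_ranges([(2.00, 0.02), (2.10, 0.10)]))
```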