How often to look: sampling and the Nyquist rule
A robot sensor does not give you a smooth curve; it gives you a snapshot every so often. Turning a continuous physical quantity into a stream of evenly spaced numbers is called sampling, and the sampling rate is simply how many snapshots you grab each second, measured in hertz (Hz). A wheel encoder might be read 1000 times a second; a GNSS receiver might report only ten times a second. Choosing that rate is one of the first real decisions in any sensing system.
If you sample too slowly, you miss the fast wiggles, and they come back disguised as slow ones: a fake signal that was never really there. This trick of the eye is called aliasing. The rule that prevents it is the Nyquist criterion: to capture a signal honestly, you must sample more than twice as fast as the fastest thing happening in it. Want to catch a vibration that swings 50 times a second? Sample faster than 100 times a second, and in practice well faster than that.
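To make the disguise concrete, here is a minimal sketch of where a pure tone lands when it is sampled too slowly. The function names alias_frequency and sample_sine are made up for this illustration, and the 60 Hz sampling rate is just an example of a rate that is too low for a 50 Hz vibration.

```python
import math

def alias_frequency(f_signal, f_sample):
    """Apparent frequency (Hz) of a pure tone at f_signal when sampled at f_sample."""
    # Sampling folds every frequency back into the band from 0 to f_sample / 2.
    f = f_signal % f_sample
    return min(f, f_sample - f)

def sample_sine(f_signal, f_sample, n):
    """Return n evenly spaced samples of sin(2*pi*f_signal*t) taken at f_sample Hz."""
    return [math.sin(2 * math.pi * f_signal * k / f_sample) for k in range(n)]

# A 50 Hz vibration sampled at only 60 Hz shows up as a 10 Hz wobble.
print(alias_frequency(50, 60))   # 10
# Sampled at 250 Hz, well above the 100 Hz floor, it keeps its true frequency.
print(alias_frequency(50, 250))  # 50

# The samples themselves cannot tell the two apart: the 50 Hz tone sampled at
# 60 Hz matches a 10 Hz tone sample for sample, apart from a flipped sign.
fast = sample_sine(50, 60, 30)
slow = sample_sine(10, 60, 30)
print(all(abs(f + s) < 1e-9 for f, s in zip(fast, slow)))  # True
```

Once the samples are taken, no amount of later processing can tell the real 50 Hz vibration from a 10 Hz one, which is why the rate has to be right before the data is recorded.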
So the rate is not a free choice. Sample too slowly and you get aliasing and a robot that reacts late; sample too fast and you waste compute, fill memory, and let in high-frequency electrical noise that carries no useful information. The honest answer is to know the fastest motion you care about, then comfortably clear the Nyquist floor above it.
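As a rough sketch of that decision, the helper below turns "the fastest motion you care about" into a suggested rate. The name choose_sample_rate and the default margin of 5 are assumptions made for illustration, not values from this chapter; the only firm requirement is that the multiplier stays above 2.

```python
def choose_sample_rate(f_max_hz, margin=5.0):
    """Suggest a sampling rate (Hz) for a signal whose fastest component is f_max_hz.

    margin is a hypothetical rule-of-thumb multiplier. Anything at or below 2
    fails to clear the Nyquist floor, so it is rejected.
    """
    if f_max_hz <= 0:
        raise ValueError("f_max_hz must be positive")
    if margin <= 2.0:
        raise ValueError("margin must exceed 2 to clear the Nyquist floor")
    return margin * f_max_hz

# A 50 Hz vibration with the assumed 5x margin:
print(choose_sample_rate(50))  # 250.0
```

A larger margin buys safety against aliasing at the price of more data to store and process, so the multiplier is a budget decision as much as a signal one.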
Smoothing out the jitter: the low-pass filter
Even sampled correctly, every reading carries sensor noise — small random wobbles riding on top of the true value. Plot a still accelerometer and the line is fuzzy, not flat. The classic cure is low-pass filtering: a recipe that lets the slow, real motion pass through while it blocks the fast, random fuzz. The name says it plainly — low frequencies pass, high frequencies are turned down.
The simplest version is a close cousin of the running average: instead of trusting the latest number alone, you blend it gently with the recent past. A common one-line form keeps a single smoothed value and nudges it a little toward each new reading. The smaller that nudge, the calmer the output and the more stubbornly it ignores noise, but also the longer it takes to catch up when the true value really changes.
# Exponential low-pass filter, one line of state
# alpha in (0,1): small alpha = smoother but laggier
smoothed = alpha * new_reading + (1 - alpha) * smoothed
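The one-line update above can be wrapped into a small runnable sketch. The helper names, the simulated still accelerometer (9.81 with Gaussian noise of 0.2), and the 2 Hz cutoff at a 100 Hz sampling rate are all assumptions chosen for the demo. The formula in alpha_from_cutoff is the standard first-order RC relation, one common way to pick alpha from a cutoff frequency.

```python
import math
import random
import statistics

def alpha_from_cutoff(cutoff_hz, sample_rate_hz):
    """Pick alpha for a first-order low-pass from a cutoff frequency (RC relation)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    return dt / (rc + dt)

def low_pass(readings, alpha):
    """Apply smoothed = alpha * new_reading + (1 - alpha) * smoothed to a list."""
    smoothed = readings[0]  # start from the first reading to avoid a startup jump
    out = []
    for new_reading in readings:
        smoothed = alpha * new_reading + (1 - alpha) * smoothed
        out.append(smoothed)
    return out

# Simulated still accelerometer: a constant true value plus random fuzz.
random.seed(0)
true_value = 9.81
noisy = [true_value + random.gauss(0, 0.2) for _ in range(2000)]

alpha = alpha_from_cutoff(cutoff_hz=2.0, sample_rate_hz=100.0)
filtered = low_pass(noisy, alpha)

print(f"alpha = {alpha:.3f}")
print(f"noise before: {statistics.pstdev(noisy):.3f}")
print(f"noise after:  {statistics.pstdev(filtered[200:]):.3f}")
```

Lowering the cutoff shrinks alpha and flattens the line further, but the same filter would then take longer to follow a genuine change in the reading.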
Two weak answers, one strong one: sensor fusion
Filtering cleans one stream. But no single sensor is good at everything, and the deeper trick is to combine several. That is sensor fusion: merging readings from complementary sensors so each one covers the other's blind spot, producing one estimate better than any sensor alone.
The textbook pair lives inside an IMU. A gyroscope measures turning rate quickly and smoothly, but adding up its readings to track angle lets tiny errors pile up, so the estimate slowly wanders. This is drift. An accelerometer, meanwhile, can feel which way is down from gravity, giving a tilt that never drifts but is jumpy and useless during fast motion. Fuse them, with the gyroscope for the short term and the accelerometer to anchor the long term, and you get a tilt estimate that is both responsive and stable.
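One simple way to realize this gyroscope-plus-accelerometer blend is a complementary filter, sketched below for a single tilt axis. This is an illustrative choice, not the only fusion method. The function name, the blend factor k = 0.98, the axis and sign conventions, and the simulated gyro bias are all assumptions for the demo.

```python
import math

def complementary_filter(gyro_rates, accels, dt, k=0.98, angle0=0.0):
    """Fuse gyro rate (rad/s) and accelerometer (ax, az in m/s^2) into one tilt angle.

    k close to 1 trusts the integrated gyro in the short term; the remaining
    (1 - k) slowly pulls the estimate toward the accelerometer's gravity angle.
    """
    angle = angle0
    out = []
    for rate, (ax, az) in zip(gyro_rates, accels):
        accel_angle = math.atan2(ax, az)      # tilt implied by gravity (assumed axes)
        gyro_angle = angle + rate * dt        # short-term: integrate turning rate
        angle = k * gyro_angle + (1 - k) * accel_angle
        out.append(angle)
    return out

# Demo: the robot sits still, tilted 0.1 rad, but the gyro has a small bias.
g, dt, n = 9.81, 0.01, 10000                  # 100 s of data at 100 Hz
true_tilt, gyro_bias = 0.1, 0.01              # rad, rad/s (made-up values)
gyro = [gyro_bias] * n                        # should read 0, reads the bias
accel = [(g * math.sin(true_tilt), g * math.cos(true_tilt))] * n

fused = complementary_filter(gyro, accel, dt)
gyro_only = true_tilt + sum(rate * dt for rate in gyro)

print(f"gyro only after 100 s: {gyro_only:.3f} rad")   # has drifted far from 0.1
print(f"fused after 100 s:     {fused[-1]:.3f} rad")   # stays close to 0.1
```

The fused estimate still carries a small offset from the gyro bias, but it stays bounded instead of growing without limit the way pure integration does.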
The same pattern repeats across robotics. Wheel encoders give smooth, high-rate motion but drift over distance; GNSS gives an absolute position that never drifts but arrives slowly and drops out under bridges or indoors. Fuse them and a delivery robot keeps a confident position even through a tunnel, because the encoders carry it while GNSS is silent and GNSS corrects it the moment it returns.
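The encoder-and-GNSS handoff can be sketched with a one-dimensional Kalman filter, a common choice for this kind of fusion. Everything specific here is an assumption for illustration: the function name fuse_position, the noise settings q and r, the 2 percent encoder scale error, and a tunnel that hides GNSS for 20 steps. A missing fix is represented as None.

```python
import random

def fuse_position(encoder_steps, gnss_fixes, q=0.01, r=4.0, x0=0.0, p0=1.0):
    """Track 1D position from encoder displacements and occasional GNSS fixes.

    encoder_steps: distance moved per tick (m), smooth but slightly wrong.
    gnss_fixes: absolute position (m) per tick, or None when there is no fix.
    q: uncertainty added per encoder step (m^2); r: GNSS noise variance (m^2).
    Returns a list of (position_estimate, variance) pairs.
    """
    x, p = x0, p0
    history = []
    for step, fix in zip(encoder_steps, gnss_fixes):
        # Predict: the encoders carry the estimate, and uncertainty grows.
        x += step
        p += q
        # Correct: when GNSS speaks, pull toward it in proportion to trust.
        if fix is not None:
            gain = p / (p + r)
            x += gain * (fix - x)
            p *= 1.0 - gain
        history.append((x, p))
    return history

# Demo: the robot moves 1 m per tick; encoders over-read by 2 percent.
random.seed(0)
truth = [float(i) for i in range(1, 61)]
steps = [1.02] * 60
fixes = [None if 20 <= i < 40 else t + random.gauss(0, 2.0)
         for i, t in enumerate(truth)]        # ticks 20..39 are the tunnel

history = fuse_position(steps, fixes)
print(f"variance entering tunnel: {history[19][1]:.2f}")
print(f"variance leaving tunnel:  {history[39][1]:.2f}")
print(f"variance after first fix: {history[40][1]:.2f}")
```

The variance climbs while GNSS is silent and drops as soon as a fix returns, which is the numerical version of the encoders carrying the robot and GNSS correcting it.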
The frontier: learned and visual-inertial fusion
The classic Kalman filter assumes you can write down clean equations for how each sensor errs, typically linear models with Gaussian noise. Real cameras, LiDAR, and depth cameras are messier than that, so modern systems lean two ways. One blends camera images with IMU data into visual-inertial odometry: the inertial sensors smooth over fast jerks and brief visual blackouts, while the camera holds the IMU's long-term drift in check. This is how many drones and headsets know where they are without any GNSS.
The other direction is learned fusion: instead of hand-writing the error model, a neural network is trained on heaps of real data to combine raw sensor streams directly. It can absorb the weird, hard-to-write quirks of a real LiDAR in rain or a camera in glare — at the cost of needing data and being harder to explain when it fails.
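The idea of learning the combination from data, rather than writing the error model by hand, can be shown with a deliberately tiny stand-in. This is not a neural network: it is a single learned weight fitted by least squares, and the two simulated sensors, their noise levels, and the function names are all assumptions made for illustration.

```python
import random

def learn_fusion_weight(sensor_a, sensor_b, truth):
    """Fit w so that w * a + (1 - w) * b best matches truth in the least-squares sense."""
    num = sum((t - b) * (a - b) for a, b, t in zip(sensor_a, sensor_b, truth))
    den = sum((a - b) ** 2 for a, b in zip(sensor_a, sensor_b))
    return num / den

def fuse(a, b, w):
    """Blend two readings with the learned weight."""
    return w * a + (1 - w) * b

def mean_squared_error(estimates, truth):
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)

# Training data: ground truth plus two sensors of different quality.
random.seed(1)
truth = [random.uniform(0, 10) for _ in range(5000)]
sensor_a = [t + random.gauss(0, 0.5) for t in truth]   # the quieter sensor
sensor_b = [t + random.gauss(0, 1.5) for t in truth]   # the noisier sensor

w = learn_fusion_weight(sensor_a, sensor_b, truth)
fused = [fuse(a, b, w) for a, b in zip(sensor_a, sensor_b)]

print(f"learned weight on sensor A: {w:.2f}")
print(f"error of A alone: {mean_squared_error(sensor_a, truth):.3f}")
print(f"error of fused:   {mean_squared_error(fused, truth):.3f}")
```

Nobody told the fit which sensor was noisier; it worked that out from the data. A real learned-fusion system replaces the single weight with a network of many parameters, and with that come the data appetite and the opacity described above.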
Step back and the whole chapter forms one pipeline. Each sensor — whether proprioceptive, feeling the robot's own body, or exteroceptive, watching the outside world — hands over a raw, noisy stream. Sampling decides how often you look; filtering scrubs the jitter; fusion weaves the streams into one trustworthy estimate of where the robot is and what surrounds it. That clean estimate is precisely what perception, planning, and control build upon next.