From Pick-and-Place to In-Hand Dexterity

The Workhorse: The Pick-and-Place Cycle

Walk into almost any modern factory or warehouse and you will see the same motion repeated millions of times a day: a robot reaches for an object, closes its hand, lifts, carries, and releases. This is pick-and-place, the bread-and-butter task of industrial robotics. A part arrives at a known spot, the robot picks it up, and it sets the part down somewhere else — onto a conveyor, into a box, onto a circuit board. It sounds trivial, but almost everything a manipulation robot knows comes together here.

Under the hood, a single pick-and-place cycle is a short pipeline. First the robot locates the object, sometimes with a camera, sometimes by relying on a fixture that holds the part in a known pose. Then it plans a grasp: where on the object to make contact, and how to close the fingers so the part will not slip. Next it executes a pre-grasp and approach, moving the hand to a staging point just above the object and then descending straight down so it does not knock the part sideways. Finally it closes the gripper, lifts, moves along a planned path, and places the part.
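The approach-from-above sequence can be sketched as an ordered list of waypoints. This is a minimal Python illustration, not any particular robot's API: the `Waypoint` type, the function name, and the 10 cm clearance are assumptions chosen for the example, and a real cell would also carry hand orientation and pass the waypoints to a motion planner.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, robot base frame


@dataclass(frozen=True)
class Waypoint:
    name: str
    xyz: Point
    gripper_closed: bool


def pick_and_place_waypoints(pick: Point, place: Point,
                             clearance: float = 0.10) -> List[Waypoint]:
    """Ordered waypoints for one top-down pick-and-place cycle.

    `clearance` is the (hypothetical) height of the staging point above
    the grasp and place positions.
    """
    px, py, pz = pick
    qx, qy, qz = place
    return [
        # Stage above the part, then descend vertically onto it.
        Waypoint("pre_grasp", (px, py, pz + clearance), False),
        Waypoint("approach", (px, py, pz), False),
        Waypoint("grasp", (px, py, pz), True),       # close the gripper
        Waypoint("lift", (px, py, pz + clearance), True),
        # Carry to a point above the target, then lower and release.
        Waypoint("pre_place", (qx, qy, qz + clearance), True),
        Waypoint("place", (qx, qy, qz), True),
        Waypoint("release", (qx, qy, qz), False),    # open the gripper
        Waypoint("retreat", (qx, qy, qz + clearance), False),
    ]
```

Because the pre-grasp and approach waypoints share the same x and y, the final descent is purely vertical, which is what keeps the hand from knocking the part sideways.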

Reaching Into the Mess: Bin Picking

Now imagine the parts are not laid out neatly but dumped into a tote — dozens of identical bolts, or a jumble of mixed products, overlapping and tangled at every angle. Picking one out at a time is called bin picking, and it is one of the classic hard problems of industrial robotics. The robot can no longer assume where anything is. It has to look into the clutter, decide which object to grab, find a graspable spot on it, and pull it free without dragging its neighbors along.

Perception does the heavy lifting. A depth camera, typically an RGB-D camera that records both color and distance, produces a point cloud: a dense set of 3D points tracing the visible surface of the pile. From that, modern systems propose candidate grasps directly. A learned model scores thousands of possible hand placements, and each candidate is checked against a grasp quality metric: will the fingers actually hold the object, and is the hand clear of collisions with the bin walls and the other parts? The robot then executes the candidate most likely to succeed.
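The score-then-filter step can be illustrated with a short Python sketch. Everything named here is hypothetical: `GraspCandidate`, `select_grasp`, the 0.5 score threshold, and the toy wall test stand in for a learned scoring model and a real collision checker, which this example does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class GraspCandidate:
    position: Tuple[float, float, float]  # hand position in the bin frame, metres
    score: float                          # predicted success probability, 0..1


def select_grasp(
    candidates: List[GraspCandidate],
    in_collision: Callable[[GraspCandidate], bool],
    min_score: float = 0.5,
) -> Optional[GraspCandidate]:
    """Return the best-scoring collision-free candidate, or None if none qualifies."""
    feasible = [c for c in candidates
                if c.score >= min_score and not in_collision(c)]
    return max(feasible, key=lambda c: c.score, default=None)


def too_close_to_wall(c: GraspCandidate, bin_half_width: float = 0.20,
                      margin: float = 0.03) -> bool:
    """Toy collision test: reject grasps within `margin` of a square bin's walls."""
    x, y, _ = c.position
    return max(abs(x), abs(y)) > bin_half_width - margin


candidates = [
    GraspCandidate((0.19, 0.00, 0.05), 0.95),  # high score, but hugging the wall
    GraspCandidate((0.02, 0.05, 0.08), 0.80),  # clear of the walls
    GraspCandidate((0.00, 0.00, 0.03), 0.30),  # below the score threshold
]
best = select_grasp(candidates, too_close_to_wall)
print(best)
```

Here the top-scoring candidate is rejected because it sits too close to the bin wall, so the second candidate is the one selected.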

The choice of hand matters enormously here. A suction gripper — a vacuum cup pressed against a flat face — is wonderfully forgiving in clutter: it does not need to wrap around the object, just touch one smooth surface. That is why so many warehouse bin-picking cells use suction for boxes and bagged goods. A two-finger jaw gripper grips more securely and handles odd shapes, but it needs room to open its fingers on either side of the part — room that a tight pile may not offer.
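The trade-off between the two grippers can be written down as a simple rule of thumb. The Python sketch below is only an illustration under assumed inputs: the function name, the idea of summarizing a part by its largest flat face and its side clearance, and both threshold values are hypothetical, not figures from any real cell.

```python
from typing import Optional


def choose_gripper(flat_face_cm2: float, side_clearance_mm: float,
                   min_cup_area_cm2: float = 4.0,
                   finger_width_mm: float = 10.0) -> Optional[str]:
    """Pick a gripper type for one object in a cluttered bin.

    flat_face_cm2     -- area of the largest exposed smooth face
    side_clearance_mm -- free space beside the part, on its tighter side
    Thresholds are illustrative assumptions.
    """
    # Suction only needs one smooth face large enough to seal the cup.
    if flat_face_cm2 >= min_cup_area_cm2:
        return "suction"
    # A jaw gripper needs room for a finger on each side of the part.
    if side_clearance_mm >= finger_width_mm:
        return "jaw"
    # Neither fits: skip this object and try another one in the pile.
    return None
```

For example, `choose_gripper(25.0, 2.0)` returns `"suction"` for a boxed item wedged tightly among its neighbors, while `choose_gripper(1.0, 15.0)` returns `"jaw"` for a small odd-shaped part with free space around it.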

Beyond Grab-and-Drop: Dexterous and In-Hand Manipulation

Pick-and-place and bin picking share a hidden assumption: once the object is grasped, it stays frozen in the hand until release. But humans do something far richer. Pick up a pen, and without thinking you twirl it in your fingers, walk it from grip to grip, flip it end over end. That is dexterous manipulation — using a multi-fingered hand to control an object's motion through fine, coordinated finger contacts rather than just clamping it.

The sharpest version is in-hand manipulation: reorienting an object within the hand without ever setting it down. Why bother? Because the pose in which you grasp an object is rarely the pose you need to use it in. You might pick up a screwdriver by its shaft, then have to slide your grip to the handle before driving a screw. Setting it down and re-grasping wastes time and can fail; doing it in the hand is fast and fluid — if the robot can manage it.

Dexterity also blurs the line between holding and pushing. Sometimes the smartest move is not to grasp at all but to nudge: sliding a flat object to the table's edge so a finger can get under it, or tipping it upright against a wall. That is non-prehensile manipulation — controlling an object through pushes, pivots, and rolls instead of an enclosing grip. Real dexterous behavior weaves grasping and pushing together fluidly.

Why In-Hand Reorientation Is So Hard

In-hand manipulation is brutally difficult for a few intertwined reasons. To move an object you must break and remake contacts — let one finger go while others hold — and the instant you do, the object can shift, tilt, or drop. Each contact involves friction, deformation, and rolling that are hard to model precisely. And a multi-finger hand has many joints to coordinate at once. It is the realm of contact-rich manipulation, where success depends on the messy physics happening at the fingertips, not on a clean geometric plan.

Stable holding leans on the same ideas as a good grasp. The fingers should achieve force closure, meaning that the contact forces, with the help of friction, can resist a push or twist on the object from any direction. As contacts shift, the hand must keep re-establishing that condition so the object never escapes. Throughout, the robot wants enough grip to hold on but not so much that it crushes a delicate part. That is where force control and the related impedance control earn their keep: they regulate how hard the fingers press rather than just commanding a position.
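For the simplest case, force closure has a compact geometric test. In the plane, two point contacts with friction give force closure when the line joining them lies strictly inside both friction cones, where each cone has half-angle atan(mu) around the inward contact normal. The Python sketch below checks that condition; it is a planar simplification with hypothetical function names, and it does not cover multi-finger hands or 3D contacts.

```python
import math


def _unit(v):
    """Normalize a 2D vector."""
    norm = math.hypot(v[0], v[1])
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / norm, v[1] / norm)


def _angle_between(a, b):
    """Angle in radians between two unit vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    return math.acos(max(-1.0, min(1.0, dot)))


def antipodal_force_closure(p1, n1, p2, n2, mu):
    """Planar two-contact force-closure test.

    p1, p2 -- contact points (x, y)
    n1, n2 -- contact normals pointing INTO the object
    mu     -- friction coefficient at both contacts
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    if dx == 0.0 and dy == 0.0:
        return False  # coincident contacts cannot hold anything
    u = _unit((dx, dy))                 # direction from contact 1 to contact 2
    half_angle = math.atan(mu)          # friction cone half-angle
    a1 = _angle_between(_unit(n1), u)               # cone 1 must contain +u
    a2 = _angle_between(_unit(n2), (-u[0], -u[1]))  # cone 2 must contain -u
    return a1 < half_angle and a2 < half_angle


# Fingers directly opposite each other on a block: closure holds.
print(antipodal_force_closure((-1, 0), (1, 0), (1, 0), (-1, 0), mu=0.5))
# Second finger slid 2 units up the face: the joining line leaves the cones.
print(antipodal_force_closure((-1, 0), (1, 0), (1, 2), (-1, 0), mu=0.5))
```

The second case shows why shifting contacts matter: the same two fingers on the same object lose force closure once one of them slides far enough, unless friction is high enough to widen the cones.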

Because the contact physics defies exact modeling, the breakthrough results in recent years came from learning. Researchers train a control policy — a mapping from what the hand senses to how it should move its joints — inside a fast physics simulator, running millions of practice trials that a robot could never afford in the real world. To survive the jump to hardware they lean on domain randomization, deliberately varying friction, object weight, and sensor noise in simulation so the policy does not overfit to one perfect virtual world. That, plus tactile sensors that let the fingertips feel slip, is how a robot hand learned to rotate a cube or reposition a tool entirely in-hand.
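Domain randomization is simple to express in code: before each simulated episode, draw a fresh set of physical parameters. The Python sketch below shows the idea using only the standard library. The `SimParams` type, the function names, and every numeric range are assumptions made for illustration; they are not the settings of any published system, and a real setup would pass these values into its physics simulator.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    friction: float          # fingertip-object friction coefficient
    object_mass_kg: float    # mass of the manipulated object
    sensor_noise_std: float  # std. dev. of joint-angle sensor noise, radians


# Hypothetical ranges, chosen only to make the example concrete.
FRICTION_RANGE = (0.5, 1.2)
MASS_RANGE_KG = (0.03, 0.30)
NOISE_STD_RANGE = (0.0, 0.02)


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one randomized world for a single training episode."""
    return SimParams(
        friction=rng.uniform(*FRICTION_RANGE),
        object_mass_kg=rng.uniform(*MASS_RANGE_KG),
        sensor_noise_std=rng.uniform(*NOISE_STD_RANGE),
    )


def noisy_reading(true_value: float, params: SimParams,
                  rng: random.Random) -> float:
    """Corrupt a sensor value with this episode's Gaussian noise level."""
    return true_value + rng.gauss(0.0, params.sensor_noise_std)


rng = random.Random(0)  # seeded so runs are repeatable
episodes = [sample_sim_params(rng) for _ in range(5)]
for params in episodes:
    print(params)
```

Each episode sees a slightly different world, so a policy that succeeds across all of them cannot rely on one exact friction value or one exact object weight.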