Policy Decorator (Ours) vs. Base Policy

Our refined policy, learned through Policy Decorator, achieves high success rates while retaining the base policy's strengths. We compare it with the base Diffusion Policy to show how it handles the hardest parts of each task, where the base policy struggles.

Key Observations

In challenging tasks that require precise control, the offline-trained base policy often fails in a few critical regions of the trajectory. Policy Decorator refines the base policy's behavior in these regions while staying close to it everywhere else. We highlight these improvements by slowing down the video, marking the critical regions, and showing how Policy Decorator enhances precision and control.
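As a rough illustration of what "refining while staying aligned" can look like, the sketch below adds a small, bounded correction to the base policy's action. This is an assumption made for illustration, not a description taken from the text above: the function name `refine_action`, the scale `alpha`, and the use of `tanh` to bound the correction are all hypothetical choices.

```python
import numpy as np

def refine_action(base_action, residual, alpha=0.1):
    """Add a bounded correction to the base policy's action.

    Hypothetical sketch: each component of the correction is squashed
    into [-alpha, alpha], so the refined action can never drift far
    from what the base policy proposed.
    """
    base_action = np.asarray(base_action, dtype=float)
    correction = alpha * np.tanh(np.asarray(residual, dtype=float))
    return base_action + correction

# Illustrative values only (not taken from any experiment).
base = np.array([0.50, -0.20, 0.10])   # action proposed by the base policy
raw = np.array([5.0, 0.0, -5.0])       # unbounded output of a learned correction
refined = refine_action(base, raw, alpha=0.05)
```

With a small `alpha`, the refined action differs from the base action only slightly, which matches the qualitative behavior described above: small adjustments where precision matters, and essentially the base policy's trajectory elsewhere.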

Peg Insertion Task

The Peg Insertion task requires precise manipulation, with only 3 mm of clearance. The base policy can bring the peg near the hole but fails to align it for insertion. The policy refined by Policy Decorator inserts the peg accurately while maintaining smooth motion.

Turn Faucet Task

In the Turn Faucet task, the base policy comes close to completing the task but misses the faucet handle. Policy Decorator refines the handle-turning phase to ensure successful contact while leaving the rest of the trajectory unchanged.