NMPC-based Motion Planning with Adaptive Weighting for Dynamic Object Interception

Chen Cai    Saksham Kohli    Steven Liu Department of Electrical and Computer Engineering, University of Kaiserslautern-Landau, 67663 Kaiserslautern, Germany
e-mails: {chen.cai, kohli, steven.liu}@rptu.de
Abstract

This paper presents a nonlinear Model Predictive Control (MPC) planner for dynamic object interception using cooperative manipulator systems under closed-chain constraints. We introduce an Adaptive-Terminal (AT) formulation that employs cost shaping to mitigate actuator power violations common in Primitive-Terminal (PT) approaches. Experimental validation on a physical platform demonstrates superior motion quality and robustness compared to the PT baseline. Crucially, the system exhibits excellent real-time performance, achieving an average computation time of 19ms19\,\text{ms}–less than half the 40ms40\,\text{ms} sampling interval. This establishes the framework’s suitability for agile, safety-critical cooperative tasks.

keywords:
Motion Planning, Multi-Robot Systems, Model Predictive Control (MPC), Robotic Catching
thanks: This work has been submitted to the IFAC World Congress for possible publication. Under review.

1 Introduction

Catching fast-moving objects mid-air is a canonical benchmark for robotic agility and coordination. This demanding task necessitates closing the perception-prediction-planning-control loop in real-time under tight timing constraints. Beyond its fundamental research value, dynamic interception finds new relevance in collaborative manufacturing, embodying Industry 5.0 principles. Such non-contact transfer methods exemplify technology that supports rather than constrains human workers, contrasting with traditional handover interfaces requiring precise spatial alignment or physical contact.

This work employs two coordinated manipulators, abstracting the human upper body to explore dynamic behaviors like catching–signatures of embodied intelligence. Functionally, this cooperative setup enables richer interception geometries, secure cradle-like grasps, enhancing robustness and safety. However, coordination introduces significant challenges: closed-chain geometry, inter-arm clearance, and kinematic redundancy, making control substantially more complex than single-arm cases. Addressing dynamic catching on such platforms is thus not merely harder, but a necessary step toward foundational, human-like intelligent behaviors for next-generation humanoid and collaborative robots.

To achieve dynamic ball catching with coordinated manipulators, we propose an integrated perception-prediction-planning-control pipeline, experimentally validated. At its core lies an NMPC-based motion planner generating coordinated trajectories for the closed-chain system. Our approach explicitly enforces kinematic consistency and joint limits. Operating within a predefined safety envelope circumvents complex analytical workspace characterization. Dynamic interception demands balancing rapid convergence with motion smoothness for stability and constraint adherence. Addressing this trade-off, we introduce an Adaptive-Terminal (AT) formulation. AT dynamically modulates pose tracking cost weights based on state error. This implicitly regularizes control effort for smoother joint velocities, critically mitigating power limit violations-a frequent challenge with aggressive, fixed-weight approaches in such systems. Driven by a single depth camera and a Kalman filter predictor, system efficacy is validated via simulation and experiments, quantifying success rates, latency, and coordination fidelity (e.g., error in maintaining the fixed grasping distance).

2 Related Work

Research in the domain of robotic catching in flight has established a common methodological pipeline (frese2001off), typically involving four sequential stages: 1) predicting the object’s trajectory based on sensor measurements, 2) selecting a feasible interception point and computing the corresponding desired robot posture 3) planning a time-critical motion to reach this posture for a successful catch, and 4) motion control to execute the planned trajectories. While this framework has been extensively explored for single-arm manipulators (bauml2010kinematically), the research landscape is evolving to address more complex platforms such as mobile manipulators (dong2020catch) and humanoids (kober2012playing; bauml2011catching), which necessitate whole-body coordination.

Methodologically, current approaches are broadly categorized into model-based optimal control and data-driven learning paradigms. Learning-based methods, ranging from imitation learning using human demonstrations to reinforcement learning (RL) (abeyruwan2023sim2real), can learn complex dynamics and adapt to real-world physical nuances (yu2021neural). For instance, kim2014catching utilized a learning-by-demonstration approach, combining probabilistic models for grasp configurations with dynamical systems (DS) to generate coordinated arm-hand motions for catching irregularly shaped objects. However, these methods are often requiring extensive data and may struggle with robustness when faced with significant distributional shifts beyond their training data (marwan2021comprehensive). On the other hand, model-based approaches, particularly Model Predictive Control (MPC), offer strong real-time performance, zero-shot adaptation to new targets, and explicitly constraints handling, although they are sensitive to modeling errors. A recent comprehensive study by abeyruwan2023agile compared a whole-body MPC controller with a blackbox learning policy, concluding that while the fine-tuned learning agent achieved higher peak performance, the MPC-based agent demonstrated substantially greater robustness to out-of-distribution throws.

While the aforementioned methods advance robotic agility, they predominantly focus on single-arm systems. Furthermore, to achieve a successful catch, these works often replace the robot’s standard gripper with a specialized, custom-designed end-effector, such as a Lacrosse head or a bespoke catching tool. Our work, in contrast, is motivated by the objective of using standard, commercial-off-the-shelf (COTS) manipulators with their original, general-purpose grippers. As these standard grippers are ill-suited for directly intercepting a ball, a practical alternative is a tool-based strategy: the manipulators grasp a separate, passive catcher (e.g., a container or net). However, stably manipulating such a tool at high accelerations is physically problematic for a single arm. Consequently, a cooperative bimanual grasp is necessary to ensure a stable, multi-point hold on the catcher. While this bimanual, tool-based approach enables robust, ”cradle-like” interception using only COTS hardware, it fundamentally transforms the control problem by introducing a rigid closed-chain constraint.

Our work addresses the challenging, yet less explored, problem of dynamic interception using two cooperative manipulators forming a closed kinematic chain. This configuration introduces fundamental difficulties rendering traditional pipeline suboptimal. Firstly, characterizing the cooperative task space and selecting an optimal catching pose a priori becomes largely intractable. The reachable workspace for the jointly held object is a coupled, constrained manifold (he2022distributed), not merely a union of individual workspaces. Furthermore, combined redundancy means numerous joint configurations can realize a target pose, each with vastly different dynamic implications, making predefined optimal pose selection ill-posed. Secondly, the closed-chain constraint imposes stringent coordination requirements for motion planning and execution, alongside practical deployment challenges. Aggressive interception maneuvers demand significant instantaneous power, frequently triggering power limit violation faults on collaborative platforms. This issue is particularly exacerbated in closed-chain systems, where slight misalignments or tracking errors induce internal forces, increasing power consumption. Explicitly modeling such complex dynamics, like power limits, as constraints within real-time NMPC frameworks remains computationally prohibitive.

To overcome these limitations, we propose a nonlinear MPC framework that holistically addresses the planning problem for cooperative, closed-chain interception. By operating within a safety envelope, our approach circumvents deriving complex analytical models of the cooperative workspace. Based on the predicted ball trajectory, a target selection algorithm identifies a desired interception pose prioritized for kinematic feasibility. This pose then serves as the target for the catcher within the NMPC, which subsequently optimizes the coordinated joint trajectories for both manipulators to maneuver the catcher to that pose. This implicitly resolves system redundancy within the optimization horizon based on task objectives and constraints, avoiding the difficult a priori selection of a single optimal system configuration. Furthermore, the introduced Adaptive-Terminal (AT) formulation dynamically adjusts control objectives, providing a practical means to balance rapid interception with the smooth, coordinated motion required to mitigate execution challenges like power limit violations inherent in such demanding dynamic closed-chain tasks.

3 Problem Formulation

This section delineates the problem of cooperative robotic ball catching. The system architecture and relevant preliminaries are established first, leading to a formal statement of the motion planning task.

3.1 System Overview

The focus of this work is the dynamic task of catching a thrown ball using a system of multiple coordinated robots, as conceptually illustrated in Fig. 1. The task involves a human agent or a mechanical thrower launching an object along a ballistic trajectory. A key challenge is the inherent uncertainty of this task: the trajectory is non-deterministic, featuring randomized initial velocities and launch angles that result in interception points within a predefined operational workspace. This variability necessitates a motion planner capable of real-time adaptation and continuous re-planning, rather than executing a static, pre-computed path. The robotic system, composed of two 7-DoF robots rigidly holding a catcher–which in this work takes the form of a container–must visually perceive the object’s flight, plan a suitable interception maneuver, and execute a coordinated motion to catch it before it lands.

Refer to caption
Figure 1: Conceptual overview of the robotic ball-catching task. A perception system observes the thrown ball’s trajectory, enabling the cooperative robotic system to plan and execute a coordinated in-flight interception.

To formally describe the system, we denote the joint position for the left and right manipulators as 𝒒l(t),𝒒r(t)7\bm{q}_{l}(t),\bm{q}_{r}(t)\in\mathbb{R}^{7}, respectively, stacked into the overall system configuration 𝒒(t)=[𝒒l(t)T,𝒒r(t)T]T14\bm{q}(t)=[\bm{q}_{l}(t)^{T},\bm{q}_{r}(t)^{T}]^{T}\in\mathbb{R}^{14}. Consequently, the system’s joint velocities and accelerations are given by 𝒒˙(t)\dot{\bm{q}}(t) and 𝒒¨(t)\ddot{\bm{q}}(t). The thrown ball’s state at time tt consists of its Cartesian position 𝒑b(t)3\bm{p}_{b}(t)\in\mathbb{R}^{3} and velocity 𝒗b(t)3\bm{v}_{b}(t)\in\mathbb{R}^{3}. The predicted velocity at interception, 𝒗b(tcatch)\bm{v}_{b}(t_{\text{catch}}), dictates the primary alignment requirement. This defines the impact direction unit vector 𝒅impact=𝒗b(tcatch)/𝒗b(tcatch)\bm{d}_{\text{impact}}=\bm{v}_{b}(t_{\text{catch}})/||\bm{v}_{b}(t_{\text{catch}})||, which, along with task-specific constraints (e.g., keeping the catcher opening upwards in this work), determines the desired catcher orientation represented by the unit quaternion ϕb\phi_{b}.

3.2 Preliminaries

The dynamics of each manipulator (i{l,r}i\in\{l,r\}) are governed by the standard rigid-body equation of motion cai2025mpc:

𝑴i(𝒒i)𝒒¨i+𝑪i(𝒒i,𝒒˙i)𝒒˙i+𝒈i(𝒒i)=𝝉i+𝝉ext,i\bm{M}_{i}(\bm{q}_{i})\ddot{\bm{q}}_{i}+\bm{C}_{i}(\bm{q}_{i},\dot{\bm{q}}_{i})\dot{\bm{q}}_{i}+\bm{g}_{i}(\bm{q}_{i})=\bm{\tau}_{i}+\bm{\tau}_{\text{ext},i} (1)

Here, 𝝉i7\bm{\tau}_{i}\in\mathbb{R}^{7} is the vector of actuator torques. The matrices 𝑴i\bm{M}_{i}, 𝑪i\bm{C}_{i}, and the vector 𝒈i\bm{g}_{i} are the configuration-dependent inertia, Coriolis/centrifugal, and gravity terms, respectively.

For real-time tractability, a hierarchical control structure is adopted, separating high-level kinematic motion planning from low-level dynamic control in (1). The NMPC motion planner (high-level) operates based on a simplified kinematic model, specifically a double integrator,

𝒒¨(t)=𝒖(t)\ddot{\bm{q}}(t)=\bm{u}(t) (2)

where the control input 𝒖(t)14\bm{u}(t)\in\mathbb{R}^{14} represents the joint accelerations. This formulation relies on two key assumptions: (i) a low-level controller can accurately track the reference accelerations, and (ii) the dynamics of the lightweight container are negligible.

As depicted in Fig. 2, we define a world frame {W}\{W\} and attach frames to the base {Bi}\{B_{i}\}, end-effector {Ei}\{E_{i}\} for each robot i{l,r}i\in\{l,r\}, and the catcher {O}\{O\}. The forward kinematics mapping, denoted by 𝒉i()\bm{h}_{i}(\cdot), relates the joint configuration 𝒒i\bm{q}_{i} of an arm (i{l,r}i\in\{l,r\}) to the Cartesian pose of its end-effector, 𝒙Ei=[𝒑EiT,ϕEiT]TSE(3)\bm{x}_{E_{i}}=[\bm{p}_{E_{i}}^{T},\bm{\phi}_{E_{i}}^{T}]^{T}\in SE(3), where 𝒑Ei\bm{p}_{E_{i}} is the position and ϕEi\bm{\phi}_{E_{i}} is the orientation (unit quaternion). The corresponding differential kinematics are given by 𝒙˙Ei=𝑱i(𝒒i)𝒒˙i\dot{\bm{x}}_{E_{i}}=\bm{J}_{i}(\bm{q}_{i})\dot{\bm{q}}_{i}, where 𝒙˙Ei\dot{\bm{x}}_{E_{i}} is the end-effector twist (linear velocities and quaternion rates) and 𝑱i(𝒒i)\bm{J}_{i}(\bm{q}_{i}) is the manipulator Jacobian.

The rigid grasp on the container forms a closed kinematic chain. Let the constant transformations from each end-effector frame to the catcher frame be 𝑻OEl{}^{E_{l}}\bm{T}_{O} and 𝑻OEr{}^{E_{r}}\bm{T}_{O}. This allows the catcher pose, 𝒙o=[𝒑oT,ϕoT]T\bm{x}_{o}=[\bm{p}_{o}^{T},\bm{\phi}_{o}^{T}]^{T}, to be expressed as a function of either arm’s joint configuration through a composite forward kinematics mapping, which we denote as 𝒉o()\bm{h}_{o}(\cdot). This relationship is established via the composition of homogeneous transformations:

𝑻OW=𝑻EiW(𝒒i)𝑻OEi{}^{W}\bm{T}_{O}={}^{W}\bm{T}_{E_{i}}(\bm{q}_{i})\cdot{}^{E_{i}}\bm{T}_{O} (3)

The mapping 𝒙o=𝒉o(𝒒i)\bm{x}_{o}=\bm{h}_{o}(\bm{q}_{i}) thus represents the forward kinematics from a single arm’s joint space to the commonly held catcher task space. As the catcher pose must be unique regardless of which arm is used for computation, this induces the fundamental closed-chain constraint that must hold throughout the motion:

𝒉o(𝒒l)=𝒉o(𝒒r)\bm{h}_{o}(\bm{q}_{l})=\bm{h}_{o}(\bm{q}_{r}) (4)

This equality couples the configurations of both arms and defines the feasible manifold where the system operates.

Refer to caption
Figure 2: Coordinate system definitions for the multi-robot experimental setup.

3.3 Problem Statement

The central problem is to compute a motion plan satisfying kinematic constraints and actuator limitations for the cooperative multi-arm system to intercept a thrown ball. Given the initial state (𝒒(t0),𝒒˙(t0))(\bm{q}(t_{0}),\dot{\bm{q}}(t_{0})) and the ball’s predicted trajectory (𝒑^b(t),𝒗^b(t))(\hat{\bm{p}}_{b}(t),\hat{\bm{v}}_{b}(t)), the task is formulated as an Optimal Control Problem (OCP). The objective is to find a control input 𝒖(t)=𝒒¨(t)\bm{u}(t)=\ddot{\bm{q}}(t) that minimizes a cost function JJ:

min𝒖(t)J=Φ(𝒙o(𝒒(tf)),𝒑^b(tf),ϕ^b(tf))+t0tfL(𝒒˙(t),𝒖(t))𝑑t\min_{\bm{u}(t)}J=\Phi(\bm{x}_{o}(\bm{q}(t_{f})),\hat{\bm{p}}_{b}(t_{f}),\hat{\bm{\phi}}_{b}(t_{f}))+\int_{t_{0}}^{t_{f}}L(\dot{\bm{q}}(t),\bm{u}(t))dt (5)

where the running cost L()L(\cdot) penalizes excessive joint velocities and accelerations to promote smooth motions. The terminal cost Φ()\Phi(\cdot) should serve as a heavily weighted soft constraint, penalizing the final discrepancy between the catcher pose and the desired interception pose to drive the system towards the goal configuration.

This optimization is subject to the following constraints:

  1. 1.

    System Dynamics: The system evolves according to the double integrator model 𝒒¨(t)=𝒖(t)\ddot{\bm{q}}(t)=\bm{u}(t), starting from the given initial state (𝒒(t0),𝒒˙(t0))(\bm{q}(t_{0}),\dot{\bm{q}}(t_{0})).

  2. 2.

    Actuator Limits: Joint position, velocity, and acceleration limits must be respected, e.g., 𝒒(t)[𝒒min,𝒒max]\bm{q}(t)\in[\bm{q}_{\min},\bm{q}_{\max}].

  3. 3.

    Closed-Chain Constraint: The coordination constraint 𝒉o(𝒒l(t))=𝒉o(𝒒r(t))\bm{h}_{o}(\bm{q}_{l}(t))=\bm{h}_{o}(\bm{q}_{r}(t)) must be satisfied.

  4. 4.

    Workspace and Self-Collision Constraints: Motion must remain within a predefined safe workspace. Dedicated constraints ensuring self-collision avoidance (e.g., minimum distance between links) must be satisfied.

Employing a kinematic planner based on Eq. (2) offers significant advantages in computational tractability, enabling the real-time performance crucial for dynamic interception tasks. However, the application of this kinematic planning approach within the demanding context of dynamic interception using an inherently stiff, closed-chain system presents significant challenges, particularly as these systems are often rendered dynamically stiff by the high-gain low-level controllers required for precise trajectory tracking. This inherent stiffness means even minor discrepancies between the arms or deviations from the planned path can induce large internal forces. Consequently, there is a heightened risk of triggering actuator power limit violations–critical safety features inherent in most commercial collaborative robots designed to halt motion upon detecting excessive loads or potential collisions. Since incorporating full dynamic constraints to prevent this is often intractable for real-time OCP, a specialized motion planning approach is required to balance rapid task execution with these intrinsic system limitations, as detailed in the following section.

4 Proposed Motion Planner

To addressing the closed-chain coordination and dynamic feasibility challenges outlined in Section 3.3, we propose the motion planning framework depicted in Fig. 3. This hierarchical system consists of three main stages: (i) ball trajectory estimation and target point selection, processing visual input to determine a feasible interception pose; (ii) the core nonlinear MPC-based motion planner, generating coordinated joint trajectories towards this target; and (iii) the low-level control layer, responsible for real-time trajectory execution using joint PD controllers. This modular design advantageously decouples planning from execution. Section IV-A details the perception stage, followed by the proposed MPC formulation in Section IV-B.

Refer to caption
Figure 3: The architecture of the system in the form of diagram

4.1 Ball Trajectory Estimation and Target Point Selection

The perception pipeline initiates by processing RGB-D images via a YOLOv8-based object detector to obtain initial estimates of the ball’s 3D position. Robustness against high-speed motion and diverse conditions was ensured through tailored training (details in supplementary material). This stage yields a stream of time-stamped, potentially noisy, 3D position measurements of the ball in the world frame.

These position measurements are subsequently processed by a Kalman filter. Employing a discrete-time projectile motion model (including gravity and process noise for unmodeled dynamics), the filter provides smoothed estimates of the ball’s full state (position and velocity). Crucially, these current state estimates serve as the initial condition for predicting the ball’s future flight trajectory, denoted by 𝒙^b(t)=[𝒑^b(t)T,𝒗^b(t)T]T6\hat{\bm{x}}_{b}(t)=[\hat{\bm{p}}_{b}(t)^{T},\hat{\bm{v}}_{b}(t)^{T}]^{T}\in\mathbb{R}^{6}, by iteratively propagating the motion model forward. This predicted trajectory serves as the input for the subsequent target point selection.

From the predicted trajectory, a target point selection algorithm identifies a feasible interception pose 𝒑tg\bm{p}_{\text{tg}} for the subsequent MPC planner. This involves filtering points within a safe workspace envelope 𝒮\mathcal{S} and selecting the candidate that minimizes the required average velocity for interception, thus balancing responsiveness and kinematic feasibility111Detailed algorithm and source code available at:
https://github.com/Ashray4/nmpc_ada_ballCatching

4.2 Motion Planning

4.2.1 Primitive MPC Formulation (PT)

Based on (2), we define the state vector 𝒛(k)=[𝒒(k)T,𝒒˙(k)T]T2n\bm{z}(k)\!=\![\bm{q}(k)^{T},\dot{\bm{q}}(k)^{T}]^{T}\in\mathbb{R}^{2n} and the input vector 𝒖(k)=𝒒¨(k)n\bm{u}(k)=\ddot{\bm{q}}(k)\in\mathbb{R}^{n}, including all n=14n=14 joints. Discretized with step TsT_{s}, system dynamics is described by:

𝒛(k+1)=[𝑰nTs𝑰n𝟎n𝑰n]𝒛(k)+[Ts22𝑰nTs𝑰n]𝒖(k)\bm{z}(k+1)=\begin{bmatrix}\bm{I}_{n}&T_{s}\bm{I}_{n}\\ \bm{0}_{n}&\bm{I}_{n}\end{bmatrix}\bm{z}(k)+\begin{bmatrix}\frac{T_{s}^{2}}{2}\bm{I}_{n}\\ T_{s}\bm{I}_{n}\end{bmatrix}\bm{u}(k) (6)

Based on this model, we first formulate a basic planner, which prioritizes reaching the target interception pose 𝒙tg=[𝒑tgT,ϕtg]T\bm{x}_{\text{tg}}=[\bm{p}_{\text{tg}}^{T},\bm{\phi}_{\text{tg}}]^{T} with high precision. This objective is primarily encoded in a heavily weighted terminal cost. To ensure the motion is smooth and physically feasible, the formulation also includes running costs that penalize excessive joint velocities and accelerations. This leads to the following MPC optimization problem:

min{𝒖(k),𝒛(k)}k=0N1JPT=𝒑o(N)𝒑tg𝑷e2+ϕo(N)ϕtg𝑶e2+k=0N1(𝒖(k)𝑹2+𝒒˙(k)𝑾2)\min_{\{\bm{u}(k),\bm{z}(k)\}_{k=0}^{N-1}}J_{\text{PT}}=\|\bm{p}_{o}(N)-\bm{p}_{\text{tg}}\|^{2}_{\bm{P}_{e}}+\|\bm{\phi}_{o}(N)-\bm{\phi}_{\text{tg}}\|^{2}_{\bm{O}_{e}}\\ +\sum_{k=0}^{N-1}\left(\|\bm{u}(k)\|^{2}_{\bm{R}}+\|\dot{\bm{q}}(k)\|^{2}_{\bm{W}}\right) (7)

subject to:

Eq. (6)\displaystyle\text{Eq. }(\ref{eq:mpc_ss_model})
ho(𝒒l(k))=ho(𝒒r(k))=xo\displaystyle h_{o}(\bm{q}_{l}(k))=h_{o}(\bm{q}_{r}(k))=x_{o}
𝒒min𝒒(k)𝒒max\displaystyle\bm{q}_{\min}\leq\bm{q}(k)\leq\bm{q}_{\max}
𝒒˙min𝒒˙(k)𝒒˙max\displaystyle\dot{\bm{q}}_{\min}\leq\dot{\bm{q}}(k)\leq\dot{\bm{q}}_{\max}
𝒖min𝒖(k)𝒖max\displaystyle\bm{u}_{\min}\leq\bm{u}(k)\leq\bm{u}_{\max}

where ϕo(N)ϕtg𝑶e2\|\bm{\phi}_{o}(N)-\bm{\phi}_{\text{tg}}\|^{2}_{\bm{O}_{e}} represents the weighted squared final orientation error, calculated using a suitable metric for quaternion difference. The large terminal weights 𝑷e\bm{P}_{e} and 𝑶e\bm{O}_{e} drive the optimizer to aggressively minimize the final pose error, while the running cost weights 𝑹\bm{R} and 𝑾\bm{W} regulate the control effort profile throughout the trajectory.

4.2.2 Adaptive-Terminal MPC Formulation (AT)

The preceding primitive MPC, while effective, utilizes fixed, large terminal weights that can cause aggressive accelerations, velocity overshoots, and potential power limit violations in the stiff closed-chain system. To enhance dynamic feasibility while retaining responsiveness, we propose the Adaptive-Terminal (AT) formulation featuring a cost shaping approach. AT adaptively modulates both the terminal cost weights (cycle-to-cycle) and the stage cost weights (step-by-step within the horizon based on error). This dynamic adjustment allows the planner to balance endpoint precision with trajectory smoothness, leading to the following optimization problem:

min{𝒖(k),𝒛(k)}k=0N1JAT=𝒑o(N)𝒑tg𝑷eadp2+ϕo(N)ϕtg𝑶eadp2+k=0N1(𝒑o(k)𝒑tg𝑸posadp(k)2+ϕo(k)ϕtg𝑸oriadp(k)2)+k=0N1(𝒖(k)𝑹2+𝒒˙(k)𝑾2)\!\!\!\!\!\!\!\min_{\{\bm{u}(k),\bm{z}(k)\}_{k=0}^{N-1}}\!\!J_{\text{AT}}\!=\!\|\bm{p}_{o}(N)-\bm{p}_{\text{tg}}\|^{2}_{\bm{P}_{e}^{\text{adp}}}+\|\bm{\phi}_{o}(N)-\bm{\phi}_{\text{tg}}\|^{2}_{\bm{O}_{e}^{\text{adp}}}\\ +\sum_{k=0}^{N-1}\left(\|\bm{p}_{o}(k)-\bm{p}_{\text{tg}}\|^{2}_{\bm{Q}_{pos}^{\text{adp}}(k)}+\|\bm{\phi}_{o}(k)-\bm{\phi}_{\text{tg}}\|^{2}_{\bm{Q}_{ori}^{\text{adp}}(k)}\right)\\ +\sum_{k=0}^{N-1}\left(\|\bm{u}(k)\|^{2}_{\bm{R}}+\|\dot{\bm{q}}(k)\|^{2}_{\bm{W}}\right) (8)

subject to the same constraints as in (7).

The core of the AT strategy lies in adaptively shaping the cost function weights based on the error relative to the target pose 𝒙tg=[𝒑tgT,ϕtgT]T\bm{x}_{\text{tg}}=[\bm{p}_{\text{tg}}^{T},\bm{\phi}_{\text{tg}}^{T}]^{T}. Both the terminal and stage cost weights employ a shared adaptive law defined by:

W(e;Wmin,Wmax,ϵ)=Wmin+(WmaxWmin)ee+ϵW(e;W^{\min},W^{\max},\epsilon)=W^{\min}+(W^{\max}-W^{\min})\cdot\frac{e}{e+\epsilon} (9)

where ee is the relevant error magnitude (position epos=𝒑o(k)𝒑targete_{pos}=\|\bm{p}_{o}(k)-\bm{p}_{\text{target}}\| or orientation eorie_{ori} based on quaternion difference), [Wmin,Wmax][W^{\min},W^{\max}] defines the modulation range for the specific weight, and ϵ>0\epsilon>0 is a sensitivity parameter controlling the transition steepness.

The application of this law differs for stage and terminal weights:

  • Stage Weights (𝑸posadp(k),𝑸oriadp(k)\bm{Q}_{pos}^{\text{adp}}(k),\bm{Q}_{ori}^{\text{adp}}(k)): These are updated at each prediction step kk (k=0,,N1k=0,\dots,N-1) within the horizon. Their values are computed via Eq. (9) within their respective ranges [𝑸min,𝑸max][\bm{Q}^{\min},\bm{Q}^{\max}] using the predicted error e(k)e(k) at that specific future step kk.

  • Terminal Weights (𝑷eadp,𝑶eadp{\bm{P}_{e}^{\text{adp}}},{\bm{O}_{e}^{\text{adp}}}): These are determined once at the beginning of each MPC cycle, using Eq. (9) based on the initial state error e(0)e(0) of the current cycle. The resulting values then remain constant throughout the entire prediction horizon within this cycle.

Crucially, the terminal weights operate over a significantly wider modulation range than the stage weights. This design ensures that large initial errors e(0)e(0) result in dominant terminal costs, promoting aggressive convergence. Conversely, as the system approaches the target (small e(0)e(0) and e(k)e(k)), the pronounced reduction in terminal weights increases the relative influence of the running costs (regulating 𝒖𝑹2\|\bm{u}\|^{2}_{\bm{R}}, 𝒒˙𝑾2\|\dot{\bm{q}}\|^{2}_{\bm{W}}, and stage errors via 𝑸adp(k)\bm{Q}^{\text{adp}}(k)). This shift naturally favors smoother motion during the final approach, improving feasibility.

5 Experimental evaluation

5.1 Overview

Experiments were conducted using two 7-DoF Franka manipulators holding a custom 3D-printed container, observed by an extrinsically calibrated Intel RealSense D455 camera (Fig. 4). A supplementary video is available222https://youtu.be/LJLC-HzX3G4?si=D9Glj3p1mu6e2AW4.

Refer to caption
Figure 4: The experimental setup, showcasing the dual Franka Emika Panda manipulators, the cooperative catching container, and the perception system.
Refer to caption
Figure 5: Visualization of a successful ball catch using the AT-MPC framework. ’Measured Ball Pos’ (blue markers) are raw vision system detections, filtered into the ’KF Estimate’ trajectory (orange line). The Kalman filter generates future path predictions: an early prediction potentially outside the workspace (’KF Prediction (Early/Out)’, red line) and a later ’KF Prediction (Valid)’ (green dashed line) intersecting the ’Safe Zone’ (black cube). This valid prediction determines the ’Target Pose’ (red circle) for the container center. The ’Container Trajectory’ (bold blue line) shows the executed path of the container center, starting from ’Container Start Pose’ (black circle), successfully intercepting the ball.
Refer to caption
Figure 6: Close Up view

MPC-based motion planner and control algorithms executed on a PC (AMD Ryzen Threadripper 3960X, Nvidia RTX A5000) running Ubuntu 22.04 with PREEMPT_RT patch. The system utilizes ROS 2 Humble and ros2_control for dual-arm coordination. Perception, MPC planning, and low-level PD control modules communicate via ROS 2 interfaces.

The MPC planner operates at 25 Hz (Ts=40msT_{s}=40\,\text{ms}), formulated in CasADi andersson2019casadi and solved by IPOPT nocedal2009adaptive with the MA57 solver. It outputs joint references to 1 kHz PD controllers on the hardware. Code generation feature was employed to enhance execution performance.

5.2 Experimental Results and Discussions

5.2.1 Real-World Catching Demonstration

To evaluate the performance of the overall framework on the studied system, we conducted a series of 35 consecutive real-world throws using the proposed AT-MPC framework. Out of these attempts, the system successfully intercepted 13 throws (37.1%37.1\% success rate), and 3 attempts (8.6%8.6\%) were automatically aborted due to estimated actuator power limit violations. The remaining failures were primarily attributed to external factors, such as camera perception instability or throws landing outside the predefined safe interception zone.

To demonstrate the efficacy and real-time feasibility of the proposed MPC framework, a representative successful catching trial using the AT mode is presented. Fig. 5 visualizes the trajectory generation process during this trial. Initially, predictions from the Kalman filter may fall outside the predefined safe workspace due to the limited number of early measurements. Once the predicted trajectory intersects this designated safe zone, the Point Selection Algorithm determines a feasible interception target pose ptgp_{tg}. This target is then supplied to the planner, which computes kinematically feasible joint reference trajectories {qd(k),q˙d(k)}k=0N1\{q_{d}(k),\dot{q}_{d}(k)\}_{k=0}^{N-1} for both manipulators.

The resulting motion execution is shown by the actual catcher trajectory (blue line). Fig. 6 offers a close-up view, highlighting the close spatial correspondence between the MPC-generated reference and the trajectory executed by the low-level PD controllers. Qualitatively, this minimal deviation between the planned path and the executed path in 3D space indicates effective tracking performance by the low-level PD control, validating the overall hierarchical control structure in Fig. 3. Minor observed spatial discrepancies can be primarily attributed to factors including the inherent limitations of PD control in perfectly tracking high-speed, dynamic references, potential model inaccuracies, and communication latencies within the ROS 2 framework.

Crucially, manipulator coordination and closed-chain constraint satisfaction were verified. To quantify this, we measured the Euclidean distance between the two end-effector grasping points throughout the experiments. Theoretically, this distance should remain constant at its nominal value, dnomd_{\text{nom}}, as our NMPC formulation explicitly enforces (4) required by the rigid grasp. The experimental deviation from dnomd_{\text{nom}} remained minimal throughout the maneuver, averaging only 2.8mm with a worst-case of 6.1mm. Minor fluctuations are likely attributed to contact compliance and low-level tracking synchronization effects, not a constraint violation within the planner itself.

Furthermore, real-time computational performance was validated and shown in Fig. 7. The average MPC cycle computation time was 19.21ms19.21\,\text{ms}, significantly below the sampling time TsT_{s}. The worst-case computation time (excluding potential initial overhead) also remained comfortably within this budget, demonstrating the planner’s real-time feasibility and capability for consistent plan updates during the dynamic task.

5.2.2 Comparative Analysis of MPC Modes (AT vs. PT)

During extensive real-world catching trials, a notable limitation was observed when employing the Primitive-Terminal (PT) mode in (7). In scenarios demanding agile and significant configuration adjustments, the PT planner frequently generated trajectories that triggered exceptions related to estimated power limit violations within the Franka Control Interface (FCI)333https://github.com/frankarobotics/libfranka/issues/143, leading to task abortion. This phenomenon was significantly less prevalent in trials using the AT mode. We hypothesize that the PT mode’s fixed, aggressive penalization of terminal state errors can lead the MPC to compute overly assertive joint acceleration commands q¨(k)\ddot{q}(k), particularly early in the trajectory when errors are large. As our MPC formulation is based on a kinematic model in (6) without explicit consideration of dynamic constraints (e.g., torque or power limits), these commands, when executed by the low-level controllers, may exceed hardware capabilities, especially in certain dynamically challenging configurations.

To quantitatively investigate this phenomenon and compare the intrinsic performance characteristics of the AT and PT planners in a controlled setting, we conducted a comparative analysis in simulation. This approach ensures repeatability by allowing identical throwing conditions (represented as start and target pose pairs for the planner) for both modes, while also isolating the planner’s performance from hardware noise, communication latencies, and low-level control artifacts. For this purpose, N=500N=500 trials were executed, each with randomly generated start and target poses within the workspace, ensuring that for each trial, both AT and PT modes received the exact same planning task.

Control effort, defined as the sum of squared joint accelerations over the entire duration of the planned trajectory (i.e., from start to target interception): E=k=0Ntraj𝒒¨(k)22E=\sum_{k=0}^{N_{traj}}\|\ddot{\bm{q}}(k)\|_{2}^{2}, was evaluated as a primary metric, serving as a proxy for energy consumption and the likelihood of triggering power limits. Fig. 8 presents the average control effort computed over 500 simulated trials for both the AT and PT modes, initiated from randomized start/target poses within the workspace.

The results indicate that the AT mode required significantly lower average control effort compared to the PT mode for both manipulators. This quantitative finding supports our hypothesis and real-world observations regarding power limit exceptions. The adaptive weighting mechanism inherent to the AT mode in (8) tends to produce smoother acceleration profiles by moderating commands based on the error magnitude, thereby reducing peak actuator demands compared to the fixed, aggressive terminal penalization of the PT approach in (7).

Refer to caption
Figure 7: Real-time performance: End-to-end computation times per AT-MPC cycle during a real-world experiment (cf. Fig. 5). Includes full node processing from state input to reference output for the MPC node. Average (19.21ms19.21\,\text{ms}) and worst-case times remained well below Ts=40msT_{s}=40\,\text{ms}.
Refer to caption
Figure 8: Average control effort (E=𝒒¨(k)22E=\sum\|\ddot{\bm{q}}(k)\|_{2}^{2}) for AT vs. PT modes across 500 trials. The AT mode demonstrates significantly reduced effort for both manipulators.

Fig. 9 qualitatively compare of the Cartesian X-trajectory profiles from a representative trial for the PT mode alongside AT modes tuned for aggressive, balanced, and smooth responses. The specific parameter configurations differentiating these modes are detailed in Table 1. As hypothesized, the PT mode and the aggressively tuned AT mode (approximating PT’s high terminal weights) exhibit rapid convergence but suffer from considerable overshoot. In contrast, a balanced or smoother AT weighting yields a more gradual approach with significantly reduced overshoot. Specifically, the AT Smooth configuration utilizes a near-zero adaptive terminal weight range, effectively minimizing the influence of the terminal cost terms in (8). This approximates a scenario prioritizing trajectory smoothness, primarily governed by stage costs, over strict terminal accuracy. This visually demonstrates the flexibility afforded by the AT formulation in tuning the trade-off between convergence speed and trajectory smoothness via its adaptive parameters, particularly the range [Wmin,Wmax][W_{\min},W_{\max}] governing the stage cost adaptation.

Table 1: Parameter configurations differentiating the MPC modes compared in Fig. 9. PT uses fixed terminal weights. AT variants differ in their adaptive terminal weight ranges. All AT variants share the same adaptive stage weight ranges: Position (Qpos,maxadp,Qpos,minadp)=(2.0,0.1)(Q_{pos,\text{max}}^{adp},Q_{pos,\text{min}}^{adp})=(2.0,0.1) and Orientation (Qori,maxadp,Qori,minadp)=(1.0,0.1)(Q_{ori,\text{max}}^{adp},Q_{ori,\text{min}}^{adp})=(1.0,0.1).
MPC Mode Parameters
PT Pe=500P_{e}=500, Oe=10O_{e}=10
AT Aggressive
(Pe,maxadp,Pe,minadp)=(500,100)(P_{e,\text{max}}^{adp},P_{e,\text{min}}^{adp})=(500,100)
(Oe,maxadp,Oe,minadp)=(10,2)(O_{e,\text{max}}^{adp},O_{e,\text{min}}^{adp})=(10,2)
AT Balanced
(Pe,maxadp,Pe,minadp)=(50,20)(P_{e,\text{max}}^{adp},P_{e,\text{min}}^{adp})=(50,20)
(Oe,maxadp,Oe,minadp)=(10,2)(O_{e,\text{max}}^{adp},O_{e,\text{min}}^{adp})=(10,2)
AT Smooth
(Pe,maxadp,Pe,minadp)=(0.001,0)(P_{e,\text{max}}^{adp},P_{e,\text{min}}^{adp})=(0.001,0)
(Oe,maxadp,Oe,minadp)=(0.001,0)(O_{e,\text{max}}^{adp},O_{e,\text{min}}^{adp})=(0.001,0)

Finally, the computational efficiency of both modes was compared over the 500 simulation trials (Fig. 10). The AT formulation’s adaptive logic introduced minimal computational overhead; its average time was only marginally higher than the PT’s, and the worst-case times were comparable (39ms39\,\text{ms} vs 38ms38\,\text{ms}) . As illustrated in Fig. 11, the AT mode also demonstrated high consistency, with both its average and worst-case cycle times remaining well below the Ts=40msT_{s}=40\,\text{ms} across all 500 scenarios.

These simulation results confirm that both modes meet the real-time constraints. Crucially, they demonstrate that the additional complexity of the AT mode introduces minimal computational overhead, reinforcing its practical applicability as demonstrated in the physical experiments in Section 5.2.1.

Refer to caption
Figure 9: Comparison of Cartesian X-axis trajectory profiles for PT and differently tuned AT modes (Aggressive, Balanced, Smooth) in a representative simulation. Parameters are detailed in Table 1.
Refer to caption
Figure 10: Computation time statistics over 500 trials. The bars display average (left) and worst-case (right) MPC cycle times for AT and PT modes. All times remain well below the 40ms40\,\text{ms} sampling limit.
Refer to caption
Figure 11: Computation time distributions for AT (Balanced) mode across 500 trials. Top/Bottom: Per-trial average (blue) and worst-case (green) cycle times with aggregate box plots. All instances remain within the 40ms40\,\text{ms} real-time limit (dashed line).

6 CONCLUSIONS

This paper presented and experimentally validated an MPC-based motion planner, demonstrating the effectiveness of an Adaptive-Terminal (AT) formulation for cooperative ball catching by two manipulators under closed-chain constraints. Compared to a baseline Primitive-Terminal (PT) approach, the AT strategy significantly reduced control effort and mitigated actuator power limit violations, yielding smoother trajectories with minimal computational overhead, confirmed by average cycle times well below the 40ms40\,\text{ms} sampling period. Future work will focus on enhancing robustness and success rates through a learning-augmented hierarchical framework, training policies to predict optimal interception poses directly from observations, aiming for improved overall system performance in dynamic cooperative tasks.