Skip to main content

Software framework

This page describes how the stack runs end-to-end: the ros2_control cycle (50 Hz on real hardware, 200 Hz in MuJoCo), the five-mode finite state machine that arbitrates which controller is active, the in-process vs. out-of-process policy tiers, the safety / fallback model, and the two-machine deployment topology that decides which of those processes lives on the robot vs. on the operator workstation.

Module dependency overview

Before diving into the runtime, here's the static picture — which packages build against which:

Module dependency graph

Notice that bar_controllers does not find_package(bar_robstride). The plugin is loaded by controller_manager at launch via pluginlib — a runtime dep that doesn't appear in the static graph but is just as binding. The same applies to every <plugin> entry in a controller-manager YAML.

The ros2_control cycle

ros2_control is the integration spine. controller_manager owns the real-time loop. Every tick — 50 Hz on Lite real hardware, 200 Hz in MuJoCo — it performs three steps:

ros2_control RT cycle

Constraints inside each phase:

PhaseWhat's allowedWhat's forbidden
read()swap lock-free buffer pointers, copy small PODsyscalls, allocations, DDS waits
update()read state_interfaces_, write command_interfaces_, lock-free trylock for diag publishersallocations, blocking, exceptions across the RT boundary
write()stage frames into the bus library's outgoing queuethe actual CAN/EtherCAT syscall (that's the I/O thread's job)

The I/O thread in each hardware plugin (bar_socketcan::SocketCanBus, ethercat_driver_ros2's EtherLAB master thread) is separate from the controller-manager thread. RT-safety is preserved by making read() / write() allocation-free buffer swaps.

Where the data actually lives

Zooming in on one tick — the path a CAN frame takes from the kernel into a controller's update() and back:

RT data pipeline

The dashed red line is the RT boundary. Anything that crosses it goes through a lock-free SPSC ring — the RT thread can read or stage frames without ever touching the kernel socket directly. The I/O thread does the blocking epoll_wait and decodes/encodes frames on its own pace.

Calibration (direction, homing_offset) is applied inside the plugin's read() and write(), so every controller above sees joint frame, never the raw encoder. See Calibration math for the formula.

Why MIT-mode lives in the hardware plugin, not the controller

The torque computation tau = K_p (q_cmd - q) + K_d (dot q_cmd - dot q) + tau_ff runs on the Robstride motor firmware (real hardware) or on MuJoCo's qfrc_applied step (sim). The controller just writes five numbers per joint per tick. This is the same factoring used by MIT Cheetah / Mini Cheetah and by Berkeley's earlier Humanoid-Control deployment.

Five-mode finite state machine

The whole control surface boils down to one active controller at a time, selected by mode_manager. joint_state_broadcaster runs alongside as the always-on state stream.

Five-mode FSM

Behavior per state:

StatePluginWhat it writes
ZERO_TORQUEbar/ZeroTorqueController0 to all 5 cmd interfaces. Startup default, fault fallback.
DAMPINGbar/DampingControllerK=0, D=damping value, q_cmd=q_captured — soft under gravity, resists velocity.
STANDBYbar/StandbyControllerLinear pose interpolation through a YAML sequence; ramps K_p / K_d on first segment. Publishes StandbyState with is_finished.
LOCOMOTIONbar/RLPolicyControllerIn-process ONNX inference; YAML-driven obs packing + action mapping.
REMOTEbar/RemotePolicyControllerThin executor for an out-of-process Python policy; subscribes ~/command (MITCommand) with stale-command gating.

Transition mechanics

Every transition is one switch_controller service call to the controller_manager (STRICT strictness, async). The mode_manager node is a plain rclcpp::Node that subscribes:

  • /joy (gamepad intents; on by default — bringup hard-fails if /dev/input/js* is missing unless you opt out with enable_gamepad:=false)
  • /standby_controller/state (the is_finished gate for the two START_* intents)
  • /safety_status (the auto-DAMP trigger)

…and exposes five std_srvs/Trigger services so transitions can also be driven from the command line:

  • /bar/mode/damp, /bar/mode/load, /bar/mode/start_remote, /bar/mode/start_locomotion, /bar/mode/quit

/control_mode is published at 50 Hz. The manager polls list_controllers periodically (every 25 ticks = 500 ms) so controllers loaded after the first poll become visible to dispatch_intent without the operator having to re-trigger.

Two parallel policy tiers

The "active policy" mode comes in two flavors that share the exact same observation/action contract:

Two policy tiers

Key property: the Python ObservationManager mirrors the C++ one structurally — same term names (JointPositionTerm, JointVelocityTerm, LastActionTerm, ImuFieldTerm, ReferenceProviderTerm), same scaling convention out = (q - q_default) * scale, same flat-ndarray observation contract. A policy debugged in Python can be promoted to C++ without observation-indexing drift.

The Python ObservationManager also exposes two lifecycle hooks that the C++ side will eventually mirror:

  • reset() — clear per-term state (e.g. LastActionTerm zero-init) and rewind any attached reference provider. Called on controller activation.
  • record_action(action) — refresh LastActionTerm with the action the runner just emitted, then step the reference provider so the dataset frame advances exactly once per policy tick.

Today's out-of-process tier is the real ONNX path: bar_policy's remote_policy_runner loads an ONNX checkpoint, parses self-describing metadata baked into the file (joint names, gains, default pose, dataset pointer), and publishes MITCommand to RemotePolicyController. The piano task ships its own subclass (pianist_policy.PianoPolicyRunner) that adds a PianoSongTermProvider for the song-goal lookahead. The in-process RLPolicyController is still wired to a ConstantHoldPolicy stub — same metadata schema will land in C++ when locomotion graduates to in-process inference. See Policy runner.

MITState itself is a code-level schema (a bar::MITState POD in C++, a matching @dataclass in Python). It is not a published topic — observations are assembled in-process from /lite/joint_states (the always-on broadcaster) and /imu/data (the IMU driver).

Frozen schemas

A handful of artifacts are frozen once a trained policy depends on them:

ArtifactFrozen because
bar_msgs/MITCommandtrained policies emit this field-by-field over DDS
Joint order in bar_*_controllers.yamltrained policies index into this order
MITState struct + Python dataclassboth sides agree on joint_position/joint_velocity/IMU layout
Observation term scale + default vectorsshifts mean retraining

Once a policy ships to a piano-playing or locomotion run, changing any of these forces retraining. Keep this in mind when refactoring.

Safety and fault handling

Safety is layered — no single ROS node is treated as the whole safety system:

Concrete examples:

  • A Robstride bus-off → bar_robstride publishes SafetyStatus{level=FAULT, source="bar_robstride/can0", flags=BUS_OFF}mode_manager requests a STRICT switch to DAMPING. If DAMPING fails (e.g. command interfaces unavailable), mode_manager falls back to ZERO_TORQUE.
  • A RemotePolicyController whose Python publisher stalls for >100 ms (stale_command_timeout_ms default) writes passive commands (zero stiffness/damping) by default, or zero-order-holds the last command if stale_command_policy: hold is set. Staleness is measured against arrival time at the subscription callback, not against MITCommand.header.stamp, so publisher clock skew is irrelevant.
  • An RL policy returning NaN in its action vector → RLPolicyController detects via bar::rt::all_finite(...) and returns return_type::ERROR, triggering fallback_controllers in the CM YAML.

Deployment topology

The shipping configuration is a two-machine tethered split. The same colcon workspace is installed (and built from the same pixi lock file) on both machines; each launch boots only the subset of nodes that belongs on its side. Single-machine sim/dev paths (bar_bringup_lite/mujoco.launch.py, bar_description_lite/view_lite.launch.py, bar_bringup_lite/calibrate.launch.py) are unaffected — they collapse both sides into one process tree.

Launches come from two sibling repos: bar_ros2 ships every Lite/Prime control-plane and tracking-policy launch; pianist_ros2 ships the piano-task-specific launches.

SideMachineLaunchWhat lives here
RobotOnboard computer (RT kernel, wired tether)bar_bringup_lite/launch/real.launch.py (bar_ros2)ros2_control_node, bar_robstride / bar_sito hardware plugins, joint_state_broadcaster, the five FSM controllers (zero_torque / damping / standby / rl_policy / remote_policy), mode_manager, joy_node, robot_state_publisher, IMU driver
HostOperator workstationbar_bringup_lite/launch/viz.launch.py (bar_ros2)viser_viz or rerun_viz (selected by viewer:=)
HostOperator workstationbar_policy/launch/lite_policy.launch.py (bar_ros2)Tracking-family ONNX runner (onnxruntime, W&B / HF Hub deps); task:=tracking|reach|residual
HostOperator workstationpianist_policy/launch/piano_policy.launch.py (pianist_ros2)Piano-task ONNX runner (piano_policy_runner, a remote_policy_runner subclass)
HostOperator workstationpianist_policy/launch/midi_keyboard_driver.launch.py (pianist_ros2)Real USB-MIDI keyboard driver (publishes /piano/key_state to the local-host piano runner — loopback, does not cross the tether)

What crosses the tether

Only DDS topics, never controller-manager service calls.

TopicDirectionQoSRate / sizeNotes
/robot_descriptionrobot → hostRELIABLE + TRANSIENT_LOCAL~kB, latchedURDF tree (no meshes — host has its own install share).
/lite/joint_statesrobot → hostRELIABLE50 Hz, ~14 floats × 3Viewer + policy runner input.
/imu/datarobot → hostRELIABLEsensor-ratePolicy runner observation input.
/control_moderobot → hostRELIABLE50 HzFSM telemetry for operator dashboards.
/remote_policy_controller/commandhost → robotRELIABLE depth 450 Hz, ~280 BThe single RT-adjacent host→robot stream. RemotePolicyController uses arrival-time staleness, not header.stamp.
/tfrobot → hostRELIABLE50 HzRSP fanout — viewers consume.

/joy (gamepad) and /safety_status are intentionally onboard-only: both go straight into mode_manager (loopback) so the safety path never depends on the tether. /piano/key_state is host-local (MIDI driver → piano runner), and /piano/target_keys is the runner's own re-publish, also host-local — neither crosses the tether.

Why this split (the three judgment calls)

  • Gamepad on the robot. DAMP and QUIT are the operator's safety affordance. Routing /joy over DDS across the tether means a flaky link can suppress an e-stop. Every legged-RL deployment this project mirrors (legged_control2, instinct_onboard, the earlier Humanoid-Control stack) keeps the gamepad onboard. Use a USB extension or a wireless dongle plugged into the onboard computer, not into the host laptop.
  • mode_manager on the robot. It calls /controller_manager/switch_controller (a service local to CM) and consumes /safety_status from the per-bus hardware plugins. Placing it onboard makes switch-controller, safety auto-DAMP, and /joy consumption all loopback — zero cross-machine latency in the safety path.
  • robot_state_publisher on the robot. RSP is a pure transform fanout. Putting it onboard means /robot_description (latched) and /tf originate at one address; host-side viewers subscribe over the wire. Bandwidth is small (kinematic tree, not point clouds).

Network assumptions

  • Wired Ethernet tether only. WiFi as the operator-to-robot link is explicitly not supported; if the requirement appears, gamepad-on-robot is doubly justified and DDS QoS needs separate tuning.
  • Both machines run the same ROS_DOMAIN_ID and (recommended) the same RMW_IMPLEMENTATION. Cyclone DDS is the recommended pick; pin the network interface in the XML config so discovery doesn't leak onto management NICs.

AGENTS.md §"Deployment topology" carries the canonical version of this split (with an ASCII process diagram); this page reflects the deployed surface.

Next