Software framework
This page describes how the stack runs end-to-end: the
ros2_control cycle (50 Hz on real hardware, 200 Hz in MuJoCo), the
five-mode finite state machine that arbitrates which controller is
active, the in-process vs. out-of-process policy tiers, the safety /
fallback model, and the two-machine deployment topology that
decides which of those processes lives on the robot vs. on the
operator workstation.
Module dependency overview
Before diving into the runtime, here's the static picture — which packages build against which:
Notice that bar_controllers does not find_package(bar_robstride).
The plugin is loaded by controller_manager at launch via pluginlib — a
runtime dep that doesn't appear in the static graph but is just as binding.
The same applies to every <plugin> entry in a controller-manager YAML.
The ros2_control cycle
ros2_control is the integration spine. controller_manager owns the
real-time loop. Every tick (50 Hz on Lite real hardware, 200 Hz in
MuJoCo) it performs three steps in order: read(), update(), write().
Constraints inside each phase:
| Phase | What's allowed | What's forbidden |
|---|---|---|
| read() | swap lock-free buffer pointers, copy small POD | syscalls, allocations, DDS waits |
| update() | read state_interfaces_, write command_interfaces_, lock-free trylock for diag publishers | allocations, blocking, exceptions across the RT boundary |
| write() | stage frames into the bus library's outgoing queue | the actual CAN/EtherCAT syscall (that's the I/O thread's job) |
The I/O thread in each hardware plugin (bar_socketcan::SocketCanBus,
ethercat_driver_ros2's EtherLAB master thread) is separate from the
controller-manager thread. RT-safety is preserved by making read() /
write() allocation-free buffer swaps.
Where the data actually lives
Zooming in on one tick — the path a CAN frame takes from the kernel into a
controller's update() and back:
The dashed red line is the RT boundary. Anything that crosses it goes
through a lock-free single-producer/single-consumer (SPSC) ring, so the
RT thread can read or stage frames without ever touching the kernel
socket directly. The I/O thread does the blocking epoll_wait and
decodes/encodes frames at its own pace.
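The hand-off across the RT boundary can be pictured with a small single-producer/single-consumer ring. The sketch below is illustrative Python only: the class name SpscRing, its capacity, and its method names are hypothetical and do not mirror the bar_socketcan implementation, which is C++ and would rely on atomics rather than the interpreter for memory ordering.

```python
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class SpscRing(Generic[T]):
    """Fixed-capacity single-producer/single-consumer ring (hypothetical sketch).

    One slot is kept empty so that head == tail always means "empty".
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be >= 1")
        # Storage is allocated once, up front; push/pop never grow it.
        self._slots: List[Optional[T]] = [None] * (capacity + 1)
        self._head = 0  # next slot to read; written only by the consumer
        self._tail = 0  # next slot to write; written only by the producer

    def try_push(self, item: T) -> bool:
        """Producer side: never blocks, returns False when the ring is full."""
        nxt = (self._tail + 1) % len(self._slots)
        if nxt == self._head:
            return False
        self._slots[self._tail] = item
        self._tail = nxt
        return True

    def try_pop(self) -> Optional[T]:
        """Consumer side: never blocks, returns None when the ring is empty."""
        if self._head == self._tail:
            return None
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        return item


# The I/O thread would push decoded frames; read() would drain them.
ring = SpscRing(capacity=2)
pushed = [ring.try_push(frame) for frame in (b"\x01", b"\x02", b"\x03")]
popped = [ring.try_pop() for _ in range(3)]
```

Both operations return immediately instead of waiting, which is the property that matters on the RT side: a full or empty ring is reported to the caller rather than blocking the control tick.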
Calibration (direction, homing_offset) is applied inside the
plugin's read() and write(), so every controller above sees
joint-frame values, never raw encoder readings. See Calibration math
for the formula.
The torque computation
$$\tau = K_p\,(q_{\text{cmd}} - q) + K_d\,(\dot{q}_{\text{cmd}} - \dot{q}) + \tau_{\text{ff}}$$
runs on the Robstride motor firmware (real hardware) or on MuJoCo's
qfrc_applied step (sim). The controller just writes five numbers per joint
per tick. This is the same factoring used by MIT Cheetah / Mini Cheetah and
by Berkeley's earlier Humanoid-Control deployment.
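As a worked instance of the torque law, the sketch below evaluates it for one joint. The MitJointCommand type, its field names, and the units in the comments are assumptions made for illustration; on the real system the computation runs in motor firmware or in the simulator, not in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MitJointCommand:
    """The five per-joint numbers in the torque law (field names are hypothetical)."""

    q_cmd: float   # target position (assumed rad)
    dq_cmd: float  # target velocity (assumed rad/s)
    kp: float      # stiffness K_p
    kd: float      # damping K_d
    tau_ff: float  # feed-forward torque


def mit_torque(cmd: MitJointCommand, q: float, dq: float) -> float:
    """tau = K_p (q_cmd - q) + K_d (dq_cmd - dq) + tau_ff."""
    return cmd.kp * (cmd.q_cmd - q) + cmd.kd * (cmd.dq_cmd - dq) + cmd.tau_ff


cmd = MitJointCommand(q_cmd=0.5, dq_cmd=0.0, kp=20.0, kd=1.0, tau_ff=0.25)
tau = mit_torque(cmd, q=0.25, dq=0.5)
# 20 * (0.5 - 0.25) + 1 * (0.0 - 0.5) + 0.25 = 4.75

# An all-zero command produces zero torque whatever the joint state is.
zero = MitJointCommand(0.0, 0.0, 0.0, 0.0, 0.0)
tau_zero = mit_torque(zero, q=1.0, dq=2.0)
```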
Five-mode finite state machine
The whole control surface boils down to one active controller at a time,
selected by mode_manager. joint_state_broadcaster runs alongside as the
always-on state stream.
Behavior per state:
| State | Plugin | What it writes |
|---|---|---|
| ZERO_TORQUE | bar/ZeroTorqueController | 0 to all 5 cmd interfaces. Startup default, fault fallback. |
| DAMPING | bar/DampingController | K=0, D=damping value, q_cmd=q_captured — soft under gravity, resists velocity. |
| STANDBY | bar/StandbyController | Linear pose interpolation through a YAML sequence; ramps K_p / K_d on first segment. Publishes StandbyState with is_finished. |
| LOCOMOTION | bar/RLPolicyController | In-process ONNX inference; YAML-driven obs packing + action mapping. |
| REMOTE | bar/RemotePolicyController | Thin executor for an out-of-process Python policy; subscribes ~/command (MITCommand) with stale-command gating. |
Transition mechanics
Every transition is one switch_controller service call to the
controller_manager (STRICT strictness, async). The mode_manager node is a
plain rclcpp::Node that subscribes to:
- /joy (gamepad intents; on by default; bringup hard-fails if /dev/input/js* is missing unless you opt out with enable_gamepad:=false)
- /standby_controller/state (the is_finished gate for the two START_* intents)
- /safety_status (the auto-DAMP trigger)
…and exposes five std_srvs/Trigger services so transitions can also be
driven from the command line:
- /bar/mode/damp
- /bar/mode/load
- /bar/mode/start_remote
- /bar/mode/start_locomotion
- /bar/mode/quit
/control_mode is published at 50 Hz. The manager polls
list_controllers periodically (every 25 ticks = 500 ms) so controllers
loaded after the first poll become visible to dispatch_intent without
the operator having to re-trigger.
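The intent handling described in this section can be sketched as a small lookup plus a gate. This is a guess at the shape, not the mode_manager source: the intent-to-mode mapping is inferred from the service names, the rule that the two start_* intents are accepted only from STANDBY is an assumption, and resolve_intent is a hypothetical name.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    ZERO_TORQUE = "zero_torque"
    DAMPING = "damping"
    STANDBY = "standby"
    LOCOMOTION = "locomotion"
    REMOTE = "remote"


# ASSUMPTION: intent -> target mode, inferred from the /bar/mode/* service names.
INTENT_TARGET = {
    "damp": Mode.DAMPING,
    "load": Mode.STANDBY,
    "start_remote": Mode.REMOTE,
    "start_locomotion": Mode.LOCOMOTION,
    "quit": Mode.ZERO_TORQUE,
}


def resolve_intent(current: Mode, intent: str, standby_finished: bool) -> Optional[Mode]:
    """Return the mode to request, or None if the intent is rejected.

    ASSUMPTION: start_* intents are honoured only from STANDBY and only once
    the standby controller has reported is_finished; other intents are ungated.
    """
    target = INTENT_TARGET.get(intent)
    if target is None:
        return None  # unknown intent
    if intent.startswith("start_"):
        if current is not Mode.STANDBY or not standby_finished:
            return None
    return target


accepted = resolve_intent(Mode.STANDBY, "start_remote", standby_finished=True)
gated = resolve_intent(Mode.STANDBY, "start_remote", standby_finished=False)
damped = resolve_intent(Mode.LOCOMOTION, "damp", standby_finished=False)
```

In the real node, a non-None result would translate into one switch_controller request that deactivates the current controller and activates the target.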
Two parallel policy tiers
The "active policy" mode comes in two flavors that share the exact same observation/action contract:
Key property: the Python ObservationManager mirrors the C++ one
structurally — same term names (JointPositionTerm, JointVelocityTerm,
LastActionTerm, ImuFieldTerm, ReferenceProviderTerm), same scaling
convention out = (q - q_default) * scale, same flat-ndarray
observation contract. A policy debugged in Python can be promoted to C++
without observation-indexing drift.
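The scaling convention and the flat observation layout can be illustrated in a few lines. This is a dependency-free sketch: plain lists stand in for the ndarray, and the function names joint_position_term and pack_observation are hypothetical rather than taken from bar_policy.

```python
from typing import List, Sequence


def joint_position_term(
    q: Sequence[float], q_default: Sequence[float], scale: float
) -> List[float]:
    """out = (q - q_default) * scale, element-wise in the fixed joint order."""
    if len(q) != len(q_default):
        raise ValueError("q and q_default must have the same length")
    return [(qi - di) * scale for qi, di in zip(q, q_default)]


def pack_observation(terms: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate term outputs in declaration order into one flat vector."""
    flat: List[float] = []
    for term in terms:
        flat.extend(term)
    return flat


q = [0.5, -0.25]
q_default = [0.25, 0.0]
last_action = [0.0, 0.0]
obs = pack_observation([joint_position_term(q, q_default, scale=2.0), last_action])
# obs is [0.5, -0.5, 0.0, 0.0]: position term first, then the last action.
```

Because the policy indexes into this flat vector by position, both the term order and the joint order inside each term have to match between the Python and C++ sides.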
The Python ObservationManager also exposes two lifecycle hooks that
the C++ side will eventually mirror:
- reset(): clear per-term state (e.g. LastActionTerm zero-init) and rewind any attached reference provider. Called on controller activation.
- record_action(action): refresh LastActionTerm with the action the runner just emitted, then step the reference provider so the dataset frame advances exactly once per policy tick.
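A minimal sketch of how the two hooks could fit together follows. Only the hook names reset() and record_action() and the term name LastActionTerm come from the text above; FrameCounterProvider, ObservationManagerSketch, and the method names rewind(), step(), record(), and compute() are hypothetical stand-ins.

```python
from typing import List, Sequence


class LastActionTerm:
    """Holds the previous action; zero-initialised on reset."""

    def __init__(self, num_actions: int) -> None:
        self._num_actions = num_actions
        self._last = [0.0] * num_actions

    def reset(self) -> None:
        self._last = [0.0] * self._num_actions

    def record(self, action: Sequence[float]) -> None:
        if len(action) != self._num_actions:
            raise ValueError("action length mismatch")
        self._last = list(action)

    def compute(self) -> List[float]:
        return list(self._last)


class FrameCounterProvider:
    """Stand-in reference provider that only tracks a dataset frame index."""

    def __init__(self) -> None:
        self.frame = 0

    def rewind(self) -> None:
        self.frame = 0

    def step(self) -> None:
        self.frame += 1


class ObservationManagerSketch:
    def __init__(self, num_actions: int) -> None:
        self.last_action = LastActionTerm(num_actions)
        self.provider = FrameCounterProvider()

    def reset(self) -> None:
        """Clear per-term state and rewind the reference provider."""
        self.last_action.reset()
        self.provider.rewind()

    def record_action(self, action: Sequence[float]) -> None:
        """Refresh the last-action term, then advance the provider one frame."""
        self.last_action.record(action)
        self.provider.step()


mgr = ObservationManagerSketch(num_actions=2)
mgr.record_action([0.1, -0.2])
mgr.record_action([0.3, 0.4])
after_two = (mgr.last_action.compute(), mgr.provider.frame)
mgr.reset()
after_reset = (mgr.last_action.compute(), mgr.provider.frame)
```

Stepping the provider inside record_action, rather than inside the observation computation, is what keeps the frame index tied to policy ticks even if observations are computed more than once per tick.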
Today's out-of-process tier is the real ONNX path: bar_policy's
remote_policy_runner loads an ONNX checkpoint, parses self-describing
metadata baked into the file (joint names, gains, default pose, dataset
pointer), and publishes MITCommand to RemotePolicyController. The
piano task ships its own subclass (pianist_policy.PianoPolicyRunner)
that adds a PianoSongTermProvider for the song-goal lookahead. The
in-process RLPolicyController is still wired to a ConstantHoldPolicy
stub; the same metadata schema will land in C++ when locomotion graduates
to in-process inference. See Policy runner.
MITState itself is a code-level schema (a bar::MITState POD in
C++, a matching @dataclass in Python). It is not a published topic —
observations are assembled in-process from /lite/joint_states (the
always-on broadcaster) and /imu/data (the IMU driver).
Frozen schemas
A handful of artifacts are frozen once a trained policy depends on them:
| Artifact | Frozen because |
|---|---|
| bar_msgs/MITCommand | trained policies emit this field-by-field over DDS |
| Joint order in bar_*_controllers.yaml | trained policies index into this order |
| MITState struct + Python dataclass | both sides agree on joint_position/joint_velocity/IMU layout |
| Observation term scale + default vectors | shifts mean retraining |
Once a policy ships to a piano-playing or locomotion run, changing any of these forces retraining. Keep this in mind when refactoring.
Safety and fault handling
Safety is layered — no single ROS node is treated as the whole safety system:
Concrete examples:
- A Robstride bus-off → bar_robstride publishes SafetyStatus{level=FAULT, source="bar_robstride/can0", flags=BUS_OFF} → mode_manager requests a STRICT switch to DAMPING. If DAMPING fails (e.g. command interfaces unavailable), mode_manager falls back to ZERO_TORQUE.
- A RemotePolicyController whose Python publisher stalls for >100 ms (stale_command_timeout_ms default) writes passive commands (zero stiffness/damping) by default, or zero-order-holds the last command if stale_command_policy: hold is set. Staleness is measured against arrival time at the subscription callback, not against MITCommand.header.stamp, so publisher clock skew is irrelevant.
- An RL policy returning NaN in its action vector → RLPolicyController detects via bar::rt::all_finite(...) and returns return_type::ERROR, triggering fallback_controllers in the CM YAML.
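Two of the checks above are small enough to sketch directly: arrival-time staleness gating with a passive or hold fallback, and a finiteness check on an action vector. The names StaleCommandGate, JointCommand, and PASSIVE are hypothetical, and the passive command is assumed to zero all five fields; the real controllers are C++ and may differ in detail.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


def all_finite(values: Sequence[float]) -> bool:
    """True only if no element is NaN or +/-inf."""
    return all(math.isfinite(v) for v in values)


@dataclass(frozen=True)
class JointCommand:
    q_cmd: float
    dq_cmd: float
    kp: float
    kd: float
    tau_ff: float


# ASSUMPTION: "passive" zeroes every field, not only stiffness and damping.
PASSIVE = JointCommand(0.0, 0.0, 0.0, 0.0, 0.0)


class StaleCommandGate:
    """Arrival-time staleness check with a 'passive' or 'hold' fallback."""

    def __init__(self, timeout_s: float = 0.100, policy: str = "passive") -> None:
        if policy not in ("passive", "hold"):
            raise ValueError("policy must be 'passive' or 'hold'")
        self._timeout_s = timeout_s
        self._policy = policy
        self._last: Optional[JointCommand] = None
        self._arrival_s: Optional[float] = None

    def on_command(self, cmd: JointCommand, now_s: float) -> None:
        """Subscription callback: stamp with the local clock, not the header."""
        self._last = cmd
        self._arrival_s = now_s

    def select(self, now_s: float) -> JointCommand:
        """Called once per control tick; returns the command to write."""
        if self._last is None or self._arrival_s is None:
            return PASSIVE  # nothing received yet
        fresh = (now_s - self._arrival_s) <= self._timeout_s
        if fresh or self._policy == "hold":
            return self._last
        return PASSIVE


cmd = JointCommand(0.5, 0.0, 20.0, 1.0, 0.0)
gate = StaleCommandGate()
gate.on_command(cmd, now_s=1.00)
fresh_out = gate.select(now_s=1.05)  # 50 ms old: passed through
stale_out = gate.select(now_s=1.25)  # 250 ms old: replaced by PASSIVE
```

Passing now_s in explicitly keeps the sketch deterministic; a real caller would read a monotonic clock on the receiving machine, which is why publisher clock skew does not enter the comparison.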
Deployment topology
The shipping configuration is a two-machine tethered split. The
same colcon workspace is installed (and built from the same pixi lock
file) on both machines; each launch boots only the subset of nodes
that belongs on its side. Single-machine sim/dev paths
(bar_bringup_lite/mujoco.launch.py, bar_description_lite/view_lite.launch.py,
bar_bringup_lite/calibrate.launch.py) are unaffected — they
collapse both sides into one process tree.
Launches come from two sibling repos: bar_ros2 ships every
Lite/Prime control-plane and tracking-policy launch; pianist_ros2
ships the piano-task-specific launches.
| Side | Machine | Launch | What lives here |
|---|---|---|---|
| Robot | Onboard computer (RT kernel, wired tether) | bar_bringup_lite/launch/real.launch.py (bar_ros2) | ros2_control_node, bar_robstride / bar_sito hardware plugins, joint_state_broadcaster, the five FSM controllers (zero_torque / damping / standby / rl_policy / remote_policy), mode_manager, joy_node, robot_state_publisher, IMU driver |
| Host | Operator workstation | bar_bringup_lite/launch/viz.launch.py (bar_ros2) | viser_viz or rerun_viz (selected by viewer:=) |
| Host | Operator workstation | bar_policy/launch/lite_policy.launch.py (bar_ros2) | Tracking-family ONNX runner (onnxruntime, W&B / HF Hub deps); task:=tracking\|reach\|residual |
| Host | Operator workstation | pianist_policy/launch/piano_policy.launch.py (pianist_ros2) | Piano-task ONNX runner (piano_policy_runner, a remote_policy_runner subclass) |
| Host | Operator workstation | pianist_policy/launch/midi_keyboard_driver.launch.py (pianist_ros2) | Real USB-MIDI keyboard driver (publishes /piano/key_state to the local-host piano runner — loopback, does not cross the tether) |
What crosses the tether
Only DDS topics, never controller-manager service calls.
| Topic | Direction | QoS | Rate / size | Notes |
|---|---|---|---|---|
| /robot_description | robot → host | RELIABLE + TRANSIENT_LOCAL | ~kB, latched | URDF tree (no meshes — host has its own install share). |
| /lite/joint_states | robot → host | RELIABLE | 50 Hz, ~14 floats × 3 | Viewer + policy runner input. |
| /imu/data | robot → host | RELIABLE | sensor-rate | Policy runner observation input. |
| /control_mode | robot → host | RELIABLE | 50 Hz | FSM telemetry for operator dashboards. |
| /remote_policy_controller/command | host → robot | RELIABLE depth 4 | 50 Hz, ~280 B | The single RT-adjacent host→robot stream. RemotePolicyController uses arrival-time staleness, not header.stamp. |
| /tf | robot → host | RELIABLE | 50 Hz | RSP fanout — viewers consume. |
/joy (gamepad) and /safety_status are intentionally onboard-only:
both go straight into mode_manager (loopback) so the safety path
never depends on the tether. /piano/key_state is host-local
(MIDI driver → piano runner), and /piano/target_keys is the
runner's own re-publish, also host-local — neither crosses the
tether.
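For a rough sense of scale, the two fixed-rate streams with stated sizes can be turned into a payload-only bandwidth estimate. The 8-byte float width for /lite/joint_states is an assumption, joint-name strings are not counted, and DDS, UDP, and Ethernet overhead are ignored, so treat the result as a lower bound rather than a measurement.

```python
def stream_rate_bytes_per_s(rate_hz: float, payload_bytes: float) -> float:
    """Payload-only throughput; ignores DDS/RTPS, UDP, and Ethernet framing."""
    return rate_hz * payload_bytes


# /remote_policy_controller/command: 50 Hz, ~280 B per message (from the table).
command_bytes_per_s = stream_rate_bytes_per_s(50.0, 280.0)

# /lite/joint_states: 50 Hz, ~14 floats x 3 arrays.
# ASSUMPTION: 8-byte doubles; name strings and headers are not counted.
joint_state_bytes_per_s = stream_rate_bytes_per_s(50.0, 14 * 3 * 8)

total_kbit_per_s = (command_bytes_per_s + joint_state_bytes_per_s) * 8 / 1000.0
```

Under these assumptions the two streams together come to a few hundred kbit/s, small next to what a wired Ethernet tether provides.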
Why this split (the three judgment calls)
- Gamepad on the robot. DAMP and QUIT are the operator's safety affordance. Routing /joy over DDS across the tether means a flaky link can suppress an e-stop. Every legged-RL deployment this project mirrors (legged_control2, instinct_onboard, the earlier Humanoid-Control stack) keeps the gamepad onboard. Use a USB extension or a wireless dongle plugged into the onboard computer, not into the host laptop.
- mode_manager on the robot. It calls /controller_manager/switch_controller (a service local to CM) and consumes /safety_status from the per-bus hardware plugins. Placing it onboard makes switch-controller, safety auto-DAMP, and /joy consumption all loopback: zero cross-machine latency in the safety path.
- robot_state_publisher on the robot. RSP is a pure transform fanout. Putting it onboard means /robot_description (latched) and /tf originate at one address; host-side viewers subscribe over the wire. Bandwidth is small (kinematic tree, not point clouds).
Network assumptions
- Wired Ethernet tether only. WiFi as the operator-to-robot link is explicitly not supported; if the requirement appears, gamepad-on-robot is doubly justified and DDS QoS needs separate tuning.
- Both machines run the same ROS_DOMAIN_ID and (recommended) the same RMW_IMPLEMENTATION. Cyclone DDS is the recommended pick; pin the network interface in the XML config so discovery doesn't leak onto management NICs.
AGENTS.md §"Deployment topology" carries the canonical version of this split (with an ASCII process diagram); this page reflects the deployed surface.
Next
- mode_manager source: the FSM is ~150 lines of C++; readable in one sitting.
- Lite 101: see all of this run end-to-end against mock hardware and MuJoCo.
- Controllers reference: per-controller parameter tables.