Software framework

This page describes how the stack runs end-to-end: the ros2_control cycle (50 Hz on real hardware, 200 Hz in MuJoCo), the five-mode finite state machine that arbitrates which controller is active, the in-process vs. out-of-process policy tiers, the safety / fallback model, and the two-machine deployment topology that decides which of those processes lives on the robot vs. on the operator workstation.

Module dependency overview

Before diving into the runtime, here's the static picture — which packages build against which:

Module dependency graph

Notice that bar_controllers does not find_package(bar_robstride). The plugin is loaded by controller_manager at launch via pluginlib — a runtime dep that doesn't appear in the static graph but is just as binding. The same applies to every <plugin> entry in a controller-manager YAML.

The ros2_control cycle

ros2_control is the integration spine. controller_manager owns the real-time loop. Every tick — 50 Hz on Lite real hardware, 200 Hz in MuJoCo — it performs three steps:

ros2_control RT cycle

Constraints inside each phase:

Phase	What's allowed	What's forbidden
`read()`	swap lock-free buffer pointers, copy small POD	syscalls, allocations, DDS waits
`update()`	read state_interfaces_, write command_interfaces_, lock-free trylock for diag publishers	allocations, blocking, exceptions across the RT boundary
`write()`	stage frames into the bus library's outgoing queue	the actual CAN/EtherCAT syscall (that's the I/O thread's job)

The I/O thread in each hardware plugin (bar_socketcan::SocketCanBus, ethercat_driver_ros2's EtherLAB master thread) is separate from the controller-manager thread. RT-safety is preserved by making read() / write() allocation-free buffer swaps.

Where the data actually lives

Zooming in on one tick — the path a CAN frame takes from the kernel into a controller's update() and back:

RT data pipeline

The dashed red line is the RT boundary. Anything that crosses it goes through a lock-free SPSC ring — the RT thread can read or stage frames without ever touching the kernel socket directly. The I/O thread does the blocking epoll_wait and decodes/encodes frames on its own pace.

Calibration (direction, homing_offset) is applied inside the plugin's read() and write(), so every controller above sees joint frame, never the raw encoder. See Calibration math for the formula.

Why MIT-mode lives in the hardware plugin, not the controller

The torque computation tau = K_p (q_cmd - q) + K_d (dot q_cmd - dot q) + tau_ff runs on the Robstride motor firmware (real hardware) or on MuJoCo's qfrc_applied step (sim). The controller just writes five numbers per joint per tick. This is the same factoring used by MIT Cheetah / Mini Cheetah and by Berkeley's earlier Humanoid-Control deployment.

Five-mode finite state machine

The whole control surface boils down to one active controller at a time, selected by mode_manager. joint_state_broadcaster runs alongside as the always-on state stream.

Five-mode FSM

Behavior per state:

State	Plugin	What it writes
ZERO_TORQUE	`bar/ZeroTorqueController`	0 to all 5 cmd interfaces. Startup default, fault fallback.
DAMPING	`bar/DampingController`	`K=0`, `D=damping value`, `q_cmd=q_captured` — soft under gravity, resists velocity.
STANDBY	`bar/StandbyController`	Linear pose interpolation through a YAML sequence; ramps `K_p / K_d` on first segment. Publishes `StandbyState` with `is_finished`.
LOCOMOTION	`bar/RLPolicyController`	In-process ONNX inference; YAML-driven obs packing + action mapping.
REMOTE	`bar/RemotePolicyController`	Thin executor for an out-of-process Python policy; subscribes `~/command` (`MITCommand`) with stale-command gating.

Transition mechanics

Every transition is one switch_controller service call to the controller_manager (STRICT strictness, async). The mode_manager node is a plain rclcpp::Node that subscribes:

/joy (gamepad intents; on by default — bringup hard-fails if /dev/input/js* is missing unless you opt out with enable_gamepad:=false)
/standby_controller/state (the is_finished gate for the two START_* intents)
/safety_status (the auto-DAMP trigger)

…and exposes five std_srvs/Trigger services so transitions can also be driven from the command line:

/bar/mode/damp, /bar/mode/load, /bar/mode/start_remote, /bar/mode/start_locomotion, /bar/mode/quit

/control_mode is published at 50 Hz. The manager polls list_controllers periodically (every 25 ticks = 500 ms) so controllers loaded after the first poll become visible to dispatch_intent without the operator having to re-trigger.

Two parallel policy tiers

The "active policy" mode comes in two flavors that share the exact same observation/action contract:

Two policy tiers

Key property: the Python ObservationManager mirrors the C++ one structurally — same term names (JointPositionTerm, JointVelocityTerm, LastActionTerm, ImuFieldTerm, ReferenceProviderTerm), same scaling convention out = (q - q_default) * scale, same flat-ndarray observation contract. A policy debugged in Python can be promoted to C++ without observation-indexing drift.

The Python ObservationManager also exposes two lifecycle hooks that the C++ side will eventually mirror:

reset() — clear per-term state (e.g. LastActionTerm zero-init) and rewind any attached reference provider. Called on controller activation.
record_action(action) — refresh LastActionTerm with the action the runner just emitted, then step the reference provider so the dataset frame advances exactly once per policy tick.

Today's out-of-process tier is the real ONNX path: bar_policy's remote_policy_runner loads an ONNX checkpoint, parses self-describing metadata baked into the file (joint names, gains, default pose, dataset pointer), and publishes MITCommand to RemotePolicyController. The piano task ships its own subclass (pianist_policy.PianoPolicyRunner) that adds a PianoSongTermProvider for the song-goal lookahead. The in-process RLPolicyController is still wired to a ConstantHoldPolicy stub — same metadata schema will land in C++ when locomotion graduates to in-process inference. See Policy runner.

MITState itself is a code-level schema (a bar::MITState POD in C++, a matching @dataclass in Python). It is not a published topic — observations are assembled in-process from /lite/joint_states (the always-on broadcaster) and /imu/data (the IMU driver).

Frozen schemas

A handful of artifacts are frozen once a trained policy depends on them:

Artifact	Frozen because
`bar_msgs/MITCommand`	trained policies emit this field-by-field over DDS
Joint order in `bar_*_controllers.yaml`	trained policies index into this order
`MITState` struct + Python dataclass	both sides agree on `joint_position`/`joint_velocity`/IMU layout
Observation term scale + default vectors	shifts mean retraining

Once a policy ships to a piano-playing or locomotion run, changing any of these forces retraining. Keep this in mind when refactoring.

Safety and fault handling

Safety is layered — no single ROS node is treated as the whole safety system:

Concrete examples:

A Robstride bus-off → bar_robstride publishes SafetyStatus{level=FAULT, source="bar_robstride/can0", flags=BUS_OFF} → mode_manager requests a STRICT switch to DAMPING. If DAMPING fails (e.g. command interfaces unavailable), mode_manager falls back to ZERO_TORQUE.
A RemotePolicyController whose Python publisher stalls for >100 ms (stale_command_timeout_ms default) writes passive commands (zero stiffness/damping) by default, or zero-order-holds the last command if stale_command_policy: hold is set. Staleness is measured against arrival time at the subscription callback, not against MITCommand.header.stamp, so publisher clock skew is irrelevant.
An RL policy returning NaN in its action vector → RLPolicyController detects via bar::rt::all_finite(...) and returns return_type::ERROR, triggering fallback_controllers in the CM YAML.

Deployment topology

The shipping configuration is a two-machine tethered split. The same colcon workspace is installed (and built from the same pixi lock file) on both machines; each launch boots only the subset of nodes that belongs on its side. Single-machine sim/dev paths (bar_bringup_lite/mujoco.launch.py, bar_description_lite/view_lite.launch.py, bar_bringup_lite/calibrate.launch.py) are unaffected — they collapse both sides into one process tree.

Launches come from two sibling repos: bar_ros2 ships every Lite/Prime control-plane and tracking-policy launch; pianist_ros2 ships the piano-task-specific launches.

Side	Machine	Launch	What lives here
Robot	Onboard computer (RT kernel, wired tether)	`bar_bringup_lite/launch/real.launch.py` (bar_ros2)	`ros2_control_node`, `bar_robstride` / `bar_sito` hardware plugins, `joint_state_broadcaster`, the five FSM controllers (`zero_torque` / `damping` / `standby` / `rl_policy` / `remote_policy`), `mode_manager`, `joy_node`, `robot_state_publisher`, IMU driver
Host	Operator workstation	`bar_bringup_lite/launch/viz.launch.py` (bar_ros2)	`viser_viz` or `rerun_viz` (selected by `viewer:=`)
Host	Operator workstation	`bar_policy/launch/lite_policy.launch.py` (bar_ros2)	Tracking-family ONNX runner (`onnxruntime`, W&B / HF Hub deps); `task:=tracking\|reach\|residual`
Host	Operator workstation	`pianist_policy/launch/piano_policy.launch.py` (pianist_ros2)	Piano-task ONNX runner (`piano_policy_runner`, a `remote_policy_runner` subclass)
Host	Operator workstation	`pianist_policy/launch/midi_keyboard_driver.launch.py` (pianist_ros2)	Real USB-MIDI keyboard driver (publishes `/piano/key_state` to the local-host piano runner — loopback, does not cross the tether)

What crosses the tether

Only DDS topics, never controller-manager service calls.

Topic	Direction	QoS	Rate / size	Notes
`/robot_description`	robot → host	RELIABLE + TRANSIENT_LOCAL	~kB, latched	URDF tree (no meshes — host has its own install share).
`/lite/joint_states`	robot → host	RELIABLE	50 Hz, ~14 floats × 3	Viewer + policy runner input.
`/imu/data`	robot → host	RELIABLE	sensor-rate	Policy runner observation input.
`/control_mode`	robot → host	RELIABLE	50 Hz	FSM telemetry for operator dashboards.
`/remote_policy_controller/command`	host → robot	RELIABLE depth 4	50 Hz, ~280 B	The single RT-adjacent host→robot stream. `RemotePolicyController` uses arrival-time staleness, not header.stamp.
`/tf`	robot → host	RELIABLE	50 Hz	RSP fanout — viewers consume.

/joy (gamepad) and /safety_status are intentionally onboard-only: both go straight into mode_manager (loopback) so the safety path never depends on the tether. /piano/key_state is host-local (MIDI driver → piano runner), and /piano/target_keys is the runner's own re-publish, also host-local — neither crosses the tether.

Why this split (the three judgment calls)

Gamepad on the robot. DAMP and QUIT are the operator's safety affordance. Routing /joy over DDS across the tether means a flaky link can suppress an e-stop. Every legged-RL deployment this project mirrors (legged_control2, instinct_onboard, the earlier Humanoid-Control stack) keeps the gamepad onboard. Use a USB extension or a wireless dongle plugged into the onboard computer, not into the host laptop.
mode_manager on the robot. It calls /controller_manager/switch_controller (a service local to CM) and consumes /safety_status from the per-bus hardware plugins. Placing it onboard makes switch-controller, safety auto-DAMP, and /joy consumption all loopback — zero cross-machine latency in the safety path.
robot_state_publisher on the robot. RSP is a pure transform fanout. Putting it onboard means /robot_description (latched) and /tf originate at one address; host-side viewers subscribe over the wire. Bandwidth is small (kinematic tree, not point clouds).

Network assumptions

Wired Ethernet tether only. WiFi as the operator-to-robot link is explicitly not supported; if the requirement appears, gamepad-on-robot is doubly justified and DDS QoS needs separate tuning.
Both machines run the same ROS_DOMAIN_ID and (recommended) the same RMW_IMPLEMENTATION. Cyclone DDS is the recommended pick; pin the network interface in the XML config so discovery doesn't leak onto management NICs.

AGENTS.md §"Deployment topology" carries the canonical version of this split (with an ASCII process diagram); this page reflects the deployed surface.

mode_manager source — the FSM is ~150 lines of C++; readable in one sitting.
Lite 101 — see all of this run end-to-end against mock hardware and MuJoCo.
Controllers reference — per-controller parameter tables.

Software framework

Module dependency overview​

The ros2_control cycle​

Where the data actually lives​

Five-mode finite state machine​

Transition mechanics​

Two parallel policy tiers​

Frozen schemas​

Safety and fault handling​

Deployment topology​

What crosses the tether​

Why this split (the three judgment calls)​

Network assumptions​

Next​