Summary
GET /{entity}/status reports ready or notReady from signals that are too coarse, so the
status does not reflect reality.
- App: ready comes from App::is_online, which is set to true whenever the node name appears in
the ROS 2 graph (get_node_names_and_namespaces). Graph presence is the only condition, so a
managed lifecycle node that is in the graph but inactive or unconfigured still reports ready,
even though it does no work.
- Component: a non-host component is ready only when at least one hosted app is online, so a
component with zero apps, or with all apps temporarily down, reports notReady even though the
substrate is reachable. The SOVD spec defines notReady as "stopped, restarting, or
unreachable", which a reachable substrate is not.
This issue makes the read accurate. It adds no control: the PUT transitions stay 501 and
actuation is tracked separately.
Proposed solution
App status:
- For a managed lifecycle node (one that exposes get_state and change_state), derive the
status from the lifecycle state read with GetState: active is ready, any other state or an
unreachable service is notReady. This also covers liveness, because a dead node's service is
gone and reads notReady.
- For a plain node with no lifecycle services, keep is_online (graph presence). That is the best
signal at the ROS level for a plain node; application-level health is out of scope here.
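The two app rules above can be sketched as a pure decision function. This is a minimal illustration, not the gateway's code: AppProbe, app_status, and the is_managed flag are hypothetical names, and the sketch assumes the gateway already knows from discovery whether a node is managed. The enum values mirror the primary state ids in lifecycle_msgs/msg/State.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Primary lifecycle state ids, mirroring lifecycle_msgs/msg/State.
enum class LifecycleState : std::uint8_t {
  Unknown = 0,
  Unconfigured = 1,
  Inactive = 2,
  Active = 3,
  Finalized = 4,
};

// Hypothetical snapshot of what the gateway knows about one app.
struct AppProbe {
  bool is_managed;                      // node exposes get_state and change_state
  std::optional<LifecycleState> state;  // GetState result; empty if the service is unreachable
  bool is_online;                       // node name is present in the ROS 2 graph
};

// Returns "ready" or "notReady" following the proposed rules.
inline std::string app_status(const AppProbe & probe)
{
  if (probe.is_managed) {
    // Managed node: only the active state counts as ready. An unreachable
    // GetState service (empty optional) reads notReady, which covers liveness.
    const bool active =
      probe.state.has_value() && *probe.state == LifecycleState::Active;
    return active ? "ready" : "notReady";
  }
  // Plain node: graph presence is the best signal available at the ROS level.
  return probe.is_online ? "ready" : "notReady";
}
```

Note that for a managed node the is_online flag is ignored on purpose: an inactive node that is still in the graph reads notReady.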
Component status:
- A local component is ready while the gateway is serving the request, independent of how many
hosted apps are online, including zero. Remove the "at least one hosted app online" rule. A down
app is that app's own notReady, not the component's.
- Remote components keep current behavior: the status request is forwarded to the peer gateway.
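A rough sketch of the component rules, under stated assumptions: ComponentRef and component_status are hypothetical names, and forward_to_peer stands in for whatever the existing forwarding path to the peer gateway looks like. The app count is carried only to show that the proposed rule does not consult it.

```cpp
#include <functional>
#include <string>

// Hypothetical descriptor for a component known to the gateway.
struct ComponentRef {
  bool is_local;         // hosted by this gateway rather than by a peer
  int online_app_count;  // deliberately not consulted by the proposed rule
};

// Returns "ready" or "notReady" for a component.
// forward_to_peer is a placeholder for the existing remote forwarding path.
inline std::string component_status(
  const ComponentRef & component,
  const std::function<std::string()> & forward_to_peer)
{
  if (component.is_local) {
    // The gateway is answering this request, so the substrate is reachable,
    // regardless of how many hosted apps are online (including zero).
    return "ready";
  }
  // Remote components keep current behavior: the peer's answer is returned as-is.
  return forward_to_peer();
}
```

With this shape, a local component with zero online apps reads ready, and each down app surfaces as notReady only on its own status endpoint.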
Improves the fidelity of GET /{entity}/status (REQ_INTEROP_076). Read-only. Actuation (lifecycle
transitions, process, container, and host restart) is tracked separately.
Unit and integration tests, docs.