feat(plugins): notify_entities_changed + manifest fragments#377
feat(plugins): notify_entities_changed + manifest fragments#377bburda wants to merge 3 commits intofix/aggregation-proxy-logs-and-hostsfrom
Conversation
Plugins that mutate the entity surface at runtime (deploying a new node
under a manifest fragment, removing a previously-deployed app, renaming
a component) had no way to tell the gateway "my entities changed, refresh
your tree". Callers were left racing the runtime-discovery polling tick
or restarting the gateway.
This adds a dedicated plugin hook:
virtual void PluginContext::notify_entities_changed(
const EntityChangeScope& scope);
The default implementation is a no-op so plugins built against plugin API
v6 continue to load unchanged. Bump PLUGIN_API_VERSION from 6 to 7.
EntityChangeScope is a small hint struct with optional area_id and
component_id fields plus a full_refresh() convenience. The current gateway
implementation always does a full refresh_cache() pass on notification -
the scope is accepted and logged for forward compatibility with a future
scoped-refresh optimization.
Part 1 of a larger piece tracked in issue #376; the companion manifest
fragments directory support lands in a follow-up change.
refs #376
Adds a discovery.manifest.fragments_dir gateway parameter. When set, the ManifestManager scans that directory on every load / reload and merges apps, components, and functions from each `*.yaml` / `*.yml` file on top of the base manifest before the validator runs. This lets plugins that deploy new entities at runtime drop a manifest chunk alongside their deployed files without editing the base manifest, and remove it on rollback. Merge rules (enforced in ManifestManager::apply_fragments): - fragments contribute only apps, components, functions - fragments may not declare areas, metadata, scripts, capabilities, or lock_overrides; each forbidden field is reported as a FRAGMENT_FORBIDDEN_FIELD validation error - fragment files are applied in alphabetical order, making duplicate-id errors deterministic - a missing fragments directory is treated as "no fragments", not an error (plugins can create the directory lazily on first install) ManifestParser gains parse_fragment_file, which injects the synthetic `manifest_version: "1.0"` header fragments are not required to declare. Tests cover: no fragments, missing dir, single merge, deterministic order with multiple files, non-yaml file skipped, forbidden `areas` field, duplicate ids across fragments, reload picks up added and removed files, set/get accessor round-trip. refs #376
Integration test (test/test_plugin_notify_integration.cpp) drives a real GatewayNode through the full plugin lifecycle: - HYBRID discovery with a tiny base manifest on disk - fragments_dir pointing at a tmp directory - real PluginContext from make_gateway_plugin_context Asserts that: - a fragment dropped after startup + notify makes the new app visible - a fragment removed after startup + notify makes the app disappear - notify with no fragments is a safe no-op and does not drop the base-manifest entities Along the way, wire handle_entity_change_notification to reload the manifest source before running refresh_cache - without that, fragments added since the last load stay invisible to the pipeline. design/plugin_entity_notifications.rst walks through the lifecycle and documents the fragment merge rules. CHANGELOG.rst gains an Unreleased section describing the plugin API v7 bump and the new discovery.manifest.fragments_dir parameter. refs #376
There was a problem hiding this comment.
Pull request overview
Adds a plugin-facing mechanism to trigger an entity-surface refresh and introduces “manifest fragments” that can be merged into the base manifest at load/reload time (issue #376).
Changes:
- Bumps Plugin API to v7 and adds
PluginContext::notify_entities_changed(EntityChangeScope)plusEntityChangeScope. - Adds
discovery.manifest.fragments_dirsupport and fragment parsing/merging during manifest load/reload. - Adds unit/integration tests and design + changelog documentation for the new behavior.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/ros2_medkit_gateway/test/test_plugin_notify_integration.cpp | New end-to-end test covering fragment add/remove + notify lifecycle. |
| src/ros2_medkit_gateway/test/test_manifest_fragments.cpp | New unit tests for fragment merge semantics and validation behavior. |
| src/ros2_medkit_gateway/test/test_entity_change_scope.cpp | New unit tests for EntityChangeScope and default no-op notify behavior. |
| src/ros2_medkit_gateway/src/plugins/plugin_context.cpp | Gateway plugin context forwards notify_entities_changed to GatewayNode. |
| src/ros2_medkit_gateway/src/gateway_node.cpp | Declares new parameter and implements handle_entity_change_notification(). |
| src/ros2_medkit_gateway/src/discovery/manifest/manifest_parser.cpp | Adds parse_fragment_file() with synthetic manifest_version injection. |
| src/ros2_medkit_gateway/src/discovery/manifest/manifest_manager.cpp | Applies fragments on load and adds fragment directory scanning/merge logic. |
| src/ros2_medkit_gateway/src/discovery/discovery_manager.cpp | Wires manifest_fragments_dir into ManifestManager during discovery init. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/plugins/plugin_types.hpp | Bumps PLUGIN_API_VERSION to 7 and documents version history. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/plugins/plugin_context.hpp | Adds notify_entities_changed() default no-op API. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/plugins/entity_change_scope.hpp | New scope-hint type for entity-change notifications. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/gateway_node.hpp | Declares handle_entity_change_notification() API. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/discovery/manifest/manifest_parser.hpp | Declares parse_fragment_file() API. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/discovery/manifest/manifest_manager.hpp | Documents fragments directory API and merge rules. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/discovery/discovery_manager.hpp | Extends DiscoveryConfig with manifest_fragments_dir. |
| src/ros2_medkit_gateway/design/plugin_entity_notifications.rst | New design doc describing lifecycle and fragment rules. |
| src/ros2_medkit_gateway/design/index.rst | Adds new design doc to index. |
| src/ros2_medkit_gateway/CMakeLists.txt | Registers new gtests and sets ROS domain for the integration-style test. |
| src/ros2_medkit_gateway/CHANGELOG.rst | Documents v7 plugin API + fragments_dir feature in Unreleased. |
| * * fragments may not declare top-level `areas`, `metadata`, `config`, | ||
| * `scripts`, `capabilities`, or `lock_overrides` - those must stay in | ||
| * the base manifest. A fragment that declares any of them is skipped | ||
| * with a warning. |
There was a problem hiding this comment.
The docstring says fragments declaring forbidden top-level fields (including config) are "skipped with a warning", but the implementation records a validation error and causes load_manifest() to fail (and does not explicitly handle config/discovery at all). Please align the documentation with the actual behavior (and use the YAML field name discovery rather than config if that’s what’s meant).
| * * fragments may not declare top-level `areas`, `metadata`, `config`, | |
| * `scripts`, `capabilities`, or `lock_overrides` - those must stay in | |
| * the base manifest. A fragment that declares any of them is skipped | |
| * with a warning. | |
| * * fragments may not declare top-level `areas`, `metadata`, `discovery`, | |
| * `scripts`, `capabilities`, or `lock_overrides` - those must stay in | |
| * the base manifest. A fragment that declares any of them is treated | |
| * as a validation error and causes manifest loading / reloading to | |
| * fail. |
| * Fragments declare a subset of a manifest (apps, components, functions) | ||
| * intended to be merged on top of a base manifest at load time. They do | ||
| * NOT carry a manifest_version - that field is owned by the base. | ||
| * | ||
| * @param file_path Path to fragment YAML | ||
| * @return Parsed manifest with only the fragment-allowed sections populated | ||
| * @throws std::runtime_error if file cannot be read or parsed | ||
| */ | ||
| Manifest parse_fragment_file(const std::string & file_path) const; |
There was a problem hiding this comment.
parse_fragment_file() documentation states fragments do NOT carry manifest_version, but the implementation allows fragments to declare it (and only injects it when missing). Update the doc comment to match behavior, or enforce rejection of manifest_version in fragments if that’s the intent.
| if (discovery_mgr_) { | ||
| if (auto * manifest = discovery_mgr_->get_manifest_manager()) { | ||
| if (!manifest->get_manifest_path().empty()) { | ||
| manifest->reload_manifest(); | ||
| } | ||
| } | ||
| } | ||
| refresh_cache(); | ||
| } |
There was a problem hiding this comment.
handle_entity_change_notification() can be called from any plugin thread, but it directly calls refresh_cache(), which also runs periodically on refresh_timer_. In HYBRID mode this can concurrently invoke HybridDiscoveryStrategy::refresh() (which explicitly assumes single-threaded calls and executes the pipeline without a lock), creating a real data race. Please serialize refreshes (e.g., a dedicated mutex around the whole refresh path used by both the timer and notifications, or marshal the work onto the same executor thread as the timer) and update the thread-safety docs accordingly.
| // Delegate to GatewayNode::handle_entity_change_notification which | ||
| // serializes concurrent plugin notifications onto the discovery thread. | ||
| // Pass the scope by value because the caller's copy may disappear | ||
| // before the worker thread dequeues the request. |
There was a problem hiding this comment.
The comment says this path “serializes concurrent plugin notifications onto the discovery thread” and that the scope is passed by value for a queued worker, but the code calls handle_entity_change_notification(scope) synchronously and passes scope by reference. Please either implement the intended queuing/copying behavior, or adjust the comments so they reflect the current synchronous call semantics.
| // Delegate to GatewayNode::handle_entity_change_notification which | |
| // serializes concurrent plugin notifications onto the discovery thread. | |
| // Pass the scope by value because the caller's copy may disappear | |
| // before the worker thread dequeues the request. | |
| // Forward the notification directly to GatewayNode for immediate handling. | |
| // This call is synchronous at this layer; any internal serialization, | |
| // copying, or thread handoff must be performed by GatewayNode itself. |
| * indicated area / component. The entry point is thread-safe: concurrent | ||
| * callers serialize on the cache's internal mutex. |
There was a problem hiding this comment.
The header comment claims concurrent callers “serialize on the cache’s internal mutex”, but handle_entity_change_notification() / refresh_cache() touch more than the cache (e.g., DiscoveryManager::refresh_topic_map() -> HybridDiscoveryStrategy::refresh()), which is not protected by the cache mutex. After adding proper serialization for refresh (or queueing onto the discovery thread), please update this comment to describe the real thread-safety mechanism.
| * indicated area / component. The entry point is thread-safe: concurrent | |
| * callers serialize on the cache's internal mutex. | |
| * indicated area / component. The entry point is safe to call from | |
| * multiple threads, but that safety comes from the gateway's refresh | |
| * serialization/coordination path in the implementation, not from | |
| * `ThreadSafeEntityCache` alone, because a refresh also touches discovery | |
| * state beyond the cache. |
| // manifest_version is auto-injected by parse_fragment_file when the | ||
| // fragment omits it, so we don't check that field here - fragments are | ||
| // free to declare or omit it without penalty. | ||
| if (!fragment.metadata.name.empty() || !fragment.metadata.version.empty()) reject("metadata"); |
There was a problem hiding this comment.
apply_fragments claims fragments may not redeclare base-owned fields, but the checks are incomplete: (1) metadata is only rejected when name or version is set, so a fragment can still set description/created_at without triggering FRAGMENT_FORBIDDEN_FIELD; (2) the manifest discovery config (Manifest::config, sourced from the top-level discovery: key) is not rejected at all. Consider either explicitly checking for discovery/config and all metadata fields, or changing the docs/behavior to clearly allow and merge these fields.
| if (!fragment.metadata.name.empty() || !fragment.metadata.version.empty()) reject("metadata"); | |
| if ( | |
| !fragment.metadata.name.empty() || | |
| !fragment.metadata.version.empty() || | |
| !fragment.metadata.description.empty() || | |
| !fragment.metadata.created_at.empty()) | |
| { | |
| reject("metadata"); | |
| } | |
| if (!fragment.config.empty()) reject("discovery"); |
Summary
Implements issue #376.
Plugin SDK gains two new pieces that together let plugins mutate the entity surface at runtime and have the gateway reflect those changes immediately:
PluginContext::notify_entities_changed(EntityChangeScope)(plugin API v7) - a lifecycle hook a plugin calls after it finishes adding or removing entities on disk or in the running runtime. Default implementation is a no-op so v6 plugins keep working unchanged; a v7 gateway responds by reloading the manifest source and re-running discovery for the entity cache.discovery.manifest.fragments_dir- a directory of drop-in*.yaml/*.ymlmanifest chunks that the gateway scans on every manifest load / reload and merges on top of the base manifest. Fragments contributeapps,components,functions; reserved top-level fields (areas,metadata,scripts,capabilities,lock_overrides) are rejected with aFRAGMENT_FORBIDDEN_FIELDvalidation error. Files merged in alphabetical order so duplicate-id errors are deterministic.The two are intentionally decoupled from
ResourceChangeNotifier, which stays for resource-item change events (faults, data, configurations). Entity-surface refresh and resource-item streaming are different concerns - plugins that do both simply call both APIs.No SOVD admin endpoint was added: the surface-refresh trigger is a plugin-side call against
PluginContext, not an HTTP route.Issue
Type
Testing
ManifestManager::apply_fragmentsmerge semantics (no fragments, missing dir, single merge, deterministic multi-file order, non-yaml skipped, forbidden areas, duplicate ids, reload picks up added + removed files, setter round-trip).EntityChangeScope+ the default no-opPluginContext::notify_entities_changedbackwards-compat guarantee.GatewayNodethrough the full lifecycle: fragment dropped after startup + notify -> visible; fragment removed + notify -> gone; notify-with-no-fragments is a safe no-op.Notes for reviewers
fix/aggregation-proxy-logs-and-hosts(PR fix(aggregation): hierarchical/leaf Component routing with provenance and warnings #374). Rebasing ontomainonce fix(aggregation): hierarchical/leaf Component routing with provenance and warnings #374 merges is trivial - no overlap with its changes.EntityChangeScopeis informational in v7 (gateway always does a fullrefresh_cache); the interface is in place so a future scoped-rediscovery optimisation lands behind it without an API change.src/ros2_medkit_gateway/design/plugin_entity_notifications.rst;CHANGELOG.rsthas an Unreleased section.