Support remote sandbox skill script staging#918
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to stage and execute skill execution assets in remote sandboxes. It includes utility modules to discover, manifest, and hash execution assets, a synchronization module to copy these assets to remote sandboxes (handling binary files via base64 and preserving POSIX permissions), and path-rewriting logic in the MCP server runner to map local or virtual skill paths to remote locations during terminal tool execution. The reviewer's feedback highlights critical cross-platform compatibility issues on Windows hosts due to backslash/forward-slash mismatches, memory safety concerns when hashing large binary assets, and several defensive type-checking opportunities to prevent runtime crashes.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| host_file_paths = { | ||
| str((Path(asset_root) / relative_path).resolve()): relative_path | ||
| for relative_path in relative_paths | ||
| } |
There was a problem hiding this comment.
On Windows hosts, Path.resolve() resolves paths using backslashes (e.g., C:\path\to\skill\run.py). However, LLM-generated commands or tool parameters often use forward slashes (e.g., C:/path/to/skill/run.py). This mismatch causes host_path in rewritten to fail, completely skipping the path rewriting for those files. Registering both forward-slash and backward-slash normalized versions of the host paths ensures robust cross-platform path rewriting.
host_file_paths = {}
for relative_path in relative_paths:
resolved_path = str((Path(asset_root) / relative_path).resolve())
host_file_paths[resolved_path] = relative_path
resolved_path_forward = resolved_path.replace("\\", "/")
if resolved_path_forward != resolved_path:
host_file_paths[resolved_path_forward] = relative_path| root_referenced = asset_root in rewritten or any( | ||
| self._skill_root_occurs_in_command(rewritten, alias) | ||
| for alias in path_aliases | ||
| ) | ||
| file_referenced = any(host_path in rewritten for host_path in host_file_paths) |
There was a problem hiding this comment.
Similarly to the file path mismatch, asset_root can contain backslashes on Windows, whereas the command text might use forward slashes. Checking both the native and forward-slash normalized versions of asset_root ensures that root references are correctly detected.
asset_root_forward = asset_root.replace("\\", "/")
root_referenced = (
asset_root in rewritten
or (asset_root_forward != asset_root and asset_root_forward in rewritten)
or any(
self._skill_root_occurs_in_command(rewritten, alias)
for alias in path_aliases
)
)
file_referenced = any(host_path in rewritten for host_path in host_file_paths)| rewritten = self._rewrite_cd_command_root( | ||
| rewritten, | ||
| asset_root, | ||
| remote_root, | ||
| ) |
There was a problem hiding this comment.
To ensure that cd commands targeting the asset root are correctly rewritten on Windows when forward slashes are used, we should also rewrite the forward-slash normalized version of asset_root.
| rewritten = self._rewrite_cd_command_root( | |
| rewritten, | |
| asset_root, | |
| remote_root, | |
| ) | |
| rewritten = self._rewrite_cd_command_root( | |
| rewritten, | |
| asset_root, | |
| remote_root, | |
| ) | |
| if asset_root_forward != asset_root: | |
| rewritten = self._rewrite_cd_command_root( | |
| rewritten, | |
| asset_root_forward, | |
| remote_root, | |
| ) |
| asset_root = Path(str(skill_config.get("asset_root", "") or "")).resolve() | ||
| if not asset_root.exists(): | ||
| raise RuntimeError(f"Skill '{skill_name}' asset root does not exist: {asset_root}") |
There was a problem hiding this comment.
If asset_root is empty or None in skill_config, Path("") resolves to the current working directory (Path.cwd()). Since the current working directory always exists, asset_root.exists() will evaluate to True, leading to a silent fallback and potentially syncing unintended files from the host's working directory. We should explicitly validate that asset_root is configured and non-empty.
| asset_root = Path(str(skill_config.get("asset_root", "") or "")).resolve() | |
| if not asset_root.exists(): | |
| raise RuntimeError(f"Skill '{skill_name}' asset root does not exist: {asset_root}") | |
| asset_root_str = skill_config.get("asset_root") | |
| if not asset_root_str: | |
| raise RuntimeError(f"Skill '{skill_name}' asset_root is not configured") | |
| asset_root = Path(asset_root_str).resolve() | |
| if not asset_root.exists(): | |
| raise RuntimeError(f"Skill '{skill_name}' asset root does not exist: {asset_root}") |
| for rel_path in manifest.relative_paths: | ||
| asset_path = manifest.root / rel_path | ||
| mode_bits = asset_path.stat().st_mode & 0o777 | ||
| hasher.update(rel_path.encode("utf-8")) | ||
| hasher.update(b"\0") | ||
| hasher.update(f"{mode_bits:o}".encode("ascii")) | ||
| hasher.update(b"\0") | ||
| hasher.update(asset_path.read_bytes()) | ||
| hasher.update(b"\0") |
There was a problem hiding this comment.
asset_path.read_bytes() reads the entire file into memory. If a skill contains large binary assets (such as models, datasets, or large images), this can cause high memory usage or Out-Of-Memory (OOM) errors. Reading and hashing the file in chunks (e.g., 64KB) is much more memory-efficient and safer.
| for rel_path in manifest.relative_paths: | |
| asset_path = manifest.root / rel_path | |
| mode_bits = asset_path.stat().st_mode & 0o777 | |
| hasher.update(rel_path.encode("utf-8")) | |
| hasher.update(b"\0") | |
| hasher.update(f"{mode_bits:o}".encode("ascii")) | |
| hasher.update(b"\0") | |
| hasher.update(asset_path.read_bytes()) | |
| hasher.update(b"\0") | |
| for rel_path in manifest.relative_paths: | |
| asset_path = manifest.root / rel_path | |
| mode_bits = asset_path.stat().st_mode & 0o777 | |
| hasher.update(rel_path.encode("utf-8")) | |
| hasher.update(b"\0") | |
| hasher.update(f"{mode_bits:o}".encode("ascii")) | |
| hasher.update(b"\0") | |
| with asset_path.open("rb") as f: | |
| while chunk := f.read(65536): | |
| hasher.update(chunk) | |
| hasher.update(b"\0") |
| # Ensure tool_list is loaded | ||
| if not self.tool_list or not context: | ||
| if not self.tool_list: | ||
| return False |
There was a problem hiding this comment.
If parameter is explicitly passed as None or is not a dictionary (e.g., due to an invalid tool call payload), attempting to assign keys like parameter["session_id"] will raise a TypeError. Adding a defensive type check prevents potential runtime crashes.
| # Ensure tool_list is loaded | |
| if not self.tool_list or not context: | |
| if not self.tool_list: | |
| return False | |
| # Ensure tool_list is loaded | |
| if not self.tool_list or not isinstance(parameter, dict): | |
| return False |
| async def _require_success(result: dict[str, Any], phase: str) -> None: | ||
| if result and result.get("success", True): | ||
| return |
There was a problem hiding this comment.
result is assumed to be a dictionary when calling result.get("success", True). If result is a string, list, or other non-dict type (e.g., due to an unexpected response or test mock), this will raise an AttributeError. Checking isinstance(result, dict) first makes this helper much more robust.
| async def _require_success(result: dict[str, Any], phase: str) -> None: | |
| if result and result.get("success", True): | |
| return | |
| async def _require_success(result: Any, phase: str) -> None: | |
| if isinstance(result, dict) and result.get("success", True): | |
| return |
| def _resolve_entrypoint_metadata(self, skill_id: str) -> dict[str, Any]: | ||
| entrypoint_id = skill_id.split(":", 1)[1] if ":" in skill_id else skill_id | ||
| for entrypoint in self._plugin.manifest.entrypoints.get("skills", ()): | ||
| if entrypoint.entrypoint_id == entrypoint_id: | ||
| return dict(entrypoint.metadata or {}) | ||
| return {} |
There was a problem hiding this comment.
If entrypoints is missing or None in the plugin manifest, calling self._plugin.manifest.entrypoints.get will raise an AttributeError. Using getattr with a default value prevents potential crashes.
| def _resolve_entrypoint_metadata(self, skill_id: str) -> dict[str, Any]: | |
| entrypoint_id = skill_id.split(":", 1)[1] if ":" in skill_id else skill_id | |
| for entrypoint in self._plugin.manifest.entrypoints.get("skills", ()): | |
| if entrypoint.entrypoint_id == entrypoint_id: | |
| return dict(entrypoint.metadata or {}) | |
| return {} | |
| def _resolve_entrypoint_metadata(self, skill_id: str) -> dict[str, Any]: | |
| entrypoint_id = skill_id.split(":", 1)[1] if ":" in skill_id else skill_id | |
| entrypoints = getattr(self._plugin.manifest, "entrypoints", None) or {} | |
| for entrypoint in entrypoints.get("skills", ()): | |
| if entrypoint.entrypoint_id == entrypoint_id: | |
| return dict(entrypoint.metadata or {}) | |
| return {} |
Summary
Test Plan