Manifest Schema is inadequate for multiplexed commands (Subcommands, Contextual Autocompletion, and Dynamic Outputs) #1

Closed
opened 2026-03-17 23:01:58 -06:00 by jorge · 0 comments
Owner

The current TOML manifest schema (as shown in PROGRAMS.md) models commands as a flat list of arguments and options. While this works well for single-purpose utilities like list or search, it breaks down for multiplexed binaries (e.g., git, pkg, network).

Currently, the schema treats subcommands (like commit or install) as standard string arguments. This flat architecture introduces severe limitations in autocompletion, output rendering, and capability security.

1: Subcommands have distinct argument and flag schemas

Right now, the manifest for git specifies a single subcommand argument, followed by an arguments string.

For devs:
The terminal cannot validate a command line ahead of time. If a user types git commit --branch main, the terminal either accepts --branch (if it is defined globally) or rejects it outright; it has no way of knowing that --branch belongs to checkout, not commit. This undermines the declarative CLI validation that is central to the shell's design.

For users:
Autocompletion is effectively broken. If a user types pkg <TAB>, the shell doesn't know whether to suggest install, remove, or search. If they type pkg install <TAB>, the shell cannot intelligently swap to package-name completion, because it doesn't understand that install changed the context of the subsequent arguments.
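
To make the gap concrete, here is a minimal Python sketch contrasting the flat model with a hypothetical nested one. The dictionaries stand in for parsed TOML; the key names (flags, subcommands) and the validate and complete_subcommand helpers are assumptions made for illustration, not the schema in PROGRAMS.md.

```python
# Hypothetical manifests as plain dicts (stand-ins for parsed TOML).
# Key names are assumptions for illustration, not the PROGRAMS.md schema.

FLAT_GIT = {
    # Flat model: every flag is global; the subcommand is just a string argument.
    "flags": {"--branch", "--message"},
}

NESTED_GIT = {
    # Nested model: each subcommand owns its own flag set.
    "flags": set(),
    "subcommands": {
        "commit": {"flags": {"--message"}},
        "checkout": {"flags": {"--branch"}},
    },
}


def validate(manifest, argv):
    """Return a list of error strings for argv (program name excluded)."""
    allowed = set(manifest.get("flags", ()))
    subcommands = manifest.get("subcommands")
    rest = list(argv)
    if subcommands is not None:
        if not rest or rest[0] not in subcommands:
            return ["unknown or missing subcommand"]
        name = rest.pop(0)
        # Only the selected subcommand's flags become valid.
        allowed |= subcommands[name]["flags"]
    return [
        f"unknown flag {token}"
        for token in rest
        if token.startswith("--") and token not in allowed
    ]


def complete_subcommand(manifest, prefix=""):
    """Candidates for `prog <TAB>`; the flat model has nothing to offer."""
    names = manifest.get("subcommands", {})
    return sorted(n for n in names if n.startswith(prefix))
```

With the flat manifest, git commit --branch main passes validation because --branch is global; with the nested one it is rejected before anything is spawned, and pkg <TAB> style completion has a list of names to draw from.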

2: Context-Blind Dynamic Autocompletion

The manifest allows for type = "dynamic" with a provider = "get_branches" IPC endpoint. However, completions are rarely context-free.

For devs:
If a user types git push origin <TAB>, the terminal needs to query the git autocomplete provider for branches. But to provide accurate results, the daemon needs to know that origin was the preceding argument (so it can fetch remote branches instead of local ones). Furthermore, querying an IPC endpoint for completion might require the program's daemon to spin up or access the filesystem. The current schema provides no mechanism for the terminal to pass the current AST or context state to the IPC provider.
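
One way to picture the missing piece is a small context object that the terminal sends along with the completion request. The sketch below is hypothetical throughout: CompletionContext, its fields, and the toy branch data are invented for illustration, and the real provider would be reached over IPC rather than called directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletionContext:
    """What the terminal could send with a completion request (hypothetical)."""
    program: str
    subcommand: str
    preceding_args: tuple = ()  # positional args already typed
    partial: str = ""           # text under the cursor


# Toy repository state standing in for what the daemon would look up.
LOCAL_BRANCHES = ["main", "feature/login"]
REMOTE_BRANCHES = {"origin": ["main", "release/1.0"]}


def get_branches(ctx):
    """Hypothetical provider: chooses the branch pool from the preceding args."""
    remote = ctx.preceding_args[0] if ctx.preceding_args else None
    # Fall back to local branches when no known remote precedes the cursor.
    pool = REMOTE_BRANCHES.get(remote, LOCAL_BRANCHES)
    return [b for b in pool if b.startswith(ctx.partial)]
```

Without preceding_args, the provider can only ever return the local list; with it, git push origin <TAB> can be answered from the remote's branches.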

3: Polymorphic Output Fields

PROGRAMS.md specifies that a program declares its output type (e.g., output = "record_list"). The terminal uses this to format the data.

For devs:
git status might emit a record_list containing [file_path, file_state]. git branch might emit a record_list containing [branch_name, upstream_tracking, is_active].
If the output type is declared at the root level of the git manifest, the terminal's formatting engine doesn't know which fields to expect. Downstream pipeline tools (like where or slice) also cannot inspect the manifest to validate user expressions (e.g., git branch | where is_active = true) before execution.
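
The following sketch shows what pre-execution validation could look like if field lists were declared per subcommand. The field names come from the examples above; the manifest layout (a fields key next to output) and the check_where helper are assumptions for illustration only.

```python
# Field names are taken from the examples above; the manifest layout is assumed.
ROOT_LEVEL = {"output": "record_list"}  # today: a type, but no field list

PER_SUBCOMMAND = {
    "status": {
        "output": "record_list",
        "fields": ["file_path", "file_state"],
    },
    "branch": {
        "output": "record_list",
        "fields": ["branch_name", "upstream_tracking", "is_active"],
    },
}


def check_where(fields, expression):
    """Check that the field in `where <field> = <value>` is declared.

    Returns None when valid, otherwise an error string. Only the
    `field = value` form is handled in this sketch.
    """
    field_name = expression.split("=", 1)[0].strip()
    if fields is None:
        return "cannot validate: manifest declares no fields"
    if field_name not in fields:
        return f"unknown field '{field_name}'"
    return None
```

Here git branch | where is_active = true validates, git status | where is_active = true is rejected before execution, and the root-level declaration gives the checker nothing to work with.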

4: Security and Capability Over-scoping

ZerOS relies on a strict capability-based security model. A process is only granted the capabilities it needs to execute.

For devs:
Consider the pkg command.

  • pkg search needs network access (or read access to /var/pkg_index).
  • pkg install needs network access and write access to /pkg/.

If capabilities are resolved at the program level before the ELF binary is spawned via sys_spawn, the OS must grant write access to /pkg/ every time the user runs pkg search. This violates the principle of least privilege. The OS needs a way to scope required capabilities to the specific subcommand being invoked, before the process is spawned.
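
The difference can be sketched as a capability lookup that runs before the process is spawned. The capability strings ("net", "write:/pkg/") and the manifest keys below are invented for illustration; ZerOS's actual capability names and resolution logic may differ.

```python
# Capability strings are invented for illustration; real ZerOS names may differ.
PROGRAM_LEVEL = {
    # Program-level scoping: the union of everything any subcommand needs.
    "capabilities": {"net", "write:/pkg/"},
}

PER_SUBCOMMAND = {
    # Subcommand-level scoping: each subcommand lists only what it needs.
    "subcommands": {
        "search": {"capabilities": {"net"}},
        "install": {"capabilities": {"net", "write:/pkg/"}},
    },
}


def capabilities_for(manifest, argv):
    """Capabilities to grant before spawning (argv excludes the program name)."""
    granted = set(manifest.get("capabilities", ()))
    subcommands = manifest.get("subcommands", {})
    if argv and argv[0] in subcommands:
        granted |= subcommands[argv[0]]["capabilities"]
    return granted
```

Under program-level scoping, pkg search still receives write access to /pkg/; under subcommand-level scoping it receives network access only, while pkg install keeps both.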

Examples to Test Against

Any proposed redesign of the TOML schema and shell parser must be able to gracefully handle the following scenarios:

  1. ping 1.1.1.1 (Standard arg validation)
  2. git push origin main (Subcommand with its own positional args and contextual completion)
  3. pkg install <TAB> (Subcommand requiring dynamic completion and elevated filesystem capabilities)
  4. system vs system --raw (Ensuring the global flag doesn't interfere with subcommand parsing)
jorge self-assigned this 2026-03-17 23:01:58 -06:00
jorge closed this issue 2026-03-18 13:45:29 -06:00

Reference
zero/ZerOS#1