ADR 0004: Operator Feature Parity
Status
Proposed
Context
The current rune-operator CRD (RuneBenchmark) is missing key fields required to achieve full feature parity with the rune core engine. Specifically, it lacks:
1. Agent Routing (Agent): The ability to specify which agent to run (currently locked to HolmesGPT).
2. Cost Safety (Pre-Flight Estimates): The reconciliation loop submits jobs without checking the fail-closed cost estimation gates.
3. Attestations (AttestationRequired): The ability to demand SLSA L3 signed provenance from the core engine.
Decision
To bring the operator to feature parity with the platform's API:
1. Update api/v1alpha1/runebenchmark_types.go to include Agent string and AttestationRequired bool.
2. Modify controllers/runebenchmark_controller.go to explicitly issue a POST /v1/estimates call and halt reconciliation if the confidence score is < 0.95.
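As a rough illustration of the first item, the two new fields might look like the sketch below. The struct is trimmed to the additions only, the JSON names are taken from the sample CR in this ADR, and `omitempty`, the `+optional` markers, and the `encodeSpec` helper are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuneBenchmarkSpec is trimmed to the two fields this ADR adds. The real
// struct in api/v1alpha1/runebenchmark_types.go carries more fields.
type RuneBenchmarkSpec struct {
	// Agent selects the agent used by the agentic-agent workflow.
	// +optional
	Agent string `json:"agent,omitempty"`

	// AttestationRequired asks the core engine for signed provenance.
	// +optional
	AttestationRequired bool `json:"attestationRequired,omitempty"`
}

// encodeSpec is a hypothetical helper that renders a spec as JSON so the
// serialized field names can be inspected.
func encodeSpec(s RuneBenchmarkSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encodeSpec(RuneBenchmarkSpec{Agent: "holmes", AttestationRequired: true}))
}
```

With `omitempty`, a CR that sets neither field serializes without them, so existing v1alpha1 objects stay valid.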
Scope of the estimates pre-flight gate
The POST /v1/estimates call must be issued only for workflows that involve cloud cost — specifically when spec.VastAI is true. For local-only workflows (VastAI: false), the estimates call is skipped because there is no cloud cost to gate. This avoids creating a hard dependency on the estimates endpoint for zero-cost jobs.
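The gate described above could be sketched as follows. The `Estimator` interface, the `preflight` function, and the stub are hypothetical names that stand in for the real HTTP call; only the `VastAI` condition and the 0.95 threshold come from this ADR. Treating a failed estimates call as a halt is an assumption based on the "fail-closed" wording in the Context section.

```go
package main

import (
	"errors"
	"fmt"
)

// minConfidence is the threshold named in this ADR.
const minConfidence = 0.95

// ErrLowConfidence signals that reconciliation should halt.
var ErrLowConfidence = errors.New("estimate confidence below threshold")

// Estimator is a hypothetical abstraction over POST /v1/estimates.
type Estimator interface {
	Confidence(model string) (float64, error)
}

// preflight returns nil when the job may be submitted.
func preflight(vastAI bool, model string, est Estimator) error {
	if !vastAI {
		// Local-only workflow: no cloud cost, so the call is skipped.
		return nil
	}
	score, err := est.Confidence(model)
	if err != nil {
		// Assumption: fail closed when the endpoint is unreachable.
		return fmt.Errorf("estimates call failed: %w", err)
	}
	if score < minConfidence {
		return fmt.Errorf("%w: got %.2f, need %.2f", ErrLowConfidence, score, minConfidence)
	}
	return nil
}

// stubEstimator returns a fixed score and counts calls; illustration only.
type stubEstimator struct {
	score float64
	calls int
}

func (s *stubEstimator) Confidence(string) (float64, error) {
	s.calls++
	return s.score, nil
}

func main() {
	est := &stubEstimator{score: 0.90}
	fmt.Println(preflight(false, "example-model", est)) // local job: gate skipped
	fmt.Println(preflight(true, "example-model", est))  // cloud job: halted
}
```

Keeping the gate behind an interface lets the controller be tested without a live estimates endpoint.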
Field mapping: CRD spec to CostEstimationRequest
| CRD field (RuneBenchmarkSpec) | Estimates field (CostEstimationRequest) | Notes |
|---|---|---|
| VastAI | vastai | Direct mapping |
| MinDPH | min_dph | Direct mapping |
| MaxDPH | max_dph | Direct mapping |
| Model | model | Direct mapping |
| TimeoutSeconds | estimated_duration_seconds | Use TimeoutSeconds as the duration estimate; default 3600 if unset |
AWS, GCP, Azure, and local hardware fields should be set to their zero-value defaults (the operator does not currently support these providers).
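The mapping in the table might translate into a conversion function along these lines. The Go types of the spec fields (for example `float64` for the DPH bounds) and the function name `toEstimateRequest` are assumptions, and the AWS, GCP, Azure, and local hardware fields are omitted from this sketch rather than spelled out as zero values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuneBenchmarkSpec lists only the fields that feed the estimate.
// Field types are assumed for this sketch.
type RuneBenchmarkSpec struct {
	VastAI         bool
	MinDPH         float64
	MaxDPH         float64
	Model          string
	TimeoutSeconds int
}

// CostEstimationRequest uses the JSON names from the mapping table.
type CostEstimationRequest struct {
	VastAI                   bool    `json:"vastai"`
	MinDPH                   float64 `json:"min_dph"`
	MaxDPH                   float64 `json:"max_dph"`
	Model                    string  `json:"model"`
	EstimatedDurationSeconds int     `json:"estimated_duration_seconds"`
}

// defaultDurationSeconds applies when TimeoutSeconds is unset.
const defaultDurationSeconds = 3600

// toEstimateRequest applies the table row by row.
func toEstimateRequest(s RuneBenchmarkSpec) CostEstimationRequest {
	duration := s.TimeoutSeconds
	if duration <= 0 {
		// Assumption: non-positive values are treated as unset.
		duration = defaultDurationSeconds
	}
	return CostEstimationRequest{
		VastAI:                   s.VastAI,
		MinDPH:                   s.MinDPH,
		MaxDPH:                   s.MaxDPH,
		Model:                    s.Model,
		EstimatedDurationSeconds: duration,
	}
}

func main() {
	req := toEstimateRequest(RuneBenchmarkSpec{VastAI: true, MinDPH: 0.5, MaxDPH: 2, Model: "example-model"})
	body, _ := json.Marshal(req)
	fmt.Println(string(body))
}
```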
Field semantics
- Agent string: Added to RuneBenchmarkSpec. Used only by the agentic-agent workflow, where it is inserted into buildPayload(). Defaults to "holmes" if empty (matching the core API default). The benchmark and ollama-instance workflows ignore this field.
- AttestationRequired bool: Added to RuneBenchmarkSpec. Used by the benchmark workflow, where it is forwarded as attestation_required in buildPayload(). The agentic-agent and ollama-instance workflows ignore this field.
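A minimal sketch of these semantics is shown below. It is not the operator's actual buildPayload(): the signature, the `Workflow` field, and the `agent` payload key are assumptions, while the workflow names, the `"holmes"` default, and the `attestation_required` key come from this ADR.

```go
package main

import "fmt"

// RuneBenchmarkSpec is reduced to the fields this example needs.
// The Workflow field name is an assumption.
type RuneBenchmarkSpec struct {
	Workflow            string
	Agent               string
	AttestationRequired bool
}

// defaultAgent matches the core API default named in this ADR.
const defaultAgent = "holmes"

// buildPayload is a stand-in that adds only the two new fields.
func buildPayload(s RuneBenchmarkSpec) map[string]any {
	payload := map[string]any{}
	switch s.Workflow {
	case "agentic-agent":
		agent := s.Agent
		if agent == "" {
			agent = defaultAgent
		}
		payload["agent"] = agent // key name assumed
	case "benchmark":
		payload["attestation_required"] = s.AttestationRequired
	}
	// ollama-instance and any other workflow ignore both fields.
	return payload
}

func main() {
	fmt.Println(buildPayload(RuneBenchmarkSpec{Workflow: "agentic-agent"}))
	fmt.Println(buildPayload(RuneBenchmarkSpec{Workflow: "benchmark", AttestationRequired: true}))
	fmt.Println(buildPayload(RuneBenchmarkSpec{Workflow: "ollama-instance", Agent: "holmes"}))
}
```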
CRD regeneration
The operator uses kubebuilder markers but has no Makefile or controller-gen toolchain wired in. After modifying runebenchmark_types.go:
- DeepCopy (zz_generated.deepcopy.go): Agent (string) and AttestationRequired (bool) are value types, so no pointer handling is needed. Update the DeepCopyInto method for RuneBenchmarkSpec manually by adding the two field copies.
- CRD YAML (config/crd/bases/bench.rune.ai_runebenchmarks.yaml): Add the two new properties under spec.properties manually.
- Helm CRD distribution: Copy the updated CRD YAML to rune-charts/charts/rune-operator/crds/ (create the directory if needed). Helm applies CRDs before templates automatically.
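For the DeepCopy step, a hand-maintained method might look like the sketch below, assuming the existing method copies fields one by one. The `Model` field stands in for the fields already present. If the existing DeepCopyInto instead starts with a whole-struct assignment (`*out = *in`, the form controller-gen typically emits), the two value-type fields are already covered by that assignment.

```go
package main

import "fmt"

// RuneBenchmarkSpec is reduced to one pre-existing field plus the
// two additions from this ADR.
type RuneBenchmarkSpec struct {
	Model               string
	Agent               string
	AttestationRequired bool
}

// DeepCopyInto copies the receiver into out. Both in and out must be non-nil.
func (in *RuneBenchmarkSpec) DeepCopyInto(out *RuneBenchmarkSpec) {
	out.Model = in.Model
	// New fields: plain value copies, no pointer or slice handling needed.
	out.Agent = in.Agent
	out.AttestationRequired = in.AttestationRequired
}

// DeepCopy returns an independent copy, or nil for a nil receiver.
func (in *RuneBenchmarkSpec) DeepCopy() *RuneBenchmarkSpec {
	if in == nil {
		return nil
	}
	out := new(RuneBenchmarkSpec)
	in.DeepCopyInto(out)
	return out
}

func main() {
	orig := &RuneBenchmarkSpec{Model: "example-model", Agent: "holmes", AttestationRequired: true}
	cp := orig.DeepCopy()
	cp.Agent = "changed"
	fmt.Println(orig.Agent, cp.Agent, cp.AttestationRequired)
}
```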
Sample CR update
Update config/samples/bench_v1alpha1_runebenchmark.yaml with:
spec:
  agent: "holmes" # Optional: defaults to "holmes" for agentic-agent workflow
  attestationRequired: false # Optional: require SLSA L3 attestation for benchmark workflow
Consequences
- Requires updating the CRD structure within the existing v1alpha1 API version. Avoid bumping the API version (e.g., to v1alpha2) unless it becomes a hard blocker, to minimize disruption to existing users.
- Ensures the Operator adheres to the ML4 cost-safety constraints defined in the SYSTEM_PROMPT.md.
- Allows declarative scheduling of any Tier 1, 2, or 3 agent supported by the ecosystem.
- Helm chart for rune-operator gains a crds/ directory, enabling CRD distribution via Helm for the first time.