| Category | Status | Created    | Author        |
| -------- | ------ | ---------- | ------------- |
| Policies | Draft  | 2026-03-13 | Justin Brooks |

Summary

Add an ephemeral plan API that CI pipelines call on pull requests to compute full rendered diffs for each release target — showing exactly what Kubernetes manifests, Terraform resources, or other deployed artifacts would change, like terraform plan output. Results can optionally be posted back to GitHub as PR comments or check runs. No version is created; the plan is computed on the fly and returned to the caller.

Motivation

When a developer opens a pull request that will eventually become a new deployment version, two questions arise before merging:
  1. Which release targets will this version affect?
  2. What exactly will change on each affected target?
Today, neither question can be answered without merging the PR, creating the version, and letting the full promotion lifecycle run. The deployer may have intuition about the impact, but there is no way to get a concrete, rendered diff — the kind of output terraform plan provides — before committing to a deployment. RFC 0002 introduces the Plannable interface on job agents, which can compute rendered output without dispatching a job. But RFC 0002 focuses on the reconciler: plans are computed during the promotion lifecycle to detect no-diff targets and fast-track them. There is no way to trigger a plan before a version exists.

The PR workflow gap

The typical CI workflow for ctrlplane today:
Developer opens PR
  → CI builds artifact
  → PR is reviewed and merged
  → CI creates version (POST /v1/.../versions, status: ready)
  → ctrlplane creates releases for ALL release targets
  → full promotion lifecycle runs (staging → verification → approval → production)
  → deployer discovers which targets were actually affected
The deployer only learns what changed after committing to deployment. For large deployments with tens or hundreds of release targets, this is a significant blind spot. A PR that changes a single Helm values file for one service triggers releases across every cluster, and the deployer won’t know which clusters are truly impacted until the pipeline is running. terraform plan solved this for infrastructure: before applying, you see the full execution plan with resource-level diffs. The same pattern should exist for ctrlplane deployments.

What “plan” means in this context

A dry-run plan computes, for each release target, the full rendered output that the external system (ArgoCD, Terraform Cloud, etc.) would produce for the proposed version — then diffs it against the current deployed state. This is not a hash comparison (RFC 0002) or an affected/unaffected classification. It is the actual diff content:
  • For ArgoCD: the per-resource Kubernetes manifest diff (like argocd app diff)
  • For Terraform Cloud: the resource-level before/after diff (like terraform plan)
  • For Helm: the rendered template diff (like helm diff upgrade)
The diff is what the deployer reviews on the PR, the same way they review terraform plan output today.

Relationship to prior RFCs

  • RFC 0001 (Scoped Versions) — The deployer declares which targets a version affects. Dry-run plans can inform that decision: review the plan on the PR, then create the version with a targetSelector that matches only the affected targets.
  • RFC 0002 (Plan-Based Diff Detection) — Provides the Plannable interface and agent implementations that this RFC consumes. RFC 0002 runs plans inside the reconciler; this RFC exposes plans via an API endpoint before any version exists.

Proposal

API

Add a new endpoint that accepts proposed version data and returns rendered diffs per release target. Nothing is persisted — the plan is ephemeral. Endpoint:
POST /v1/workspaces/{workspaceId}/deployments/{deploymentId}/plan
Request body:
{
  "tag": "pr-123-abc123",
  "config": {},
  "jobAgentConfig": {},
  "metadata": {
    "pr": "123",
    "commit": "abc123"
  }
}
The fields mirror the version creation endpoint but no version row is inserted. The API constructs a transient version object in memory and uses it to build dispatch contexts. Synchronous response (when all agents complete quickly):
{
  "id": "plan_abc123",
  "status": "completed",
  "summary": {
    "total": 50,
    "changed": 3,
    "unchanged": 47,
    "errored": 0,
    "resourceChanges": {
      "add": 1,
      "modify": 4,
      "delete": 0
    }
  },
  "targets": [
    {
      "environmentId": "env_prod",
      "environmentName": "production",
      "resourceId": "res_use1",
      "resourceName": "us-east-1-cluster",
      "hasChanges": true,
      "diff": {
        "raw": "--- current\n+++ proposed\n@@ -12,3 +12,3 @@\n-  image: payments:v1.2.3\n+  image: payments:v1.2.4\n",
        "resources": [
          {
            "kind": "Deployment",
            "name": "payment-api",
            "namespace": "payments",
            "action": "modify",
            "diff": "--- current\n+++ proposed\n@@ -12,3 +12,3 @@\n-  image: payments:v1.2.3\n+  image: payments:v1.2.4\n"
          }
        ]
      }
    },
    {
      "environmentId": "env_prod",
      "environmentName": "production",
      "resourceId": "res_euw1",
      "resourceName": "eu-west-1-cluster",
      "hasChanges": false,
      "diff": null
    }
  ]
}
Each target in the response includes:
  • hasChanges — whether the rendered output differs from the current state
  • diff.raw — human-readable unified diff of the full rendered output
  • diff.resources — structured breakdown of per-resource changes with kind, name, namespace, action (add/modify/delete), and a per-resource diff
Async response (when agents require slow external calls):
{
  "id": "plan_abc123",
  "status": "computing",
  "summary": null,
  "targets": []
}
The CI polls GET /v1/workspaces/{workspaceId}/deployments/{deploymentId}/plan/{planId} until status transitions to completed or failed.
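The client-side polling loop can be sketched in Go. This is a hedged illustration, not part of the proposal: `pollPlan`, `planStatus`, and the injected `fetch` function are hypothetical stand-ins that abstract the HTTP GET against `.../plan/{planId}` so the loop's terminal-state logic is visible on its own.

```go
package main

import (
	"errors"
	"time"
)

// planStatus mirrors the id/status fields of the plan response.
type planStatus struct {
	ID     string
	Status string
}

// pollPlan invokes fetch until the plan reaches a terminal status
// ("completed" or "failed") or the attempt budget runs out. fetch stands
// in for the HTTP GET against the plan endpoint so the loop can be
// exercised without a live server.
func pollPlan(fetch func() (planStatus, error), interval time.Duration, maxAttempts int) (planStatus, error) {
	for i := 0; i < maxAttempts; i++ {
		st, err := fetch()
		if err != nil {
			return planStatus{}, err
		}
		if st.Status == "completed" || st.Status == "failed" {
			return st, nil
		}
		time.Sleep(interval)
	}
	return planStatus{}, errors.New("plan did not complete within attempt budget")
}
```

A real CI script would more likely do this with curl in a shell loop; the Go form just makes the terminal-status check explicit.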

Extended PlanResult type

RFC 0002 defines PlanResult with ContentHash, HasChanges, and a simple Diff string. The dry-run plan requires richer diff data. The type is extended:
type PlanResult struct {
    ContentHash    string
    HasChanges     bool
    RenderedOutput string
    Diff           *PlanDiff
}

type PlanDiff struct {
    Raw       string
    Resources []ResourceChange
}

type ResourceChange struct {
    Kind      string // "Deployment", "Service", "aws_iam_policy"
    Name      string // "payment-api", "module.vpc.aws_subnet"
    Namespace string // Kubernetes namespace, empty for non-k8s
    Action    string // "add", "modify", "delete", "no-op"
    Before    string // Rendered YAML/JSON before (empty for adds)
    After     string // Rendered YAML/JSON after (empty for deletes)
    Diff      string // Unified diff for this resource
}
RFC 0002’s reconciler integration only uses ContentHash and HasChanges. The additional fields (RenderedOutput, Diff) are populated by agents when called through the dry-run plan API and ignored by the reconciler path.

How agents produce diffs

The Plannable interface from RFC 0002 is unchanged — agents return a PlanResult. The difference is what the caller does with it:
  • Reconciler (RFC 0002): Only inspects ContentHash and HasChanges.
  • Dry-run plan API (this RFC): Inspects the full PlanDiff and returns it to the caller.
Agents that want to participate in dry-run plans must populate the Diff field. Agents that only implement hash-based comparison (no diff capability) can still participate — the API response will show hasChanges: true/false but diff will be null.
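One way the endpoint could map these three agent capabilities (full diff, hash-only, no Plannable at all) onto a response entry is sketched below. `TargetPlan` and `toTargetPlan` are hypothetical names introduced for illustration; only `PlanResult` and `PlanDiff` come from the RFC, and they are copied here in trimmed form so the block stands alone.

```go
package main

// Trimmed local copies of the RFC's types, for illustration only.
type PlanDiff struct{ Raw string }

type PlanResult struct {
	ContentHash string
	HasChanges  bool
	Diff        *PlanDiff
}

// TargetPlan is a hypothetical response entry; field names mirror the
// JSON examples in this RFC.
type TargetPlan struct {
	ResourceName string
	HasChanges   *bool     // nil when the agent could not plan at all
	Diff         *PlanDiff // nil for hash-only agents
	Status       string    // "planned" or "unsupported"
}

// toTargetPlan maps an agent result to a response entry. A nil result
// means the agent does not implement Plannable; a non-nil result with a
// nil Diff means the agent only supports hash comparison.
func toTargetPlan(resourceName string, res *PlanResult) TargetPlan {
	if res == nil {
		return TargetPlan{ResourceName: resourceName, Status: "unsupported"}
	}
	hc := res.HasChanges
	return TargetPlan{
		ResourceName: resourceName,
		HasChanges:   &hc,
		Diff:         res.Diff,
		Status:       "planned",
	}
}
```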

ArgoCD

The ArgoCD agent calls the ArgoCD API to produce a real diff. The in-process TemplateApplication function only renders the Application CRD (which always differs because targetRevision changes). The actual diff lives in the Kubernetes manifests that ArgoCD produces after fetching the git repo and rendering the Helm chart or kustomize overlay.
The Plan implementation uses a temporary Application strategy. Calling GetManifests on the existing Application only overrides the revision — it does not pick up changes to Helm values, parameters, kustomize patches, or any other spec field derived from deployment variables. To get a fully accurate manifest diff for any kind of change (revision, variables, config), the agent creates a short-lived Application with auto-sync disabled, waits for ArgoCD to render manifests for it, fetches those manifests, then cleans it up. The flow:
  1. Renders the proposed Application CRD from the dispatch context (same as dispatch time). This CRD reflects all variable and config changes.
  2. Strips any auto-sync policy and sets the sync policy to manual, so the temporary Application will never deploy to the cluster.
  3. Creates the temporary Application in ArgoCD with a unique plan-scoped name (e.g., <original-name>-plan-<short-hash>).
  4. Waits for ArgoCD to compute the desired manifests for the temporary Application. ArgoCD fetches the git repo, renders Helm/kustomize with the full proposed spec (including new values, parameters, revisions), and populates the manifest cache.
  5. Calls GetManifests on the temporary Application to retrieve the fully rendered proposed manifests.
  6. Calls GetManifests on the original Application to retrieve the current manifests.
  7. Deletes the temporary Application.
  8. Computes a per-resource unified diff between the two manifest sets.
Multi-source Applications. ArgoCD v2.6+ supports spec.sources (plural) for Applications that pull from multiple Git repos or Helm charts (e.g., a chart from one repo and values from another). Because the temporary Application is created from the full rendered spec, multi-source applications are handled naturally — the proposed spec’s sources list (with all target revisions) is preserved as-is.
const (
    planLabelKey      = "ctrlplane.dev/plan"
    planCreatedAtKey  = "ctrlplane.dev/plan-created-at"
    planTTL           = 10 * time.Minute
)

func planAppName(originalName string) string {
    h := sha256.Sum256([]byte(originalName + time.Now().String()))
    return fmt.Sprintf("%s-plan-%s", originalName, hex.EncodeToString(h[:4]))
}

func prepareTmpApp(app *v1alpha1.Application, tmpName string) *v1alpha1.Application {
    tmp := app.DeepCopy()
    tmp.Name = tmpName
    tmp.ResourceVersion = ""

    if tmp.Labels == nil {
        tmp.Labels = map[string]string{}
    }
    tmp.Labels[planLabelKey] = "true"
    if tmp.Annotations == nil {
        tmp.Annotations = map[string]string{}
    }
    tmp.Annotations[planCreatedAtKey] = time.Now().UTC().Format(time.RFC3339)

    tmp.Spec.SyncPolicy = &v1alpha1.SyncPolicy{Automated: nil}
    tmp.Operation = nil
    return tmp
}

func (a *ArgoApplication) Plan(
    ctx context.Context,
    dispatchCtx *oapi.DispatchContext,
) (*types.PlanResult, error) {
    serverAddr, apiKey, template, err := ParseJobAgentConfig(
        dispatchCtx.JobAgentConfig,
    )
    if err != nil {
        return nil, err
    }

    proposedApp, err := TemplateApplication(dispatchCtx, template)
    if err != nil {
        return nil, err
    }
    MakeApplicationK8sCompatible(proposedApp)

    client, err := argocdclient.NewClient(&argocdclient.ClientOptions{
        ServerAddr: serverAddr,
        AuthToken:  apiKey,
    })
    if err != nil {
        return nil, fmt.Errorf("create ArgoCD client: %w", err)
    }
    ioCloser, appClient, err := client.NewApplicationClient()
    if err != nil {
        return nil, fmt.Errorf("create application client: %w", err)
    }
    defer ioCloser.Close()

    originalName := proposedApp.Name
    tmpName := planAppName(originalName)
    tmpApp := prepareTmpApp(proposedApp, tmpName)

    upsert := true
    _, err = appClient.Create(ctx, &argocdapplication.ApplicationCreateRequest{
        Application: tmpApp,
        Upsert:      &upsert,
    })
    if err != nil {
        return nil, fmt.Errorf("create temporary plan application: %w", err)
    }
    defer func() {
        cascade := false
        _, _ = appClient.Delete(ctx, &argocdapplication.ApplicationDeleteRequest{
            Name:    &tmpName,
            Cascade: &cascade,
        })
    }()

    if err := waitForManifests(ctx, appClient, tmpName); err != nil {
        return nil, fmt.Errorf("wait for temporary app manifests: %w", err)
    }

    proposedManifests, err := appClient.GetManifests(ctx,
        &argocdapplication.ApplicationManifestQuery{Name: &tmpName},
    )
    if err != nil {
        return nil, fmt.Errorf("get proposed manifests: %w", err)
    }

    currentManifests, err := appClient.GetManifests(ctx,
        &argocdapplication.ApplicationManifestQuery{Name: &originalName},
    )
    if err != nil {
        return buildAddAllResult(proposedManifests)
    }

    return diffManifestSets(currentManifests.Manifests, proposedManifests.Manifests)
}
The waitForManifests helper polls the temporary Application until ArgoCD reports a non-empty manifest set or the context deadline expires:
func waitForManifests(
    ctx context.Context,
    appClient argocdapplication.ApplicationServiceClient,
    name string,
) error {
    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-ticker.C:
            resp, err := appClient.GetManifests(ctx,
                &argocdapplication.ApplicationManifestQuery{Name: &name},
            )
            if err != nil {
                continue
            }
            if len(resp.Manifests) > 0 {
                return nil
            }
        }
    }
}
Cleanup guarantees. The temporary Application is deleted in a defer with cascade: false (the Application never synced, so there are no cluster resources to remove). If the agent crashes before cleanup, orphaned Applications remain in ArgoCD. ArgoCD has no native TTL mechanism for Applications — cleanup of orphans is ctrlplane’s responsibility.
Every temporary Application is labelled ctrlplane.dev/plan: "true" and annotated with ctrlplane.dev/plan-created-at: <RFC3339 timestamp>. A background goroutine in the workspace engine periodically lists Applications matching the plan label, parses the created-at annotation, and deletes any older than planTTL (default 10 minutes):
func (gc *PlanAppGC) Run(ctx context.Context) {
    ticker := time.NewTicker(1 * time.Minute)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            gc.cleanup(ctx)
        }
    }
}

func (gc *PlanAppGC) cleanup(ctx context.Context) {
    selector := fmt.Sprintf("%s=true", planLabelKey)
    apps, err := gc.appClient.List(ctx, &argocdapplication.ApplicationQuery{
        Selector: &selector,
    })
    if err != nil {
        log.Warn("plan GC: list failed", "error", err)
        return
    }
    for _, app := range apps.Items {
        createdAt, err := time.Parse(time.RFC3339, app.Annotations[planCreatedAtKey])
        if err != nil || time.Since(createdAt) < planTTL {
            continue
        }
        cascade := false
        name := app.Name
        _, _ = gc.appClient.Delete(ctx, &argocdapplication.ApplicationDeleteRequest{
            Name:    &name,
            Cascade: &cascade,
        })
        log.Info("plan GC: deleted orphaned plan app", "name", name,
            "age", time.Since(createdAt))
    }
}
The GC runs per ArgoCD server. When the workspace engine starts, it registers a PlanAppGC instance for each configured ArgoCD connection.
The diffManifestSets function parses each manifest as a Kubernetes resource, matches resources by apiVersion/kind/namespace/name, and produces a ResourceChange for each:
  • Resources in proposed but not current → action: "add"
  • Resources in current but not proposed → action: "delete"
  • Resources in both with different content → action: "modify" with unified diff
  • Resources in both with identical content → omitted (no-op)
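The classification rules above can be sketched as follows. This is a simplified illustration, not the real diffManifestSets: it assumes each manifest arrives as a JSON document (the form ArgoCD's GetManifests returns) and uses raw string equality for the modify check, where a real implementation would compare parsed objects and emit a unified diff.

```go
package main

import "encoding/json"

// ResourceChange is a trimmed copy of the RFC's type.
type ResourceChange struct {
	Kind, Name, Namespace, Action string
}

// objMeta extracts just the identity fields used for matching.
type objMeta struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"metadata"`
}

func metaKey(m objMeta) string {
	return m.APIVersion + "|" + m.Kind + "|" + m.Metadata.Namespace + "|" + m.Metadata.Name
}

// classifyManifests applies the add/delete/modify/no-op rules above.
func classifyManifests(current, proposed []string) ([]ResourceChange, error) {
	cur := map[string]string{}
	curMeta := map[string]objMeta{}
	for _, raw := range current {
		var m objMeta
		if err := json.Unmarshal([]byte(raw), &m); err != nil {
			return nil, err
		}
		cur[metaKey(m)] = raw
		curMeta[metaKey(m)] = m
	}

	var changes []ResourceChange
	seen := map[string]bool{}
	for _, raw := range proposed {
		var m objMeta
		if err := json.Unmarshal([]byte(raw), &m); err != nil {
			return nil, err
		}
		k := metaKey(m)
		seen[k] = true
		prev, exists := cur[k]
		switch {
		case !exists:
			changes = append(changes, ResourceChange{m.Kind, m.Metadata.Name, m.Metadata.Namespace, "add"})
		case prev != raw:
			changes = append(changes, ResourceChange{m.Kind, m.Metadata.Name, m.Metadata.Namespace, "modify"})
		default:
			// identical content: no-op, omitted from the result
		}
	}
	for k, m := range curMeta {
		if !seen[k] {
			changes = append(changes, ResourceChange{m.Kind, m.Metadata.Name, m.Metadata.Namespace, "delete"})
		}
	}
	return changes, nil
}
```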

Terraform Cloud

Terraform Cloud speculative plans already produce structured diff output. The Plan implementation triggers a speculative plan run and maps the result:
func (t *TerraformCloud) Plan(
    ctx context.Context,
    dispatchCtx *oapi.DispatchContext,
) (*types.PlanResult, error) {
    run, err := t.client.CreateRun(ctx, RunConfig{
        IsDestroy: false,
        PlanOnly:  true,
        Variables: dispatchCtx.Variables,
    })
    if err != nil {
        return nil, err
    }

    plan, err := t.client.WaitForPlan(ctx, run.ID)
    if err != nil {
        return nil, err
    }

    resources := make([]types.ResourceChange, 0, len(plan.ResourceChanges))
    for _, rc := range plan.ResourceChanges {
        resources = append(resources, types.ResourceChange{
            Kind:   rc.Type,
            Name:   rc.Address,
            Action: mapTerraformAction(rc.Change.Actions),
            Before: rc.Change.Before,
            After:  rc.Change.After,
            Diff:   rc.Change.Diff,
        })
    }

    hasChanges := plan.ResourceAdditions > 0 ||
        len(plan.ResourceChanges) > 0 ||
        plan.ResourceDestructions > 0

    return &types.PlanResult{
        ContentHash: plan.StateHash,
        HasChanges:  hasChanges,
        Diff: &types.PlanDiff{
            Raw:       plan.HumanReadableOutput,
            Resources: resources,
        },
    }, nil
}
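The mapTerraformAction helper referenced above is not defined in this RFC. One plausible sketch, assuming the actions arrays from Terraform's JSON plan representation (where a replacement appears as a delete paired with a create):

```go
package main

// mapTerraformAction collapses a Terraform change's actions list into
// the RFC's add/modify/delete/no-op vocabulary. Replacements (two
// actions, a delete and a create in either order) are reported as
// "modify" here, a deliberate simplification; callers that care could
// surface "replace" separately.
func mapTerraformAction(actions []string) string {
	if len(actions) == 2 {
		// ["delete","create"] or ["create","delete"]: a replace
		return "modify"
	}
	if len(actions) == 0 {
		return "no-op"
	}
	switch actions[0] {
	case "create":
		return "add"
	case "update":
		return "modify"
	case "delete":
		return "delete"
	default:
		return "no-op"
	}
}
```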

GitHub Actions / unsupported agents

Agents that do not implement Plannable return nil from the registry’s Plan method. The dry-run plan endpoint reports these targets as:
{
  "resourceName": "some-target",
  "hasChanges": null,
  "diff": null,
  "status": "unsupported"
}
The CI can still post a PR comment noting that some targets could not be planned.

Plan execution flow

The plan endpoint does not create a version or trigger the reconciler. It constructs the necessary context in-memory and calls agents directly:
1. Parse request body into transient version object
2. Look up deployment and its job agents
3. Find all release targets for this deployment
    (same query as enqueueReleaseTargetsForDeployment)
4. For each release target:
    a. Resolve variables (reuse variableresolver.Resolve)
    b. Build DispatchContext (reuse jobs.Factory.BuildDispatchContext)
    c. Call registry.Plan(agentType, dispatchCtx)
    d. Collect PlanResult
5. Aggregate results into response
6. If github field present, post results to PR
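Steps 4 and 5 can be sketched as a per-target loop. The names below (PlanFunc, TargetOutcome, planAllTargets) are stand-ins invented for this sketch, not the real workspace-engine types; the point is that a failure or unsupported agent on one target does not abort the others.

```go
package main

// DispatchContext and PlanResult are trimmed stand-ins for the real types.
type DispatchContext struct{ TargetID string }

type PlanResult struct{ HasChanges bool }

// PlanFunc abstracts registry.Plan: a nil result with a nil error means
// the agent type does not implement Plannable.
type PlanFunc func(agentType string, ctx DispatchContext) (*PlanResult, error)

// TargetOutcome is one row of the aggregated response.
type TargetOutcome struct {
	TargetID string
	Result   *PlanResult // nil means unsupported
	Err      error
}

// planAllTargets runs the plan for every release target (step 4) and
// collects per-target outcomes for aggregation (step 5). A failure on
// one target is recorded and does not abort the remaining targets.
func planAllTargets(plan PlanFunc, agentType string, targetIDs []string) []TargetOutcome {
	out := make([]TargetOutcome, 0, len(targetIDs))
	for _, id := range targetIDs {
		res, err := plan(agentType, DispatchContext{TargetID: id})
		out = append(out, TargetOutcome{TargetID: id, Result: res, Err: err})
	}
	return out
}
```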
For agents with fast plan steps (ArgoCD with cached manifests), the endpoint can complete synchronously. For slow agents (Terraform Cloud speculative plans taking minutes), the endpoint:
  1. Creates a plan record in a lightweight deployment_plan table with status: "computing".
  2. Enqueues plan computation as background work.
  3. Returns the plan ID immediately.
  4. The CI polls GET .../plan/{planId} until status is completed.
CREATE TABLE deployment_plan (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    workspace_id UUID NOT NULL REFERENCES workspace(id) ON DELETE CASCADE,
    deployment_id UUID NOT NULL REFERENCES deployment(id) ON DELETE CASCADE,
    status TEXT NOT NULL DEFAULT 'computing',
    request JSONB NOT NULL,
    result JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ,
    expires_at TIMESTAMPTZ NOT NULL DEFAULT NOW() + INTERVAL '1 hour'
);
Plans are ephemeral — the expires_at column enables periodic cleanup. No long-term storage is needed.

GitHub integration

When the plan request includes a github field, ctrlplane posts results back to the PR using the GitHub App that is already configured for workflow dispatch. Request with GitHub integration:
{
  "tag": "pr-123-abc123",
  "config": {},
  "metadata": { "pr": "123", "commit": "abc123" },
  "github": {
    "owner": "org",
    "repo": "myapp",
    "sha": "abc123def456",
    "prNumber": 123
  }
}
PR comment format: The comment follows the pattern established by Atlantis and Terraform Cloud, adapted for ctrlplane’s multi-target model:
### Ctrlplane Deployment Plan

**Deployment:** API Service
**Version:** pr-123-abc123

| Environment | Resource           | Changes    | Details                  |
| ----------- | ------------------ | ---------- | ------------------------ |
| production  | us-east-1-cluster  | 1 modified | `Deployment/payment-api` |
| production  | eu-west-1-cluster  | No changes | —                        |
| production  | ap-south-1-cluster | No changes | —                        |
| staging     | staging-cluster    | 1 modified | `Deployment/payment-api` |

**Summary:** 2 of 4 targets affected (1 resource modified)

<details>
<summary>us-east-1-cluster diff</summary>

```diff
--- Deployment/payments/payment-api (current)
+++ Deployment/payments/payment-api (proposed)
@@ -15,3 +15,3 @@
   containers:
   - name: payment-api
-        image: payments:v1.2.3
+        image: payments:v1.2.4
```

</details>
GitHub Check Run (alternative or complement to PR comment): The plan can also be reported as a GitHub Check Run with status success/neutral/failure and structured annotations per changed resource. Check runs integrate with branch protection rules, allowing teams to require a passing plan before merge.
Implementation: The workspace engine’s existing GitHub App integration follows the same Go client pattern as the ArgoCD integration. The PR comment/check run posting uses the GitHub App’s installation token (the same token acquisition flow used by GoGitHubWorkflowDispatcher in apps/workspace-engine/svc/controllers/jobdispatch/jobagents/github/).

Optional: pull_request webhook handler

As a convenience layer, ctrlplane can optionally react to GitHub pull_request webhook events to auto-trigger plans without CI changes. The GitHub webhook handler in apps/api/src/routes/github/index.ts currently only handles workflow_run events:
if (eventType === "workflow_run")
  await handleWorkflowRunEvent(req.body as WorkflowRunEvent);
Adding a pull_request handler:
if (eventType === "workflow_run")
  await handleWorkflowRunEvent(req.body as WorkflowRunEvent);
else if (eventType === "pull_request")
  await handlePullRequestEvent(req.body as PullRequestEvent);
The PR metadata types already exist in packages/validators/src/github/index.ts (GithubPullRequestVersion, PullRequestMetadataKey, PullRequestConfigKey) but are not wired up to any handler. The handlePullRequestEvent function would:
  1. Extract the repo owner/name and head SHA from the event payload.
  2. Find deployments whose job agent config references this repo (by matching owner and repo fields in the GitHub job agent config).
  3. For each matching deployment, trigger a plan using the head SHA as the proposed version tag.
  4. Post results back as a PR comment or check run.
This is optional — the CI-triggered API is the primary integration path. The webhook handler is a convenience for teams that want automatic plans without modifying their CI pipelines.

Examples

ArgoCD: Helm chart change on a PR

A deployment manages 20 clusters across 4 environments using ArgoCD with a monorepo Helm chart. A developer opens a PR that modifies charts/payment/values.yaml.
# In the CI pipeline triggered by the PR:
curl -X POST \
  "https://api.ctrlplane.dev/v1/workspaces/$WS/deployments/$DEPLOY/plan" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "tag": "pr-456-'$(git rev-parse --short HEAD)'",
    "config": {},
    "metadata": {
      "commit": "'$(git rev-parse HEAD)'",
      "pr": "456",
      "branch": "'$(git branch --show-current)'"
    },
    "github": {
      "owner": "myorg",
      "repo": "platform",
      "sha": "'$(git rev-parse HEAD)'",
      "prNumber": 456
    }
  }'
ctrlplane:
  1. Builds a transient version with the PR’s head commit.
  2. For each of the 20 release targets, creates a temporary plan Application reflecting the PR commit, waits for ArgoCD to render its manifests, and retrieves them with GetManifests.
  3. Diffs the proposed manifests against the currently deployed manifests.
  4. Returns: 4 targets show changes (the clusters running the payment service), 16 show no changes.
  5. Posts a PR comment showing the diff table with expandable per-target diffs.
The developer sees exactly which clusters are affected and what Kubernetes resources change — before merging.

Terraform Cloud: Infrastructure PR

A deployment manages Terraform infrastructure across 3 regions. A PR changes an IAM policy module.
curl -X POST \
  "https://api.ctrlplane.dev/v1/workspaces/$WS/deployments/$DEPLOY/plan" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "tag": "pr-789-abc123",
    "config": {},
    "metadata": { "pr": "789" }
  }'
Response (after async completion):
{
  "id": "plan_xyz",
  "status": "completed",
  "summary": {
    "total": 3,
    "changed": 1,
    "unchanged": 2,
    "errored": 0,
    "resourceChanges": { "add": 0, "modify": 2, "delete": 0 }
  },
  "targets": [
    {
      "environmentName": "production",
      "resourceName": "us-east-1",
      "hasChanges": true,
      "diff": {
        "raw": "Terraform will perform the following actions:\n\n  # aws_iam_policy.service_policy will be updated in-place\n  ~ resource \"aws_iam_policy\" \"service_policy\" {\n      ~ policy = jsonencode(\n          ~ {\n              ~ Statement = [\n                  ~ {\n                      ~ Action = [\n                          + \"s3:GetObject\",\n                        ]\n                    },\n                ]\n            }\n        )\n    }\n\nPlan: 0 to add, 2 to change, 0 to destroy.",
        "resources": [
          {
            "kind": "aws_iam_policy",
            "name": "module.auth.aws_iam_policy.service_policy",
            "action": "modify",
            "diff": "..."
          },
          {
            "kind": "aws_iam_role_policy_attachment",
            "name": "module.auth.aws_iam_role_policy_attachment.service",
            "action": "modify",
            "diff": "..."
          }
        ]
      }
    },
    {
      "environmentName": "production",
      "resourceName": "eu-west-1",
      "hasChanges": false,
      "diff": null
    },
    {
      "environmentName": "production",
      "resourceName": "ap-south-1",
      "hasChanges": false,
      "diff": null
    }
  ]
}
The PR shows that only us-east-1 is affected, with exactly 2 IAM resources changing.

GitHub Actions: Unsupported agent

A deployment uses GitHub Actions (no Plannable implementation). The plan endpoint still runs but cannot produce diffs:
{
  "id": "plan_def",
  "status": "completed",
  "summary": {
    "total": 5,
    "changed": 0,
    "unchanged": 0,
    "errored": 0,
    "unsupported": 5
  },
  "targets": [
    {
      "environmentName": "production",
      "resourceName": "cluster-1",
      "hasChanges": null,
      "diff": null,
      "status": "unsupported"
    }
  ]
}
The CI can still post a PR comment noting that plan output is not available for this deployment type.

Migration

  • The deployment_plan table is new and requires no data migration.
  • Plans are ephemeral with a 1-hour TTL by default. No long-term storage concerns.
  • The Plannable interface (RFC 0002) is unchanged. Agents that already implement it gain dry-run plan support automatically; they only need to populate the Diff field for rich output.
  • The pull_request webhook handler is additive. The existing workflow_run handler is unchanged.
  • No changes to the version creation flow, reconciler, or promotion lifecycle.

Open Questions

  1. Rate limiting. Plans involve external API calls (ArgoCD manifest rendering, Terraform speculative plans). For deployments with many release targets, a single PR could trigger hundreds of external calls. Should there be a per-deployment or per-workspace rate limit on plan requests? Should callers be able to scope the plan to specific environments or resources?
  2. Plan scope. The proposal plans against all release targets. For large deployments, the caller may want to plan only for specific environments or resources. Should the request body accept an optional filter (environmentSelector, resourceSelector) to narrow the plan scope?
  3. Diff format standardization. ArgoCD produces YAML diffs, Terraform produces HCL-style diffs. Should the raw field in PlanDiff be agent-specific (each agent returns its native format), or should ctrlplane normalize to a common diff format?
  4. Cost of plans. Each plan consumes Terraform Cloud compute resources. For deployments with many targets across many PRs, this could become expensive. Should Terraform plans require explicit opt-in per deployment?
  5. Temporary Application permissions. Creating and deleting Applications requires write access to the ArgoCD API. Some teams restrict Application creation to specific ArgoCD projects or RBAC roles. Should the plan Application be created in a dedicated ArgoCD project (e.g., ctrlplane-plans) with limited permissions, or inherit the project from the original Application?
  6. ArgoCD rendering latency. After creating the temporary Application, the agent polls until ArgoCD renders manifests. For large Helm charts or slow git repos this could take significant time. Should there be a configurable timeout per agent, and how should the plan endpoint report rendering timeouts vs. real errors?