✨ e2e: isolate tests with per-scenario dynamic catalogs#2651
✨ e2e: isolate tests with per-scenario dynamic catalogs#2651joelanford wants to merge 8 commits intooperator-framework:mainfrom
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
Reworks the E2E testing approach to eliminate shared, static test catalogs by generating and pushing per-scenario bundles/catalogs at runtime (with scenario-specific naming), enabling safer isolation and paving the way for parallel execution. It also removes the legacy “push static fixtures into a registry” tooling and relocates bundle/CSV testing helpers into a dedicated internal testing package.
Changes:
- Add per-scenario catalog/bundle builder (
test/internal/catalog) and in-cluster registry deploy helper (test/internal/registry) to support dynamic OCI image generation/push. - Convert Gherkin features and step implementations to use dynamic catalogs and
${SCENARIO_ID}/${PACKAGE:...}/${CATALOG:...}substitution. - Remove legacy static catalog/bundle fixtures and push tooling under
testdata/, and update Makefile targets accordingly.
Reviewed changes
Copilot reviewed 77 out of 79 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| testdata/push/push.go | Removed legacy image-building/pushing tool for static test fixtures. |
| testdata/push/go.sum | Removed legacy module deps for the deleted push tool. |
| testdata/push/go.mod | Removed legacy module for the deleted push tool. |
| testdata/push/README.md | Removed documentation for the deleted push tool. |
| testdata/images/catalogs/test-catalog/v2/configs/catalog.yaml | Removed static catalog fixture content. |
| testdata/images/catalogs/test-catalog/v2/configs/.indexignore | Removed static catalog fixture ignore file. |
| testdata/images/catalogs/test-catalog/v1/configs/catalog.yaml | Removed static catalog fixture content. |
| testdata/images/catalogs/test-catalog/v1/configs/.indexignore | Removed static catalog fixture ignore file. |
| testdata/images/bundles/test-operator/v1.2.0/metadata/properties.yaml | Removed static bundle fixture properties. |
| testdata/images/bundles/test-operator/v1.2.0/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/test-operator/v1.2.0/manifests/testoperator.networkpolicy.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.2.0/manifests/testoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/test-operator/v1.2.0/manifests/script.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.2.0/manifests/olm.operatorframework.com_olme2etest.yaml | Removed static bundle fixture CRD. |
| testdata/images/bundles/test-operator/v1.2.0/manifests/bundle.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.3/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/test-operator/v1.0.3/manifests/testoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/test-operator/v1.0.3/manifests/bundle.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.2/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/test-operator/v1.0.2/manifests/testoperator.networkpolicy.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.2/manifests/testoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/test-operator/v1.0.2/manifests/olm.operatorframework.com_olme2etest.yaml | Removed static bundle fixture CRD. |
| testdata/images/bundles/test-operator/v1.0.2/manifests/bundle.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.0/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/testoperator.networkpolicy.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/testoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/script.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/olm.operatorframework.com_olme2etest.yaml | Removed static bundle fixture CRD. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/dummy.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/test-operator/v1.0.0/manifests/bundle.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/single-namespace-operator/v1.0.0/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/single-namespace-operator/v1.0.0/manifests/singlenamespaceoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/single-namespace-operator/v1.0.0/manifests/olm.operatorframework.com_singlenamespaces.yaml | Removed static bundle fixture CRD. |
| testdata/images/bundles/own-namespace-operator/v1.0.0/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/own-namespace-operator/v1.0.0/manifests/ownnamespaceoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/images/bundles/own-namespace-operator/v1.0.0/manifests/olm.operatorframework.com_ownnamespaces.yaml | Removed static bundle fixture CRD. |
| testdata/images/bundles/large-crd-operator/v1.0.0/metadata/annotations.yaml | Removed static bundle fixture annotations. |
| testdata/images/bundles/large-crd-operator/v1.0.0/manifests/script.configmap.yaml | Removed static bundle fixture manifest. |
| testdata/images/bundles/large-crd-operator/v1.0.0/manifests/largecrdoperator.clusterserviceversion.yaml | Removed static bundle fixture CSV. |
| testdata/build-test-registry.sh | Removed legacy script that deployed registry + push job for static fixtures. |
| testdata/Dockerfile | Removed legacy image build that packaged static fixtures and push binary. |
| test/upgrade-e2e/features/operator-upgrade.feature | Updated upgrade E2E feature to use dynamic per-scenario catalogs/packages. |
| test/internal/registry/registry.go | Added Go helper to deploy an in-cluster TLS-enabled registry via SSA for tests. |
| test/internal/catalog/catalog_test.go | Added unit tests for bundle/catalog generation logic and parameterization. |
| test/internal/catalog/catalog.go | Added catalog builder that generates/pushes per-scenario bundle + catalog OCI images. |
| test/internal/catalog/bundle.go | Added bundle builder producing parameterized manifests (CRD/CSV/Deploy/etc). |
| test/helpers/helpers.go | Removed legacy E2E helper package (static catalog assumptions). |
| test/helpers/feature_gates.go | Removed legacy feature-gate detection helper package. |
| test/extension-developer-e2e/setup.sh | Switched setup to env-var driven config; adjusted build/push commands. |
| test/extension-developer-e2e/extension_developer_test.go | Added TestMain orchestration for registry + setup; updated catalog ref wiring. |
| test/e2e/steps/upgrade_steps.go | Updated/added upgrade steps for per-scenario catalogs and reconciliation checks. |
| test/e2e/steps/testdata/test-catalog-template.yaml | Removed static catalog template. |
| test/e2e/steps/testdata/extra-catalog-template.yaml | Removed static catalog template. |
| test/e2e/steps/hooks.go | Updated scenario context to track per-scenario catalogs/packages; cleanup changes. |
| test/e2e/features_test.go | Increased Godog scenario concurrency (parallel execution). |
| test/e2e/features/user-managed-fields.feature | Updated feature to use per-scenario names and dynamic catalogs. |
| test/e2e/features/update.feature | Updated feature to use per-scenario names and dynamic catalogs/catalog updates. |
| test/e2e/features/uninstall.feature | Updated feature to use per-scenario names and dynamic catalogs. |
| test/e2e/features/status.feature | Updated feature to use per-scenario names and dynamic catalogs. |
| test/e2e/features/recover.feature | Updated feature to use per-scenario names and dynamic catalogs; updated messages. |
| test/e2e/features/install.feature | Updated feature to use per-scenario names and dynamic catalogs; revised scenarios. |
| test/e2e/README.md | Updated E2E docs for new variable substitution and dynamic catalog architecture. |
| internal/testing/bundle/fs/bundlefs_test.go | Updated tests to new internal testing bundle/fs package path/name. |
| internal/testing/bundle/fs/bundlefs.go | Renamed package to fs (new location/namespace for bundle fs builder). |
| internal/testing/bundle/csv/builder_test.go | Updated tests to new internal testing bundle/csv package path/name. |
| internal/testing/bundle/csv/builder.go | Renamed package to csv (new location/namespace for CSV builder). |
| internal/operator-controller/rukpak/render/render_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/rukpak/render/registryv1/validators/validator_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/rukpak/render/registryv1/registryv1_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/rukpak/render/registryv1/generators/generators_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/rukpak/bundle/source/source_test.go | Updated imports to use new internal testing bundle/fs and bundle/csv builders. |
| internal/operator-controller/config/error_formatting_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/config/config_test.go | Updated imports to use new internal testing bundle/csv builder. |
| internal/operator-controller/applier/provider_test.go | Updated imports to use new internal testing bundle/fs and bundle/csv builders. |
| internal/operator-controller/applier/boxcutter_test.go | Updated imports to use new internal testing bundle/fs and bundle/csv builders. |
| docs/superpowers/specs/2026-04-13-e2e-test-isolation-design.md | Added design spec documenting the per-scenario dynamic catalog architecture. |
| Makefile | Removed legacy image-registry target; updated env exports and unit-test dirs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5077134 to
6597bfd
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2651 +/- ##
==========================================
- Coverage 68.89% 67.98% -0.92%
==========================================
Files 141 144 +3
Lines 10009 10574 +565
==========================================
+ Hits 6896 7189 +293
- Misses 2596 2866 +270
- Partials 517 519 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6597bfd to
9d85b1b
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 77 out of 79 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Have you been able to check that this setup will also work in the downstream environment? |
|
@dtfranz That's a good point. I have not checked that. I'll make a PR downstream as well to see if I broke anything there. |
| @@ -0,0 +1,103 @@ | |||
| # E2E Test Isolation: Per-Scenario Catalogs via Dynamic OCI Image Building | |||
There was a problem hiding this comment.
@joelanford should we get it merged?
It might be nice have the design of the tests but should this dir be the correct one/
I think it was generated by CLAUDE to address the need.
There was a problem hiding this comment.
Do we have a directory for this kind of thing? I do think we should merge this kind of doc, and I was definitely intending to have this exact discussion.
There was a problem hiding this comment.
If we want define how our e2e test is design I would put that in the Contributing.MD
If you want to track design decisions then we need a new dir, so far all is in the google drive.
| ``` | ||
| Scenario starts | ||
| -> Generate parameterized bundle manifests (CRD names, deployments, etc. include scenario ID) | ||
| -> Build + push bundle OCI images to e2e registry via go-containerregistry |
There was a problem hiding this comment.
I've got this PR updated to keep around the essentials needed to avoid (hopefully) avoid breaking downstream. I've also got a new downstream PR that contains no downstream changes (just cherry-pick from upstream, resolve conflicts, and vendor): openshift/operator-framework-operator-controller#700
There was a problem hiding this comment.
Overall, I’m okay with the proposed changes.
We should sync with @tmshort to confirm whether there’s any reason this might not work downstream. Additionally, I suggest holding off on merging this until we can coordinate an upstream/downstream sync. That way, we can have a dedicated PR downstream containing only this change, which will make it easier to isolate and address any issues if they arise. PS> I saw either: openshift/operator-framework-operator-controller#699 to evaluate it in downstream which is a great idea 🚀
It also looks like the following doc may have been added unintentionally and should be reviewed/removed if not needed:
docs/superpowers/specs/2026-04-13-e2e-test-isolation-design.md
Lastly, looping in @pedjak for review as well, since this area is now primarily owned by him too.
9d85b1b to
cb8d120
Compare
|
I've got a downstream equivalent PR here to make sure I don't break downstream: openshift/operator-framework-operator-controller#699 This commit has the downstream changes that are required: openshift/operator-framework-operator-controller@1964d6b We can probably step this in by:
|
cb8d120 to
1b1ef51
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 82 out of 85 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1b1ef51 to
742aae9
Compare
742aae9 to
b7a6f66
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 82 out of 85 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -205,5 +211,6 @@ func ScenarioCleanup(ctx context.Context, _ *godog.Scenario, err error) (context | |||
| } | |||
| }(r) | |||
| } | |||
| wg.Wait() | |||
There was a problem hiding this comment.
ScenarioCleanup spawns one goroutine (and one external kubectl process) per resource deletion, with no concurrency limit. With high scenario concurrency this can create a large burst of parallel kubectl calls and lead to throttling/flakiness. Consider deleting sequentially or using a bounded worker pool/semaphore for deletions.
There was a problem hiding this comment.
Yeah, this is a potential concern. It doesn't seem to be affecting tests right now. So maybe not worth worrying about?
One possible improvement:
- A single goroutine for the entire tests process that handles deletions via requests from a channel. This goroutine can decide how often and how parallelized deletion requests are for the entire process.
- ScenarioCleanup sends deletion requests to the shared channel
b7a6f66 to
e75dbda
Compare
Thank you for considering downstream. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 82 out of 85 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e75dbda to
cb541fa
Compare
|
/label go-apidiff-override The removed functions ( |
|
@joelanford: The label(s) `/label go-apidiff-override
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 81 out of 84 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I think I'm good with this, if you wouldn't mind taking care of the Copilot comments (or explaining why they're not valid/needed)! |
1d053f1 to
43eed83
Compare
43eed83 to
968937d
Compare
|
Ok, I think I addressed all the Copilot comments. I did extend the E2E timeout. It seems like these changes (plus the addition of some new tests) push a full serial run of tests over the original 15m threshold. In a follow-up PR, we can separate parallel from serial jobs and push the e2e timeout back down. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 81 out of 84 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (2)
test/extension-developer-e2e/extension_developer_test.go:169
clusterExtensionis created with a fixed name ("registryv1"). If a previous run fails before cleanup (or tests are re-run against a reused cluster), this can cause create conflicts and make failures harder to diagnose. Consider usingGenerateNameor a randomized name (similar to the ServiceAccount/ClusterRole) and thread that name through the RBAC ResourceNames rule if needed.
test/extension-developer-e2e/extension_developer_test.go:299- This test creates a ServiceAccount, ClusterRole, and ClusterRoleBinding but only deletes the ClusterCatalog and ClusterExtension. Leaving the RBAC objects behind can clutter the cluster and make subsequent debugging harder (especially if the test fails before the final deletes). Consider registering
t.Cleanup(ordefer) to delete the ServiceAccount/ClusterRole/ClusterRoleBinding as well.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the localhost:30000 hostPort-based registry access with Kubernetes port-forwarding. This makes the test runner work regardless of the cluster network topology (not just kind with extraPortMappings). - Remove port 30000 extraPortMappings from kind configs - Add PortForward() to the registry package using SPDY - Use port-forward in e2e steps and extension-developer-e2e - Remove LOCAL_REGISTRY_HOST env var (no longer needed) - Keep NodePort service for containerd on kind nodes (hosts.toml)
968937d to
7edbcf3
Compare
rashmigottipati
left a comment
There was a problem hiding this comment.
Changes look good to me.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rashmigottipati The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Description
Replace shared, static test catalogs with dynamically generated per-scenario catalogs and bundles. Each scenario now builds and pushes its own OCI images at runtime, parameterized by scenario ID, making all cluster-scoped resource names unique. This eliminates cascading failures and enables future parallel test execution (~1,950 lines deleted net).
Commit 1 — Design spec: Add design spec documenting the test isolation architecture, builder API, feature file conventions, and naming scheme.
Commit 2 — Move test helpers: Relocate
bundlefsandcsvbuilders frominternal/operator-framework/rukpak/util/testing/tointernal/testing/bundle/.Commit 3 — Add new libraries: Composable catalog/bundle builder API (
test/internal/catalog/) usinggo-containerregistryand Go-based registry deployment (test/internal/registry/). Addstest/internal/...to unit test coverage.Commit 4 — Convert all tests: Update all feature files, step definitions, hooks, and README to use dynamic per-scenario catalogs with
${PACKAGE:...}and${CATALOG:...}variable substitution. Removeimage-registrydependency fromtest-e2eandtest-experimental-e2etargets.Commit 5 — Extension-developer-e2e: Add
TestMainto deploy registry and runsetup.shfrom Go. Constants in Go are the single source of truth for catalog tag and package name (passed tosetup.shvia env vars). Clean up Makefile exports.Commit 6 — Remove old infrastructure: Delete static bundles, catalogs, templates, push tooling, dead code (
test/helpers/,testdata/build-test-registry.sh), and theimage-registryMakefile target. Stubtestdata/build-test-registry.shandtestdata/push/push.gofor downstream CI compatibility.Commit 7 — Port-forward for registry access: Replace localhost:30000 hostPort-based registry access with Kubernetes port-forwarding via SPDY. Remove NodePort service (now ClusterIP), kind
extraPortMappingsfor port 30000, containerdhosts.toml, andLOCAL_REGISTRY_HOSTenv var. In extension-developer-e2e, images are exported from the container runtime and pushed via crane through the port-forward, avoiding Docker daemon network context issues.Commit 8 — Bump e2e timeout: Increase experimental e2e timeout to 20m to accommodate dynamic registry construction (and separate from this PR, the repo's increasing number of tests).
Reviewer Checklist
Generated with Claude Code