WIP: OpenShift Tests Extension Framework Initial #1676

jupierce · 2024-09-05T19:26:47Z

No description provided.

openshift-ci · 2024-09-05T19:28:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jupierce. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

enhancements/testing/openshift-tests-extension.md

stbenjam · 2024-09-11T15:40:15Z

enhancements/testing/openshift-tests-extension.md

+##### OpenShift Payload Extension Binaries
+For OpenShift payload components contributors can advertise the existence of an extension binary
+by adding information (the imagestream tag for the OCP payload component and the path to the binary 
+within their image) to a simple registry datastructure in github.com/openshift/origin. 


Alternative: could we store this info in the release image? (The kind that shows up in oc adm release info -ojson).

This would let us register things more dynamically, and also supply more metadata like which suites it has. Maybe get rid of the info subcommand.

Maybe; that would require updates to oc adm release new.
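
To make the registry idea in the quoted text concrete, here is a minimal sketch of what such a data structure could look like. The class name, field names, and the example entry are all hypothetical; the proposal only says that an entry carries the imagestream tag of the payload component and the path to the binary within that image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionBinary:
    # Imagestream tag of the OCP payload component (hypothetical field name).
    image_tag: str
    # Path to the extension binary inside that image (hypothetical field name).
    binary_path: str


# Hypothetical registry content; a real registry would list real components.
REGISTRY = [
    ExtensionBinary(image_tag="my-operator", binary_path="/usr/bin/my-operator-tests-ext"),
]


def binaries_for(image_tag):
    """Return the binary paths advertised for one imagestream tag."""
    return [e.binary_path for e in REGISTRY if e.image_tag == image_tag]
```

Storing the same information in the release image metadata instead, as suggested above, would change where the entries live but not their shape.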

enhancements/testing/openshift-tests-extension.md

sosiouxme

mostly typos, maybe a few bits of substance

enhancements/testing/openshift-tests-extension.md

dgoodwin

Just a few thoughts, looks quite exciting.

dgoodwin · 2024-09-20T14:22:59Z

enhancements/testing/openshift-tests-extension.md

+
+Component authors may choose to reduce the number of tests run for non-default
+configuration profiles, focusing only on tests likeliest to fail based on the 
+configuration change, in order to reduce overall execution time.


Could we get an example for this one? I couldn't come up with a use case immediately.

Let's say you want to test a more verbose logging level configuration option in your CRD. You may have hundreds of tests that fully exercise your component in the 'default' configuration, but it is overkill to run them all again simply to verify that debug log statements are being emitted by your pod when verbose logging is enabled.
So you expose an extension configuration for the verbose logging level, and output only one test when asked for a list in that configuration. All that test does is read the component's pod logs and make sure that it sees debug output.

If you expose a configuration that disables http/1, you might just want to run a test that verifies an http/1 connection is rejected and an http/2 connection is accepted. If you want to test branding, you might just want to verify that the HTML you scrape contains the newly configured name. If you expose a threshold, you might just want to test that a single expected alert is firing after configuring it.

I can add these to the doc if you buy the premise.
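
A minimal sketch of the premise above, assuming an extension keeps a per-test set of configurations and filters its list output by the requested one. The `ExtensionTest` type, the `configurations` field, and the configuration names are hypothetical and are not taken from the enhancement.

```python
from dataclasses import dataclass, field


@dataclass
class ExtensionTest:
    name: str
    # Hypothetical field: the configurations in which this test should run.
    configurations: set = field(default_factory=lambda: {"default"})


TESTS = [
    ExtensionTest("component reconciles its operand"),
    ExtensionTest("component reports Available condition"),
    # Only worth running when verbose logging is enabled.
    ExtensionTest("pod logs contain debug output", {"verbose-logging"}),
    # Only worth running when http/1 is disabled.
    ExtensionTest("http/1 rejected and http/2 accepted", {"http1-disabled"}),
]


def list_tests(configuration="default"):
    """Return the names of the tests advertised for one configuration."""
    return [t.name for t in TESTS if configuration in t.configurations]
```

With this shape, the full suite runs once in the default configuration, and each non-default configuration contributes only the one test that exercises the change.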

dgoodwin · 2024-09-20T14:24:09Z

enhancements/testing/openshift-tests-extension.md

+                # If applying the configuration implies a disruption, inform 
+                # openshift-tests, so that it can be accounted for in overall
+                # disruption reporting.
+                "disruption": "1m",


This makes me a little nervous: if it's happening in other repos, we wouldn't have good visibility into people abusing this because a problem popped up. Probably not a core concern for this enhancement though.

dgoodwin · 2024-09-20T14:26:06Z

enhancements/testing/openshift-tests-extension.md

+  # testing logic. This allows component readiness
+  # to display the human-readable version of the test
+  # name while considering test runs across name changes.
+  "originalName": "security version compliance",


I really like baking this into the component repos, might get us more renames as very very few people go to the mapping repo.

It would be neat if something could comment on PRs where we see added+removed tests: "if this was a rename, please do ...". Such a change is relatively often a rename. Problem for another day though.

dgoodwin · 2024-09-20T14:26:49Z

enhancements/testing/openshift-tests-extension.md

+            # before the test was run.
+            "component": "default",
+        }
+    },


Environment is a little unusual in the results for every test. It seems like mostly a characteristic of a job, and a lot of duplication?

There are fixed aspects of the environment, like GCP vs AWS, but the enhancement suggests that environment include component configuration information -- a single job being able to apply and test multiple different configuration options. Imagine an operator being able to cycle through several of its typical configurations, running the same tests or tests specific to those configurations, during the execution of a single job. That configuration is relevant to the outcome of a test and Component Readiness must be able to differentiate the same test name running in one configuration vs another.

A next question would be why we would store those static environmental aspects in the aggregated results file alongside each test. My hope there is that the results files can begin to stand alone. You can just push the file content into a database and you know everything you need to know from the resulting DB. You don't need to parse Prow job names, for example, to derive additional context about how the test was run. Many tools can ingest a comprehensive file like this directly, so our options for analysis expand. Imagine wanting to move to a new database or use local tooling to analyze the data. With a comprehensive file format, we just ingest the file into our target analysis tool -- no custom logic like what we have in the cloud function required to pull bits and pieces from multiple artifacts.
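
As a rough illustration of the stand-alone results idea, the sketch below ingests a results document and tells apart the same test name run under two component configurations, using only what is in the file. The `component` key under `environment` appears in the quoted diff; the `name`, `result`, and `platform` keys and all values are assumptions made for the example.

```python
import json
from collections import defaultdict

# Hypothetical results file content: one record per test execution,
# each carrying its own environment.
RESULTS_JSON = """
[
  {"name": "security version compliance", "result": "passed",
   "environment": {"platform": "aws", "component": "default"}},
  {"name": "security version compliance", "result": "failed",
   "environment": {"platform": "aws", "component": "verbose-logging"}}
]
"""


def summarize(results):
    """Count outcomes per (test name, component configuration)."""
    summary = defaultdict(lambda: {"passed": 0, "failed": 0})
    for record in results:
        key = (record["name"], record["environment"]["component"])
        summary[key][record["result"]] += 1
    return dict(summary)


summary = summarize(json.loads(RESULTS_JSON))
```

Nothing here needs the job name or any other artifact: the grouping key comes entirely from the per-test record.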

enhancements/testing/openshift-tests-extension.md

stbenjam · 2024-10-01T23:36:39Z

enhancements/testing/openshift-tests-extension.md

+Note that `run-test` will blindly execute tests in the list as quickly as possible,
+in parallel, without consideration for system resources or parallelism constraints


Just a note: OTE doesn't do parallel execution yet. There are some oddities in how we invoke ginkgo tests today. I think I just haven't found all the things I need mutexes for yet.

I assume that's why origin shells out to execute every test.

You can resolve this one, OTE executes in parallel. Ginkgo has a mutex to force serial execution but other frameworks would be parallelized.
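
The behavior described in the last reply (parallel by default, with a mutex forcing one framework to run serially) can be sketched as follows. This is a Python illustration of the pattern only, not the OTE implementation; the tuple layout, the function names, and the worker count are assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One lock shared by every ginkgo test, so at most one runs at a time.
ginkgo_lock = threading.Lock()


def run_test(test):
    """Run one (framework, name, callable) tuple and return (name, result)."""
    framework, name, fn = test
    if framework == "ginkgo":
        # Serialize ginkgo tests; other frameworks skip the lock entirely.
        with ginkgo_lock:
            return name, fn()
    return name, fn()


def run_all(tests, workers=4):
    """Execute all tests on a thread pool and collect results by name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_test, tests))
```

For example, `run_all([("ginkgo", "a", lambda: "passed"), ("other", "b", lambda: "passed")])` runs both tests on the pool, but any two ginkgo entries would never overlap.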

enhancements/testing/openshift-tests-extension.md

Update the doc based on my comments + actual current implementation

stbenjam

Just a couple of comments. The changes to details are good

enhancements/testing/openshift-tests-extension.md

stbenjam · 2024-10-14T18:50:44Z

enhancements/testing/openshift-tests-extension.md

+    # If a test name is updated at any time in the future,
+    # originalName must report the original name of the 
+    # testing logic. This allows component readiness
+    # to display the human-readable version of the test
+    # name while considering test runs across name changes.
+    "originalName": "security version compliance",


In the current version I have this as otherNames so we could have the full history of a test's name.

originalName could work, but it must be included in the ExtensionTestResult, and this data must make its way to a column in the junit table. We'd then need to update ci-test-mapping to look at this column when considering the test ID, which should work fine for both the old way (a rename map in the ci-test-mapping repo) and the extension way (stable original name).

I wouldn't expect anyone, including component readiness, to group by otherNames (or even originalName), but rather by the test ID from a join on the ci-test-mapping table. We need to be backwards compatible with the universe today, and with previous OpenShift releases without extension test binaries, which means continuing to use the mapping table.
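
A sketch of how a stable test ID could be derived under both schemes discussed above: a central rename map for the old way, and a per-test `originalName` for the extension way. The hashing scheme, the map contents, and the function names are hypothetical; this is not how ci-test-mapping is confirmed to compute IDs.

```python
import hashlib

# Old way: a rename map kept in a central repo, new name -> original name.
# The entry below is made up for illustration.
RENAME_MAP = {"new legacy name": "old legacy name"}


def stable_name(test):
    """Pick the name that must stay constant across renames."""
    # Extension way: the test result itself reports its original name.
    if test.get("originalName"):
        return test["originalName"]
    # Old way: fall back to the central rename map, then to the name itself.
    return RENAME_MAP.get(test["name"], test["name"])


def resolve_test_id(test):
    """Derive an ID from the stable name (hash choice is an assumption)."""
    return hashlib.sha256(stable_name(test).encode()).hexdigest()[:12]
```

Consumers would then group by the resolved ID rather than by any of the name fields, which keeps results from releases without extension binaries comparable.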

enhancements/testing/openshift-tests-extension.md

BigQuery requires UNNEST for arrays/repeated records. By storing environment as a JSON object, we can use JSON_EXTRACT_SCALAR efficiently on maps with unique keys.

openshift-ci · 2024-10-25T18:31:45Z

@jupierce: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | `c2c0d43` | link | true | `/test markdownlint` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-bot · 2024-11-23T01:15:48Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2024-11-30T08:45:57Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

stbenjam · 2024-11-30T18:22:13Z

/remove-lifecycle rotten

openshift-bot · 2024-12-29T01:15:18Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2025-01-05T08:45:33Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

jsafrane · 2025-01-09T09:40:06Z

enhancements/testing/openshift-tests-extension.md

+}
+```
+
+### Risks and Mitigations


With all tests in openshift/origin, we need to bump k8s.io/kubernetes, with its many useful e2e functions, only there. When we move the tests to individual repos, all these repos will endure pain with updating k8s.io/kubernetes. Upstream does not even pretend it has a stable API and it often breaks.

Alternatively, we would need to spend some non-trivial time rewriting the tests not to depend on k8s.io/kubernetes/test/e2e/framework.

Would it be helpful to have a discussion about your particular use case? I might be missing some context. What e2e functions are you using from k8s.io/kubernetes/test/e2e/framework? Which tests are you looking to move to external binaries for which repos?

Our next OTE customer after migrating k8s-tests-ext is ovn-kubernetes, and the ginkgo tests already exist in their repos to run them. We'd mostly be running them unmodified from upstream

The other use case we looked at was moving QE's openshift-tests-private to the component repos, and those don't use anything from k8s.io/kubernetes/test/e2e/framework.

jsafrane · 2025-01-10T09:39:51Z

enhancements/testing/openshift-tests-extension.md

+Optional Operator authors must ensure that the image carrying the extension binary
+is identified in their ClusterServiceVersion (CSV) so that tools like `oc-mirror`
+will copy image(s) bearing extension binaries to disconnected clusters.


We also must ensure that all images used by the extension e2e tests are mirrored by ./openshift-tests images --upstream --to-repository=xyz.

I could be wrong here, but I think this list of images is currently hardwired into the openshift-tests binary at build time. There is already some wording about adding new images to the list here; either it will need to be much stricter, or we would need to get the list of images from extension binaries too.

Justin and I discussed this at one point, but I guess the outcome didn't end up in the enhancement. I think the outcome was that new images would just need a separate PR to origin to add them. It's low frequency enough that it shouldn't be too disruptive. Most tests just use one of a handful of images (agnhost, tools, cli).

We need to solve the helper problem, though.

jsafrane · 2025-01-10T09:41:58Z

enhancements/testing/openshift-tests-extension.md

+Optional Operator authors must ensure that the image carrying the extension binary
+is identified in their ClusterServiceVersion (CSV) so that tools like `oc-mirror`
+will copy image(s) bearing extension binaries to disconnected clusters.


(Starting a separate thread.) To get the name of a mirrored image, a test is advised to call github.com/openshift/origin/test/extended/util/image.LocationFor("my.source/image/location:versioned_tag") here. Does the extension need to import openshift/origin? That could be problematic, especially if the extension imports an incompatible version of k8s.io/kubernetes.

This is something we'll need to solve for sure. We want to minimize vendoring requirements in the component repos. I do not want anyone to have to vendor origin code.

Perhaps we could pass the image locations to each tests binary and have OTE provide a LocationFor helper. It could be passed either as a CLI flag or a path to a JSON/YAML file.
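
A minimal sketch of that suggestion, assuming the orchestrator writes a JSON file mapping source image references to mirrored ones and hands its path to the extension. The file layout, the function names, the mirror hostname, and the fall-back to the source reference are all assumptions; nothing here is an existing OTE or origin API.

```python
import json
import os
import tempfile


def load_image_locations(path):
    """Read a {source image: mirrored image} mapping from a JSON file."""
    with open(path) as f:
        return json.load(f)


def location_for(image, locations):
    """Return the mirrored location, or the source image if none is mapped."""
    # Falling back to the source reference is an assumption of this sketch.
    return locations.get(image, image)


# Demo: write a hypothetical mapping file, then load it the way a test
# binary might after receiving the path on its command line.
with tempfile.TemporaryDirectory() as tmp:
    mapping_path = os.path.join(tmp, "images.json")
    with open(mapping_path, "w") as f:
        json.dump(
            {"my.source/image/location:versioned_tag":
             "mirror.example/e2e/location:versioned_tag"},
            f,
        )
    locations = load_image_locations(mapping_path)
```

Because the mapping arrives as data, the component repo would not need to vendor origin code to resolve image names.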

jupierce · 2025-01-13T18:22:34Z

/remove-lifecycle rotten

mtulio · 2025-01-15T14:46:36Z

enhancements/testing/openshift-tests-extension.md

+colocated with the features they are testing. It defines a standardized interface
+for test discovery, execution, and result aggregation, allowing decentralized
+contributions while maintaining centralized orchestration.


allowing decentralized contributions while maintaining centralized orchestration.

Is there any impact in how the new tests are added to the conformance suite, such as openshift/conformance?

OpenShift Tests Extension Framework Initial

610e5b5

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 5, 2024

openshift-ci bot requested review from pavolloffay and stbenjam September 5, 2024 19:28

stbenjam reviewed Sep 11, 2024

Notes from Forrest

4207d34

sosiouxme reviewed Sep 16, 2024

stbenjam reviewed Sep 16, 2024

Stephen & Luke feedback

ec8afda

dgoodwin reviewed Sep 20, 2024

Adding update verb for presubmit checks

0f434ac

stbenjam reviewed Sep 23, 2024

stbenjam mentioned this pull request Sep 24, 2024

TRT-1834: initial version of openshift-tests-extension openshift-eng/openshift-tests-extension#2

Merged

stbenjam reviewed Sep 24, 2024

stbenjam reviewed Sep 27, 2024

stbenjam reviewed Sep 27, 2024

Add pseudocode for extension implementation

d4186a8

stbenjam reviewed Oct 1, 2024

stbenjam reviewed Oct 3, 2024

jupierce force-pushed the openshift-tests-extension branch 2 times, most recently from 402509b to cdb1d75 Compare October 4, 2024 13:54

Version tracking metadata

ac860f4

jupierce force-pushed the openshift-tests-extension branch from cdb1d75 to ac860f4 Compare October 4, 2024 13:56

stbenjam mentioned this pull request Oct 7, 2024

TRT-1835: Add an openshift-tests-extension compatible test binary openshift/kubernetes#2105

Merged

stbenjam and others added 4 commits October 9, 2024 19:14

Update the doc based on my comments + actual current implementation

d76ee03

Merge pull request #1 from stbenjam/openshift-tests-extension

5969b7e

Add diagnostic collection & triage

a3c6831

Add run-monitor

cb21bec

stbenjam reviewed Oct 14, 2024

stbenjam reviewed Oct 14, 2024

jupierce added 2 commits October 14, 2024 16:50

Reorganize monitor information

2f00631

Add facts and change to arrays to maps

cc85139

jupierce force-pushed the openshift-tests-extension branch from 76ccbf2 to cc85139 Compare October 25, 2024 17:36

Extensions cache dir for local dev

c2c0d43

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 23, 2024

dgoodwin mentioned this pull request Nov 25, 2024

WIP: Improve results and debuggability for quay-e2e job openshift/release#59126

Open

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 30, 2024

openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 30, 2024

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 29, 2024

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 5, 2025

jsafrane reviewed Jan 9, 2025

jsafrane reviewed Jan 10, 2025

openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 13, 2025

mtulio reviewed Jan 15, 2025
