8000 CNBi conditions are not declarative · Issue #130 · thoth-station/meteor-operator · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

CNBi conditions are not declarative #130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
1 task
VannTen opened this issue Nov 14, 2022 · 5 comments
Open
1 task

CNBi conditions are not declarative #130

VannTen opened this issue Nov 14, 2022 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. wg/cre Issues or PRs related to the Custom Runtime Environment (fka Custom Notebook Image) ODH feature.
Milestone

Comments

@VannTen
Copy link
Member
VannTen commented Nov 14, 2022

Problem statement

Many of the conditions defined in api/v1alpha/cnbi_conditions_consts.go describe state changes instead of state.
This goes against the recommendations in the kubernetes API conventions relative to conditions, particularly:

Condition type names should describe the current observed state of the resource, rather than describing the current state transitions.

Take a look at controllers/cnbi/controller.go:

meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ErrorPipelineRunCreate,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCreateFailed",
    Message: err.Error(),
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCreated,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCreated",
    Message: fmt.Sprintf("%s PipelineRun created successfully", pipelineRun.Name),
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.GenericPipelineError,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunGenericError",
    Message: err.Error()},
)
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCompleted,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCompleted",
    Message: "The PipelineRun has been completed successfully.",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ImageImportReady,
    Status:  metav1.ConditionTrue,
    Reason:  "ImageImportReady",
    Message: "Import succeeded, the image is ready to be used.",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCompleted,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCompleted",
    Message: "The PipelineRun has been completed with a failure!",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ImageImportReady,
    Status:  metav1.ConditionFalse,
    Reason:  "ImageImportNotReady",
    Message: "Import failed, this could be due to the repository to import from does not exist or is not accessible",
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ErrorBuildingImage,
    Status:  metav1.ConditionTrue,
    Reason:  "ErrorBuildingImage",
    Message: "Build failed!",

(this is the code in #129, but it only changes the syntax, not the conditions name or the reasons)
For nearly all the conditions, we use a Reason which is the same that the condition type; I think our Reasons are fine (describe state transitions), but our conditions types are not.

High-level Goals

Be more Kubernetes-like when implementing the Conditions.

Proposal description

List of possible alternative conditions:

BaseImageAvailable
ImageAvailable
ImageIsValid
ThothAdviseScheduled (see K8s API conventions on long state transitions)
ThothAdviseAvailable

Additional context

#126 is relevant; tackling the too together might be a good idea.
Consider in particular the following scenario:

  1. A CNBi has been created, the ImageStream has been created and the image built and pushed.
  2. The ImageStream get deleted. We should propagate that change, and none of the existing conditions fits. (we should also re-launch a pipeline to reconcile the state).

Acceptance Criteria

  • The cnbi controller use declarative conditions, with Reason used as state transitions indicators

References

/wg cnbi
/priority important-longterm
/group-programming

@VannTen VannTen added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 14, 2022
@sesheta sesheta added wg/cnbi priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. wg/cre Issues or PRs related to the Custom Runtime Environment (fka Custom Notebook Image) ODH feature. and removed wg/cnbi labels Nov 14, 2022
@goern goern added this to the v0.2.0 milestone Nov 30, 2022
@codificat
Copy link
Member

Related: #103

@codificat
Copy link
Member

A requirement here is to make sure that the odh-dashboard works with the changes made to the conditions

@codificat codificat moved this from 🆕 New to 🔖 Next in Planning Board Jan 9, 2023
@VannTen
Copy link
Member Author
VannTen commented Jan 10, 2023

One point what's not clear to me and can impact the way we design the set of conditions:

What should be the relation to the Shower/Meteor resources ? If I'm correct, with CRE we handle the build part, and the run part should stay in those resources ?
Or should we handle those as "subresources" and hides them behind the CRE ?

In particular, I think this should impact whether or not we have a Running condition for in the CRE status.

wdyt @goern @codificat ? (this might be a good point to discuss on CNBi meeting Wednesday.

@goern
Copy link
Member
goern commented Jan 10, 2023

Meteor CRD & friend should stay for compatibility, cre does not need to take control over them, if the meteor build can be handled by CRE it could (prio--)

From my pov meteor and cre is pretty disjoint and decoupled

Does this help? In line with other minds?

@VannTen
Copy link
Member Author
VannTen commented Jan 10, 2023 via email

@VannTen VannTen moved this from 🔖 Next to 🏗 In progress in Planning Board Jan 17, 2023
@goern goern added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. wg/cre Issues or PRs related to the Custom Runtime Environment (fka Custom Notebook Image) ODH feature.
Projects
Status: 🏗 In progress
Development

Successfully merging a pull request may close this issue.

4 participants
0