🌱 Prioritize machines with the remediate-machine annotation when selecting the next machine to be remediated #11495
@Karthik-K-N thanks for taking up this work!
Let me know if you want some additional clarification
Thanks for the clarification, I will update it accordingly.
9ae8c46 to 849afff
Updated with the changes for KCP as suggested; I will do the remaining ones soon.
Made the changes for the MachineSet controller as well. Please take a look when you get some time. Thanks.
can we update the annotation description as well? current:
I think it should be:
Updated. Thank you.
// Returns the machines to be remediated in the following order
// Machines with RemediateMachineAnnotation annotation if any,
// Machines failing to come up first because
// there is a chance that they are not hosting any workloads (minimize disruption).
func getMachinesToRemediateInOrder(remediateMachines []*clusterv1.Machine) []*clusterv1.Machine {
	// From machines to remediate select the machines with RemediateMachineAnnotation annotation.
	annotatedMachines := collections.FromMachines(remediateMachines...).Filter(collections.HasAnnotationKey(clusterv1.RemediateMachineAnnotation))

	// Filter out the machines which are unique (machines which are not in annotated machines list)
	uniqueMachines := collections.FromMachines(remediateMachines...).Difference(annotatedMachines).UnsortedList()

	// Sort the machines from newest to oldest.
	sort.SliceStable(uniqueMachines, func(i, j int) bool {
		return uniqueMachines[i].CreationTimestamp.After(uniqueMachines[j].CreationTimestamp.Time)
	})

	// combine the annotated machines along with machines to remediate.
	return append(annotatedMachines.UnsortedList(), uniqueMachines...)
}
What about this alternative implementation
// Sorts the machines to be remediated in place, in the following order
// - Machines with RemediateMachineAnnotation annotation if any,
// - Machines failing to come up first because
// there is a chance that they are not hosting any workloads (minimize disruption).
func sortMachinesToRemediate(machines []*clusterv1.Machine) {
	sort.SliceStable(machines, func(i, j int) bool {
		_, iHasRemediateAnnotation := machines[i].Annotations[clusterv1.RemediateMachineAnnotation]
		_, jHasRemediateAnnotation := machines[j].Annotations[clusterv1.RemediateMachineAnnotation]
		if iHasRemediateAnnotation && !jHasRemediateAnnotation {
			return true
		}
		if !iHasRemediateAnnotation && jHasRemediateAnnotation {
			return false
		}
		return machines[i].CreationTimestamp.After(machines[j].CreationTimestamp.Time)
	})
}
(trying to have something easier to read/that sticks to the fact that this is a sorting func)
Yeah, makes sense; I think I took the complicated way. I just updated it as suggested but still kept it in a separate function so it is easy to test. Let me know if I need to make it in place and enhance the existing test itself.
@@ -1334,10 +1334,10 @@ func (r *Reconciler) reconcileUnhealthyMachines(ctx context.Context, s *scope) (

 	// Calculates the Machines to be remediated.
 	// Note: Machines already deleting are not included, there is no need to trigger remediation for them again.
-	machinesToRemediate := collections.FromMachines(machines...).Filter(collections.IsUnhealthyAndOwnerRemediated, collections.Not(collections.HasDeletionTimestamp)).UnsortedList()
+	remediateMachines := collections.FromMachines(machines...).Filter(collections.IsUnhealthyAndOwnerRemediated, collections.Not(collections.HasDeletionTimestamp)).UnsortedList()
If possible, I would keep the existing name; it seems clearer to me.
What this PR does / why we need it:
This PR contains the changes to prioritize machines with the cluster.x-k8s.io/remediate-machine annotation.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #11385