-
Notifications
You must be signed in to change notification settings - Fork 1.6k
KEP-3619: Add SupplementalGroupsPolicy feature fields in Kubernetes API(Node.Status) and CRI(RuntimeStatusResponse)
#4728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9e2e531 to
18e9255
Compare
18e9255 to
bd92978
Compare
a998bd3 to
da3c536
Compare
…nd CRI(RuntimeFeatures) so that kubernetes can capture the implemented features of the underlying CRI implementation.
da3c536 to
2fd44b8
Compare
| } | ||
| ``` | ||
|
|
||
| Recently [KEP-3857: Recursive Read-only (RRO) mounts](https://kep.k8s.io/3857) introduced `RuntimeHandlers[].Features`. But this does not fit to use for this KEP because RRO mounts should require to inspect [the OCI runtime spec's Feature](https://github.com/opencontainers/runtime-spec/blob/main/features.md) to understand low-level OCI runtime supports RRO or not. However, for this KEP(SupplementalGroupsPolicy), it does not need to inspect [the OCI runtime spec's Feature](https://github.com/opencontainers/runtime-spec/blob/main/features.md) because this KEP only affects to [`Process.User.additionalGid`](https://github.com/opencontainers/runtime-spec/blob/main/config.md#user) and this does not depend on [the OCI runtime spec's Feature](https://github.com/opencontainers/runtime-spec/blob/main/features.md). So, introducing new `RuntimeFeatures` in `NodeStatus` does not make any confusion with `RuntimeHandlerFeatures` because we can clearly define how to use them as below: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it very odd that we would call this "RuntimeFeatures" and then explicitly scope it NOT to include features of the runtime.
What I mean is that the distinction you're making between "runtime" and "runtime handler" is very technical and not helpful or interesting to most people. A distinction without a meaningful difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A distinction without a meaningful difference?
Yeah, I was a bit worried about how complicated it might get, even though I went along with @haircommander's suggestion. But I’ve started thinking about adding SupplementalGroupsPolicy to the existing RuntimeHandlerFeatures like I originally suggested, because in the end, every container is going to be instantiated by OCI runtimes anyway. So, whether it’s a CRI feature or an OCI runtime feature, I think adding these feature fields to RuntimeHandlerFeatures makes things simpler and easier to understand?
@haircommander WDYT?? 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my issue with it is supplimental groups have been supported by OCI runtime since their inception whereas runtime features were added to check whether an oci runtime has a feature available. It would feel weird to plumb the oci runtimes with supplimental group support to their runtime features call because they've basically always supported it.
really the problem is "runtime" was chosen for CRI and "runtime handler" for lower level runtimes, causing this confusion. Could NodeFeature work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect that, over time, these features will reach 100%, or that they will forever be optional? Is there ever a time when we can remove the code that is conditional on these feature indicators, or do we keep them forever? Can they be treated like feature gates - after some point in time they don't need to be expressed, they are assumed?
supplimental groups have been supported by OCI runtime since their inception
Supplemental groups have, but this policy has not, right? Do we think that EVERY runtime handler will properly handle this?
The issues I have here are 1) naming clarity; 2) do we really need 2 different sets of feature indicators?
Taking 2 first, I forgot that NodeRuntimeHandlerFeatures is already in a list-map keyed by handler name, meaning the list of features could be different for different handlers. So while we could take the flags from the runtime and then "augment" that with flags from kubelet, it feels awkward.
I guess I can buy that we really do have 2 lists. So then to my point 1 - if we can name this so it is clear, that would be a relief. As someone who might write a client, I need some way to think about which list I should be checking for which information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really the problem is "runtime" was chosen for CRI and "runtime handler" for lower level runtimes, causing this confusion. Could NodeFeature work?
Similar discussion occurred in this thread too. @AkihiroSuda also suggested similar idea.
The reason why I selected "runtime" is that CRI is abbreviation of Container Runtime Interface. CRI actually defines RuntimeService. Thus, I thought just calling runtime for CRI implementation can be ok.
Do we expect that, over time, these features will reach 100%,
I'd like to expect so.
So then to my point 1 - if we can name this so it is clear, that would be a relief.
For naming, how about NodeStatus.Features(type=NodeFeatures) as @AkihiroSuda suggested here, @haircommander also suggested in here?? Or, how about more direct naming like ContainerRuntimeFeatures, CRIFeatures??
if we can name this so it is clear, that would be a relief. As someone who might write a client, I need some way to think about which list I should be checking for which information.
Previously, I proposed this distinction in this thread. Is this too technical??
NodeFeatures(this kep): focuses on features that do NOT require to inspect OCI runtime spec's Feature (i.e. implemented only in high-level container runtimes)RuntimeHandlers[].Features(kep-3857): focuses on features that require to inspect OCI runtime spec's Feature (i.e. need implemented in low-level container runtimes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with making a distinction between "high-level container runtimes" and "low-level container runtimes" is that there are maybe 20 people in the world who know what that means, and they are all on this PR. End users and integrators will struggle to comprehend this. I struggle with it. Is containerd high-level or low-level? Is kubelet a "container runtime"?
Given a node with kubelet and containerd, which component will answer the question asked by this field? (specifically "Is SupplementalGroupsPolicy supported?")
My main purpose of this API change is ... to add E2E test (i.e. it skips the test when the CRI implementation does not implement this feature
Having e2es that self-disable is a good way to ensure that they are broken . It's useful, I guess, to let people opt out of them, but we need them to be enabled for our own CI, so that a failure is handled immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with making a distinction between "high-level container runtimes" and "low-level container runtimes" is that there are maybe 20 people in the world who know what that means, and they are all on this PR. End users and integrators will struggle to comprehend this. I struggle with it. Is containerd high-level or low-level? Is kubelet a "container runtime"?
Yes, totally.
Given a node with kubelet and containerd, which component will answer the question asked by this field? (specifically "Is SupplementalGroupsPolicy supported?")
As you might know, any single component can't answer this question. This is what makes API design difficult.
- kubelet: it must need to enable feature gate (until GA)
- containerd: it needs to let kubelet know
SupplementalGroupsPolicycontrol is implemented or not.- currently we have no such interfaces in CRI, either.
- this is why this PR also proposed to add a interface for this in CRI, too.
but we need them to be enabled for our own CI, so that a failure is handled immediately.
Do you mean we can add another CI test job for SupplementalGroupsPolicy or other CRI implementation dependent alpha features (e.g. RRO)?? I think it's a nice idea, thanks. I would like to research how to do it. I'm very novis in test-infra.
Summarised plans proposed so far are:
- Plan 1:
NodeFeatures(this kep): focuses on features that do NOT require to inspect OCI runtime spec's Feature (i.e. implemented only in high-level container runtimes) (other candates for naming:ContainerRuntimeFeatures,CRIFeatures)RuntimeHandlers[].Features(kep-3857): focuses on features that require to inspect OCI runtime spec's Feature (i.e. need implemented in low-level container runtimes)- this distinction may confuse many end-users
- Plan 2: just adding
RuntimeHandlers[].Features.SupplementalGroupsPolicy- thockin and haircommander is inclined against on this
Honestly speaking, I have no better idea than these, for now... What kind of API can be as simple and easy as possible for many end users??🤔 I would be very glad if anyone would propose any other idea. 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like plan 1 :) @thockin in your case, containerd would need to be aware. Kubelet is part of kube, and CRI begins the "container runtime" (I like to call them CRI implementations to distinguish between lower level runtimes). the lower level runtimes also need to have support for the field, but they have forever, it's the CRIs that don't yet and we'd want visibility into to have the kubelet trust it was set to report success
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with plan 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thockin @haircommander (cc/ @AkihiroSuda) Thank you, guys! I think we agreed to add NodeStatus.Features and to add SupplementalGroupsPolicy to it.
I renamed NodeRuntimeFeatures with NodeFeatures in 08fb7f7.
PTAL🙇
@SergeyKanzhelev @mrunalp Would you mind reviewing this PR as the KEP reviewers??
thockin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/lgtm
/approve
| message StatusResponse { | ||
| ... | ||
| // features describes the set of features implemented by the CRI implementation. | ||
| RuntimeFeatures features = ?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should this be NodeFeatures as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In CRI, I would like to use "Runtime" instead of "Node" because CRI does not use "Node" wording. WDYT??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure can you add a comment to the cri spec to say where the value is going?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haircommander fixed in e054a20
…e value goes to NodeFeatures in K8s API
|
/lgtm @mrunalp PTAL |
|
/assign @mrunalp let me assign you on this PR as the approver so that you can catch the PR easier🙇 |
|
Looking.. |
|
Left some nit but looks fine. Thanks! |
Co-authored-by: Mrunal Patel <[email protected]>
everpeace
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrunalp Thanks for fixing my typos🙇🙇 I applied all your suggestions. PTAL 🙇
mrunalp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: everpeace, mrunalp, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
SupplementalGroupsPolicyfeature fields in Kubernetes API(Node.Status) and CRI(RuntimeStatusResponse)NodeStatus.Features (type: NodeFeatures)and its corresponding field in CRI so that kubernetes can capture if the feature is implemented by the underlying CRI implementation in each node.SupplementalGroupsPolicylives underNodeFeaturesNodeStatus.RuntimeHandlers[].Features(which contains rro, user ns) recently introduced in KEP-3857(RRO)NodeStatus.Features(this kep): focuses on features that do NOT require to inspect OCI runtime spec's Feature (i.e. implemented only in high-level container runtimes)NodeStatus.RuntimeHandlers[].Features(kep-3857): focuses on features that require to inspect OCI runtime spec's Feature (i.e. need implemented in low-level container runtimes)