Message-ID: <887699ea-f837-6ed7-50bd-48720cea581c@gmail.com>
Date: Fri, 11 Aug 2023 11:17:43 -0700
From: Kui-Feng Lee <sinquersw@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>,
David Vernet <void@...ifault.com>
Cc: bpf@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, song@...nel.org, yhs@...com,
john.fastabend@...il.com, kpsingh@...nel.org, haoluo@...gle.com,
jolsa@...nel.org, linux-kernel@...r.kernel.org,
kernel-team@...a.com, tj@...nel.org, clm@...a.com,
thinker.li@...il.com, Stanislav Fomichev <sdf@...gle.com>
Subject: Re: [PATCH bpf-next] bpf: Support default .validate() and .update()
behavior for struct_ops links
On 8/11/23 10:35, Martin KaFai Lau wrote:
> On 8/10/23 4:15 PM, Stanislav Fomichev wrote:
>> On 08/10, David Vernet wrote:
>>> On Thu, Aug 10, 2023 at 03:46:18PM -0700, Stanislav Fomichev wrote:
>>>> On 08/10, David Vernet wrote:
>>>>> Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also
>>>>> define the .validate() and .update() callbacks in its corresponding
>>>>> struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful
>>>>> in its own right to ensure that the map is unloaded if an application
>>>>> crashes. For example, with sched_ext, we want to automatically unload
>>>>> the host-wide scheduler if the application crashes. We would likely
>>>>> never support updating elements of a sched_ext struct_ops map, so we'd
>>>>> have to implement these callbacks showing that they _can't_ support
>>>>> element updates just to benefit from the basic lifetime management of
>>>>> struct_ops links.
>>>>>
>>>>> Let's enable struct_ops maps to work with BPF_F_LINK even if they
>>>>> haven't defined these callbacks, by assuming that a struct_ops map
>>>>> element cannot be updated by default.
>>>>
>>>> Any reason this is not part of sched_ext series? As you mention,
>>>> we don't seem to have such users in the tree?
>>>
>>> Hi Stanislav,
>>>
>>> The sched_ext series [0] implements these callbacks. See
>>> bpf_scx_update() and bpf_scx_validate().
>>>
>>> [0]: https://lore.kernel.org/all/20230711011412.100319-13-tj@kernel.org/
>>>
>>> We could add this into that series and remove those callbacks, but this
>>> patch is fixing a UX / API issue with struct_ops links that's not really
>>> relevant to sched_ext. I don't think there's any reason to couple
>>> updating struct_ops map elements with allowing the kernel to manage the
>>> lifetime of struct_ops maps -- just because we only have 1 (non-test)
>
> Agree the link-update does not necessarily couple with link-creation, so
> removing the 'link' update function enforcement is ok. The intention was to
> avoid an inconsistent struct_ops link experience (one struct_ops link
> supports update and another struct_ops link does not) because consistency
> was one of the reasons for the true kernel-backed link support that
> Kui-Feng did. tcp-cc is the only one for now in struct_ops and it can
> support update, so the enforcement is here. I can see Stan's point that
> removing it now looks premature before a struct_ops has landed in the
> kernel showing that it does not make sense, or is very hard, to support
> 'link' update. However, the scx patch set has shown this point, so I
> think it is good enough.
>
> For 'validate', it is not related to a 'link' update. It is for the
> struct_ops 'map' update. If the loaded struct_ops map is invalid, it
> will end up being a useless struct_ops map and no link can be created
> from it. I can see some struct_ops subsystems check all the 'ops'
> functions for NULL before calling (like the FUSE RFC). I can also see
> some future struct_ops will prefer not to check NULL at all and prefer
> to assume a subset of the ops is always valid. Is having a 'validate'
> enforcement blocking the scx patchset in some way? If not, I would
> like to keep this for now. Once it is removed, there is no turning back.
I am not saying which one is right or wrong, but the following are some
of my concerns. Just FYI!
The 'validate' change is more like a default implementation that always
returns 0. It is up to each struct_ops type to decide how to validate
values. If a type decides to always succeed, this change will save its
developers a bit of time. In contrast, allowing an empty update may make
things difficult for the developers of new struct_ops types. New
developers may spend a lot of time in the code base before figuring out
that they should implement an update function to make it work. Better
documentation may help. However, checking these function pointers as
early as possible is even better.
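
To make the two options concrete, here is a rough userspace sketch of what
I mean. It is not the kernel code: struct demo_struct_ops and all the
demo_*() helpers are hypothetical names made up for illustration, and the
-EOPNOTSUPP / -EINVAL return values are my assumption about what each
path would report.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct_ops type; both callbacks are optional. */
struct demo_struct_ops {
	const char *name;
	int (*validate)(const void *kdata);
	int (*update)(void *kdata, void *old_kdata);
};

/* Lenient map update: a missing validate() acts like one returning 0. */
static int demo_map_validate(const struct demo_struct_ops *ops,
			     const void *kdata)
{
	if (!ops->validate)
		return 0;
	return ops->validate(kdata);
}

/* Lenient link update: a missing update() means "cannot be updated". */
static int demo_link_update(const struct demo_struct_ops *ops,
			    void *kdata, void *old_kdata)
{
	if (!ops->update)
		return -EOPNOTSUPP;
	return ops->update(kdata, old_kdata);
}

/* Strict alternative: refuse the type up front if a callback is missing. */
static int demo_register_strict(const struct demo_struct_ops *ops)
{
	if (!ops->validate || !ops->update)
		return -EINVAL;
	return 0;
}
```

With the lenient helpers, a type that sets neither callback still works
for link creation and only fails later, when someone tries to update the
link. With the strict check, the same type is rejected immediately, which
is the earlier feedback for new developers that I was referring to.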
>
>>> struct_ops implementation in-tree doesn't mean we shouldn't improve APIs
>>> where it makes sense.
>>>
>>> Thanks,
>>> David
>>
>> Ack. I guess up to you and Martin. Just trying to understand whether I'm
>> missing something or the patch does indeed fix some use-case :-)
>
>