[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YaE5pDv5OhkAC+a4@carbon.dhcp.thefacebook.com>
Date: Fri, 26 Nov 2021 11:46:44 -0800
From: Roman Gushchin <guro@...com>
To: Yafang Shao <laoar.shao@...il.com>
CC: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mel Gorman <mgorman@...hsingularity.net>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH rfc 0/6] Scheduler BPF
On Thu, Nov 25, 2021 at 02:00:04PM +0800, Yafang Shao wrote:
> Hi Roman,
Hi Yafang!
>
> Scheduler BPF is a great idea.
> Thanks for the work.
Thanks!
>
> Scheduler BPF won’t be a small feature, I think we’d better give a
> summary of possible hooks it may add first.
> We must have a *basic rule* to control what it will tend to be to
> avoid adding BPF hooks here and there.
> I haven’t found a clear rule yet, but maybe we can learn it from
> netfilter, which has 5 basic hooks.
> Regarding the scheduler BPF hooks, some possible basic hooks may be:
> - Hook for Enqueue
> - Hook for Dequeue
> - Hook for Put Prev Task
> - Hook for Set Next Task
I think it depends on what we want to achieve. There are several options:
we might aim to implement the whole scheduler logic in bpf, we might aim
to do some adjustments to the existing scheduler behavior or a mix of those
approaches.
Bpf as now is now is not capable enough to implement a new scheduler class
without a substantial amount of new c code (in form of helpers, maybe custom
maps, some verifier changes etc). In particular, it's a challenging to
provide strong safety guarantees: any scheduler bpf program loaded shouldn't
crash or deadlock the system (otherwise bpf isn't any better than a kernel
module). Also performance margins are quite tight.
I'm not saying that providing such generic hooks is impossible or useless,
but it requires a lot of changes and support code and I'm not sure that we have
a good justification for them right now.
I think instead we might want to see bpf hooks as a better form of (sysctl)
tunables, which are more flexible (e.g. can be used for specific processes,
cgroups, cpus, being enabled depending on load, weather, etc) and do not create
an ABI (so are easier to maintain).
>
>
> > An example of an userspace part, which loads some simple hooks is available
> > here [3]. It's very simple, provided only to simplify playing with the provided
> > kernel patches.
> >
>
> You’d better add this userspace code into samples/bpf/.
I thought samples/bpf was considered deprecated (in favor to selftests/bpf/),
but I'm gonna check with bpf maintainers. Thanks for the idea!
Powered by blists - more mailing lists