[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190123091634.GT27931@hirez.programming.kicks-ass.net>
Date: Wed, 23 Jan 2019 10:16:34 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Patrick Bellasi <patrick.bellasi@....com>
Cc: linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
linux-api@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Tejun Heo <tj@...nel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Paul Turner <pjt@...gle.com>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Todd Kjos <tkjos@...gle.com>,
Joel Fernandes <joelaf@...gle.com>,
Steve Muckle <smuckle@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v6 05/16] sched/core: uclamp: Update CPU's refcount on
clamp changes
On Tue, Jan 22, 2019 at 03:33:15PM +0000, Patrick Bellasi wrote:
> On 22-Jan 15:57, Peter Zijlstra wrote:
> > On Tue, Jan 22, 2019 at 02:01:15PM +0000, Patrick Bellasi wrote:
> > > Yes, I would say we have two options:
> > >
> > > 1) SCHED_FLAG_KEEP_POLICY enforces all the scheduling class specific
> > > attributes, but cross class attributes (e.g. uclamp)
> > >
> > > 2) add SCHED_KEEP_NICE, SCHED_KEEP_PRIO, and SCED_KEEP_PARAMS
> > > and use them in the if conditions in D)
> >
> > So the current KEEP_POLICY basically provides sched_setparam(), and
>
> But it's not exposed user-space.
Correct; not until your first patch indeed.
> > given we have that as a syscall, that is supposedly a useful
> > functionality.
>
> For uclamp is definitively useful to change clamps without the need to
> read beforehand the current policy params and use them in a following
> set syscall... which is racy pattern.
Right; but my argument was mostly that if sched_setparam() is a useful
interface, a 'pure' KEEP_POLICY would be too and your (1) looses that.
> > And I suppose the UTIL_CLAMP is !KEEP_UTIL; we could go either way
> > around with that flag.
>
> What about getting rid of the racy case above by exposing userspace
> only the new UTIL_CLAMP and, on:
>
> sched_setscheduler(flags: UTIL_CLAMP)
>
> we enforce the other two flags from the syscall:
>
> ---8<---
> SYSCALL_DEFINE3(sched_setattr)
> if (attr.sched_flags & SCHED_FLAG_KEEP_POLICY) {
> attr.sched_policy = SETPARAM_POLICY;
> attr.sched_flags |= (KEEP_POLICY|KEEP_PARAMS);
> }
> ---8<---
>
> This will not make possible to change class and set flags in one go,
> but honestly that's likely a very limited use-case, isn't it ?
So I must admit to not seeing much use for sched_setparam() (and its
equivalents) myself, but given it is an existing interface, I also think
it would be nice to cover that functionality in the sched_setattr()
call.
That is; I know of userspace priority-ceiling implementations using
sched_setparam(), but I don't see any reason why that wouldn't also work
with sched_setscheduler() (IOW always also set the policy).
> > > In both cases the goal should be to return from code block D).
> >
> > I don't think so; we really do want to 'goto change' for util changes
> > too I think. Why duplicate part of that logic?
>
> But that will force a dequeue/enqueue... isn't too much overhead just
> to change a clamp value?
These syscalls aren't what I consider fast paths anyway. However, there
are people that rely on the scheduler syscalls not to schedule
themselves, or rather be non-blocking (see for example that prio-ceiling
implementation).
And in that respect the newly introduced uclamp_mutex does appear to be
a problem.
Also; do you expect these clamp values to be changed often?
> Perhaps we can also end up with some wired
s/wired/weird/, right?
> side-effects like the task being preempted ?
Nothing worse than any other random reschedule would cause.
> Consider also that the uclamp_task_update_active() added by this patch
> not only has lower overhead but it will be use also by cgroups where
> we want to force update all the tasks on a cgroup's clamp change.
I haven't gotten that far; but I would prefer not to have two different
'change' paths in __sched_setscheduler().
Powered by blists - more mailing lists