Message-ID: <5bc41342-5ba6-68e9-8315-9e5cef65d102@redhat.com>
Date: Mon, 3 Jul 2023 10:55:02 -0400
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org, Phil Auld <pauld@...hat.com>,
Brent Rowsell <browsell@...hat.com>,
Peter Hunt <pehunt@...hat.com>
Subject: Re: [PATCH] sched/core: Use empty mask to reset cpumasks in
sched_setaffinity()
On 7/3/23 06:26, Peter Zijlstra wrote:
> On Wed, Jun 28, 2023 at 05:16:37PM -0400, Waiman Long wrote:
>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask"), user provided CPU affinity via sched_setaffinity(2) is
>> preserved even if the task is being moved to a different cpuset. However,
>> that affinity is also being inherited by any subsequently created child
>> processes which may not want or be aware of that affinity.
>>
>> One way to solve this problem is to provide a way to back off from
>> that user provided CPU affinity. This patch implements such a scheme
>> by using an empty cpumask to signal a reset of the cpumasks to the
>> default as allowed by the current cpuset.
>>
>> Before this patch, passing in an empty cpumask to sched_setaffinity(2)
>> will return an EINVAL error. With this patch, an error will no longer
>> be returned. Instead, the user_cpus_ptr that stores the user provided
>> affinity, if set, will be cleared and the task's CPU affinity will be
>> reset to that of the current cpuset. This reverts the cpumask change
>> done by all the previous sched_setaffinity(2) calls.
>>
> This is a user visible ABI change -- but with very limited motivation.
> Why do we want this? Who will use this?
Yes, this is a user-visible ABI change, but it should be backward
compatible: I doubt there are applications out there that depend on
sched_setaffinity() returning an error when passed an empty cpumask.

Our OpenShift team has actually hit a problem with the recent persistent
user-provided CPU affinity change. They were relying on the fact that
moving a task to a different cpuset resets its CPU affinity to the
cpuset default, which is no longer true. That is the main reason behind
this patch: to provide a way to reset CPU affinity to the cpuset default.

I am thinking of requesting a sched_setaffinity(2) manpage update to
document the persistent user-provided CPU affinity change and the way to
reset it after this patch is merged upstream.
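
For illustration, the userspace side of the proposed reset could look
roughly like the sketch below. It assumes a kernel with this patch
applied; on a kernel without it, the same call is expected to fail with
EINVAL, as described in the changelog. The enum and the two helper names
(classify_reset, reset_affinity) are hypothetical and exist only for
this example; they are not part of the patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not part of the patch. */
enum reset_result {
    RESET_OK,          /* kernel accepted the empty mask (patched behaviour) */
    RESET_UNSUPPORTED, /* EINVAL: kernel rejects an empty mask (no patch) */
    RESET_FAILED       /* any other error, e.g. ESRCH or EPERM */
};

/* Map a sched_setaffinity() return value and errno to a result code. */
static enum reset_result classify_reset(int rc, int err)
{
    if (rc == 0)
        return RESET_OK;
    return (err == EINVAL) ? RESET_UNSUPPORTED : RESET_FAILED;
}

/*
 * Ask the kernel to drop the user-requested affinity of @pid by passing
 * an all-zero cpumask. This only resets anything on a kernel that
 * implements the behaviour proposed in this patch.
 */
static enum reset_result reset_affinity(pid_t pid)
{
    cpu_set_t empty;
    int rc;

    CPU_ZERO(&empty);
    rc = sched_setaffinity(pid, sizeof(empty), &empty);
    return classify_reset(rc, rc ? errno : 0);
}
```

A container runtime could call reset_affinity(pid) after moving a task
into its new cpuset and treat RESET_UNSUPPORTED as "old kernel, nothing
to undo this way".
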
Cheers,
Longman