[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1289608777.27165.1584042470528.JavaMail.zimbra@efficios.com>
Date: Thu, 12 Mar 2020 15:47:50 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Tejun Heo <tj@...nel.org>
Cc: Li Zefan <lizefan@...wei.com>, cgroups <cgroups@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Valentin Schneider <valentin.schneider@....com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [regression] cpuset: offlined CPUs removed from affinity masks
----- On Mar 12, 2020, at 2:26 PM, Tejun Heo tj@...nel.org wrote:
> Hello,
>
> On Sat, Mar 07, 2020 at 11:06:47AM -0500, Mathieu Desnoyers wrote:
>> Looking into solving this, one key issue seems to get in the way: cpuset
>> appear to care about not allowing to create a cpuset which has no currently
>> active CPU where to run, e.g.:
> ...
>> Clearly, there is an intent that cpusets take the active mask into
>> account to prohibit creating an empty cpuset, but nothing prevents
>> cpu hotplug from creating an empty cpuset.
>>
>> I wonder how to solve this inconsistency ?
>
> Please try cpuset in cgroup2. It shouldn't have those issues.
After figuring how to use cgroup2 (systemd.unified_cgroup_hierarchy=1 boot
parameter helped tremendously), and testing similar scenarios, it indeed
seems to have a much saner behavior than cgroup1.
Considering that the allowed cpu mask is weird wrt cgroup1 and cpu hotplug,
and that cgroup2 allows thread-level granularity, it does not make much sense
to prevent the pin_on_cpu() system call I am working on from pinning
on cpus which are not present in the allowed mask.
I'm currently investigating approaches that would detect situations
where a thread is pinned onto a CPU which is not part of its allowed
mask, and set the task prio at MAX_PRIO-1 (the lowest fair priority
possible) in those cases.
The basic idea is to allow applications to pin to every possible cpu, but
not allow them to use this to consume a lot of cpu time on CPUs they
are not allowed to run.
Thoughts ?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Powered by blists - more mailing lists