Message-ID: <20230324143247.GA27199@willie-the-truck>
Date: Fri, 24 Mar 2023 14:32:50 +0000
From: Will Deacon <will@...nel.org>
To: Waiman Long <longman@...hat.com>
Cc: Michal Koutný <mkoutny@...e.com>,
Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Shuah Khan <shuah@...nel.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 3/5] cgroup/cpuset: Find another usable CPU if none found
in current cpuset
On Fri, Mar 17, 2023 at 10:59:26AM -0400, Waiman Long wrote:
> On 3/17/23 08:27, Michal Koutný wrote:
> > On Tue, Mar 14, 2023 at 04:22:06PM -0400, Waiman Long <longman@...hat.com> wrote:
> > > Some arm64 systems can have asymmetric CPUs where certain tasks are only
> > > runnable on a selected subset of CPUs.
> > Ah, I'm catching up.
> >
> > > This information is not captured in the cpuset. As a result,
> > > task_cpu_possible_mask() may return a mask that has no overlap with
> > > effective_cpus causing new_cpus to become empty.
> > I can see that historically, there was an approach of terminating
> > unaccommodable tasks:
> > 94f9c00f6460 ("arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores")
> > the removal of killing had been made possible with
> > df950811f4a8 ("arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system").
> >
> > That gives two other alternatives to affinity modification:
> > 2) kill such tasks (not unlike OOM upon memory.max reduction),
> > 3) reject cpuset reduction (violates cgroup v2 delegation).
> >
> > What do you think about 2)?
>
> Yes, killing it is one possible solution.
>
> (3) doesn't work if the affinity change is due to hot cpu removal. So that
> leaves this patch or (2) as the only alternative. I would like to hear what
> Will and Tejun think about it.

The main constraint from the Android side (the lucky ecosystem where these
SoCs tend to show up) is that existing userspace (including 32-bit binaries)
continues to function without modification. So approaches such as killing
tasks or rejecting system calls tend not to work as well, since you
inevitably get divergent behaviour leading to functional breakage rather
than e.g. performance anomalies.

Having said that, the behaviour we currently have in mainline seems to
be alright, so please don't go out of your way to accommodate these SoCs.
I'm mainly just concerned about introducing any regressions, which is why
I ran my tests on this series.
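
For anyone following the thread, the failure mode in the quoted patch
description (the task's possible CPUs not overlapping effective_cpus, so
new_cpus ends up empty) can be sketched in user space. This is only an
illustration under stated assumptions, not the kernel code or the patch
itself: sketch_mask_t, pick_new_cpus() and fallback_cpus are hypothetical
names, and plain 64-bit bitmasks stand in for struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is included. */
typedef uint64_t sketch_mask_t;

/*
 * Illustrative only. Intersect the cpuset's effective CPUs with the CPUs
 * the task is able to run on. If the intersection is empty, retry against
 * a wider fallback mask (an assumption here, e.g. the CPUs of an ancestor
 * cpuset) rather than handing back an empty mask.
 */
static sketch_mask_t pick_new_cpus(sketch_mask_t effective_cpus,
                                   sketch_mask_t task_possible,
                                   sketch_mask_t fallback_cpus)
{
    sketch_mask_t new_cpus;

    /* Assumed precondition: a task can run on at least one CPU. */
    assert(task_possible != 0);

    new_cpus = effective_cpus & task_possible;
    if (new_cpus == 0)
        new_cpus = fallback_cpus & task_possible;

    /* Still zero if the fallback does not overlap either. */
    return new_cpus;
}
```

With CPUs 0-1 as the only 32-bit-capable cores (task_possible = 0x03) and
a cpuset restricted to CPUs 2-3 (effective_cpus = 0x0C), the plain
intersection is empty, and only the fallback step yields a usable mask.
The alternatives discussed above (killing the task or rejecting the
change) would replace that fallback step.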
Cheers,
Will