Message-ID: <ZzYxv2RfDwegDMEf@arm.com>
Date: Thu, 14 Nov 2024 17:22:07 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-pm@...r.kernel.org,
kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
will@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
pbonzini@...hat.com, vkuznets@...hat.com, rafael@...nel.org,
daniel.lezcano@...aro.org, peterz@...radead.org, arnd@...db.de,
lenb@...nel.org, mark.rutland@....com, harisokn@...zon.com,
mtosatti@...hat.com, sudeep.holla@....com, maz@...nel.org,
misono.tomohiro@...itsu.com, maobibo@...ngson.cn,
zhenglifeng1@...wei.com, joao.m.martins@...cle.com,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v9 01/15] asm-generic: add barrier
smp_cond_load_relaxed_timeout()
On Fri, Nov 08, 2024 at 11:41:08AM -0800, Christoph Lameter (Ampere) wrote:
> On Thu, 7 Nov 2024, Ankur Arora wrote:
> > > Calling the clock retrieval function repeatedly should be fine and is
> > > typically done in user space as well as in kernel space for functions that
> > > need to wait short time periods.
> >
> > The problem is that you might have multiple CPUs polling in idle
> > for prolonged periods of time. And, so you want to minimize
> > your power/thermal envelope.
>
> On ARM that maps to YIELD which does not do anything for the power
> envelope AFAICT. It switches to the other hyperthread.

The issue is not necessarily arm64 but poll_idle() on other
architectures like x86 where, at the end of this series, they still call
cpu_relax() in a loop and check local_clock() every 200 or so
iterations. So I wouldn't want to revert the improvement in 4dc2375c1a4e
("cpuidle: poll_state: Avoid invoking local_clock() too often").

I agree that the 200 iterations here is pretty arbitrary: it was
made up for poll_idle() specifically, and it could increase the
wait period in other situations (or on other architectures).
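
To make the pattern concrete, here is a rough user-space sketch of a
poll loop that only reads the clock once every N iterations. It is not
the kernel code: SPIN_COUNT, now_ns(), relax_hint() and
poll_with_timeout() are hypothetical names, clock_gettime() stands in
for local_clock(), and relax_hint() is only a compiler barrier standing
in for cpu_relax().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: number of relax iterations between clock reads. */
#define SPIN_COUNT 200

/* Stand-in for local_clock(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Stand-in for cpu_relax(): a compiler barrier only, no CPU hint. */
static inline void relax_hint(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Spin until *flag is non-zero or roughly timeout_ns has elapsed.
 * Returns true if the flag was seen set, false on timeout.
 * *clock_reads is incremented once per clock read.
 */
static bool poll_with_timeout(atomic_int *flag, uint64_t timeout_ns,
			      unsigned long *clock_reads)
{
	uint64_t start = now_ns();
	unsigned int spins = 0;

	for (;;) {
		if (atomic_load_explicit(flag, memory_order_relaxed))
			return true;

		relax_hint();
		if (++spins < SPIN_COUNT)
			continue;

		/* Only look at the clock every SPIN_COUNT iterations. */
		spins = 0;
		(*clock_reads)++;
		if (now_ns() - start >= timeout_ns)
			return false;
	}
}
```

Since the deadline is only checked every SPIN_COUNT iterations, the
loop can overshoot the timeout by up to that many relax iterations, and
how long that is depends on what the relax primitive costs on a given
CPU.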

OTOH, I'm not sure we want to make this API too complex if the only
user for a while is going to be poll_idle(). We could add a comment that
the timeout granularity can be pretty coarse and architecture dependent
(200 cpu_relax() calls in one deployment, 100us on arm64 with WFE).

--
Catalin