[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ttddrr1r.fsf@oracle.com>
Date: Tue, 15 Oct 2024 14:32:00 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-pm@...r.kernel.org,
kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, will@...nel.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, pbonzini@...hat.com,
wanpengli@...cent.com, vkuznets@...hat.com, rafael@...nel.org,
daniel.lezcano@...aro.org, peterz@...radead.org, arnd@...db.de,
lenb@...nel.org, mark.rutland@....com, harisokn@...zon.com,
mtosatti@...hat.com, sudeep.holla@....com, cl@...two.org,
misono.tomohiro@...itsu.com, maobibo@...ngson.cn,
joao.m.martins@...cle.com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com
Subject: Re: [PATCH v8 01/11] cpuidle/poll_state: poll via
smp_cond_load_relaxed()
Catalin Marinas <catalin.marinas@....com> writes:
> On Wed, Sep 25, 2024 at 04:24:15PM -0700, Ankur Arora wrote:
>> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
>> index 9b6d90a72601..fc1204426158 100644
>> --- a/drivers/cpuidle/poll_state.c
>> +++ b/drivers/cpuidle/poll_state.c
>> @@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
>>
>> raw_local_irq_enable();
>> if (!current_set_polling_and_test()) {
>> - unsigned int loop_count = 0;
>> u64 limit;
>>
>> limit = cpuidle_poll_time(drv, dev);
>>
>> while (!need_resched()) {
>> - cpu_relax();
>> - if (loop_count++ < POLL_IDLE_RELAX_COUNT)
>> - continue;
>> -
>> - loop_count = 0;
>> + unsigned int loop_count = 0;
>> if (local_clock_noinstr() - time_start > limit) {
>> dev->poll_time_limit = true;
>> break;
>> }
>> +
>> + smp_cond_load_relaxed(¤t_thread_info()->flags,
>> + VAL & _TIF_NEED_RESCHED ||
>> + loop_count++ >= POLL_IDLE_RELAX_COUNT);
>
> The above is not guaranteed to make progress if _TIF_NEED_RESCHED is
> never set. With the event stream enabled on arm64, the WFE will
> eventually be woken up, loop_count incremented and the condition would
> become true.
That makes sense.
> However, the smp_cond_load_relaxed() semantics require that
> a different agent updates the variable being waited on, not the waiting
> CPU updating it itself.
Right. And, that seems to work well with the semantics of WFE. And,
the event stream (if enabled) has a side effect that allows the exit
from the loop.
> Also note that the event stream can be disabled
> on arm64 on the kernel command line.
Yes, that's a good point. In patch-11 I tried to address that aspect
by only allowing haltpoll to be force loaded.
But, I guess your point is that its not just haltpoll that has a problem,
but also regular polling -- and maybe the right thing to do would be to
disable polling if the event stream is disabled.
> Does the code above break any other architecture?
Me (and others) have so far tested x86, ARM64 (with/without the
event stream), and I believe riscv. I haven't seen any obvious
breakage. But, that's probably because most of the time somebody would
be set TIF_NEED_RESCHED.
> I'd say if you want
> something like this, better introduce a new smp_cond_load_timeout()
> API. The above looks like a hack that may only work on arm64 when the
> event stream is enabled.
I had a preliminary version of smp_cond_load_relaxed_timeout() here:
https://lore.kernel.org/lkml/87edae3a1x.fsf@oracle.com/
Even with an smp_cond_load_timeout(), we would need to fallback to
something like the above for uarchs without WFxT.
> A generic option is udelay() (on arm64 it would use WFE/WFET by
> default). Not sure how important it is for poll_idle() but the downside
> of udelay() that it won't be able to also poll need_resched() while
> waiting for the timeout. If this matters, you could instead make smaller
> udelay() calls. Yet another problem, I don't know how energy efficient
> udelay() is on x86 vs cpu_relax().
>
> So maybe an smp_cond_load_timeout() would be better, implemented with
> cpu_relax() generically and the arm64 would use LDXR, WFE and rely on
> the event stream (or fall back to cpu_relax() if the event stream is
> disabled).
Yeah, something like that might work.
--
ankur
Powered by blists - more mailing lists