Message-ID: <188455ae-edf1-1eba-572b-6c259ddf8d27@loongson.cn>
Date: Wed, 16 Oct 2024 14:20:27 +0800
From: maobibo <maobibo@...ngson.cn>
To: Ankur Arora <ankur.a.arora@...cle.com>,
 Catalin Marinas <catalin.marinas@....com>
Cc: linux-pm@...r.kernel.org, kvm@...r.kernel.org,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 will@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 pbonzini@...hat.com, wanpengli@...cent.com, vkuznets@...hat.com,
 rafael@...nel.org, daniel.lezcano@...aro.org, peterz@...radead.org,
 arnd@...db.de, lenb@...nel.org, mark.rutland@....com, harisokn@...zon.com,
 mtosatti@...hat.com, sudeep.holla@....com, cl@...two.org,
 misono.tomohiro@...itsu.com, joao.m.martins@...cle.com,
 boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v8 01/11] cpuidle/poll_state: poll via
 smp_cond_load_relaxed()



On 2024/10/16 5:32 AM, Ankur Arora wrote:
> 
> Catalin Marinas <catalin.marinas@....com> writes:
> 
>> On Wed, Sep 25, 2024 at 04:24:15PM -0700, Ankur Arora wrote:
>>> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
>>> index 9b6d90a72601..fc1204426158 100644
>>> --- a/drivers/cpuidle/poll_state.c
>>> +++ b/drivers/cpuidle/poll_state.c
>>> @@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
>>>
>>>   	raw_local_irq_enable();
>>>   	if (!current_set_polling_and_test()) {
>>> -		unsigned int loop_count = 0;
>>>   		u64 limit;
>>>
>>>   		limit = cpuidle_poll_time(drv, dev);
>>>
>>>   		while (!need_resched()) {
>>> -			cpu_relax();
>>> -			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
>>> -				continue;
>>> -
>>> -			loop_count = 0;
>>> +			unsigned int loop_count = 0;
>>>   			if (local_clock_noinstr() - time_start > limit) {
>>>   				dev->poll_time_limit = true;
>>>   				break;
>>>   			}
>>> +
>>> +			smp_cond_load_relaxed(&current_thread_info()->flags,
>>> +					      VAL & _TIF_NEED_RESCHED ||
>>> +					      loop_count++ >= POLL_IDLE_RELAX_COUNT);
>>
>> The above is not guaranteed to make progress if _TIF_NEED_RESCHED is
>> never set. With the event stream enabled on arm64, the WFE will
>> eventually be woken up, loop_count incremented and the condition would
>> become true.
> 
> That makes sense.
> 
>> However, the smp_cond_load_relaxed() semantics require that
>> a different agent updates the variable being waited on, not the waiting
>> CPU updating it itself.
> 
> Right. And, that seems to work well with the semantics of WFE. And,
> the event stream (if enabled) has a side effect that allows the exit
> from the loop.
> 
>> Also note that the event stream can be disabled
>> on arm64 on the kernel command line.
> 
> Yes, that's a good point. In patch-11 I tried to address that aspect
> by only allowing haltpoll to be force loaded.
> 
> But, I guess your point is that it's not just haltpoll that has a problem,
> but also regular polling -- and maybe the right thing to do would be to
> disable polling if the event stream is disabled.
> 
>> Does the code above break any other architecture?
> 
> I (and others) have so far tested x86, ARM64 (with/without the
> event stream), and I believe riscv. I haven't seen any obvious
> breakage. But, that's probably because most of the time somebody would
> have set TIF_NEED_RESCHED.
> 
>> I'd say if you want
>> something like this, better introduce a new smp_cond_load_timeout()
>> API. The above looks like a hack that may only work on arm64 when the
>> event stream is enabled.
> 
> I had a preliminary version of smp_cond_load_relaxed_timeout() here:
>   https://lore.kernel.org/lkml/87edae3a1x.fsf@oracle.com/
> 
> Even with an smp_cond_load_timeout(), we would need to fall back to
> something like the above for uarchs without WFxT.
> 
>> A generic option is udelay() (on arm64 it would use WFE/WFET by
>> default). Not sure how important it is for poll_idle() but the downside
>> of udelay() that it won't be able to also poll need_resched() while
>> waiting for the timeout. If this matters, you could instead make smaller
>> udelay() calls. Yet another problem, I don't know how energy efficient
>> udelay() is on x86 vs cpu_relax().
>>
>> So maybe an smp_cond_load_timeout() would be better, implemented with
>> cpu_relax() generically and the arm64 would use LDXR, WFE and rely on
>> the event stream (or fall back to cpu_relax() if the event stream is
>> disabled).
> 
> Yeah, something like that might work.
Yeah, I like the idea of an smp_cond_load_timeout() method. If the
generic smp_cond_load_timeout() does not meet an architecture's
requirements, the architecture can define its own. And this approach
keeps the modifications to the original code logic small.

Regards
Bibo Mao
> 
> --
> ankur
> 

