Message-ID: <87zfm9z812.fsf@oracle.com>
Date: Fri, 08 Nov 2024 14:15:53 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-pm@...r.kernel.org,
        kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        catalin.marinas@....com, will@...nel.org, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, hpa@...or.com, pbonzini@...hat.com,
        vkuznets@...hat.com, rafael@...nel.org, daniel.lezcano@...aro.org,
        peterz@...radead.org, arnd@...db.de, lenb@...nel.org,
        mark.rutland@....com, harisokn@...zon.com, mtosatti@...hat.com,
        sudeep.holla@....com, maz@...nel.org, misono.tomohiro@...itsu.com,
        maobibo@...ngson.cn, zhenglifeng1@...wei.com,
        joao.m.martins@...cle.com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com
Subject: Re: [PATCH v9 01/15] asm-generic: add barrier
 smp_cond_load_relaxed_timeout()


Christoph Lameter (Ampere) <cl@...two.org> writes:

> On Thu, 7 Nov 2024, Ankur Arora wrote:
>
>> > Calling the clock retrieval function repeatedly should be fine and is
>> > typically done in user space as well as in kernel space for functions that
>> > need to wait short time periods.
>>
>> The problem is that you might have multiple CPUs polling in idle
>> for prolonged periods of time. And, so you want to minimize
>> your power/thermal envelope.
>
> On ARM that maps to YIELD which does not do anything for the power
> envelope AFAICT. It switches to the other hyperthread.

Agreed. For arm64 patch-5 adds a specialized version.

For the fallback case when we don't have an event stream, the
arm64 version does use the same cpu_relax() loop, but that's
not a configuration we expect to see in production.

>> For instance see commit 4dc2375c1a4e "cpuidle: poll_state: Avoid
>> invoking local_clock() too often" which originally added a similar
>> rate limit to poll_idle() where they saw exactly that issue.
>
> Looping w/o calling local_clock may increase the wait period etc.

Yeah. I don't think that's a real problem for the poll_idle()
case as the only thing waiting on the other side of the possibly
delayed timer is a deeper idle state.

But for any other potential users the looping duration might be
too long: the generated code for x86 will execute around 200 * 7
instructions before checking the timer, so the worst-case delay
is somewhere around 1-2us.

I'll note that in the comment around smp_cond_time_check_count
just to warn any future users.
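
To make the trade-off concrete, here is a rough userspace sketch of
the rate-limited polling pattern discussed above. This is not the
code from the patch: poll_flag_timeout(), now_ns() and
SPIN_CHECK_COUNT are hypothetical names, and the value 200 is only
borrowed from the iteration count mentioned above. In the kernel,
cpu_relax() would sit in the loop body and the clock would be
local_clock().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: number of polls between two clock reads. */
#define SPIN_CHECK_COUNT 200

/* Monotonic time in nanoseconds (userspace stand-in for local_clock()). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll *flag until it becomes nonzero or timeout_ns has elapsed.
 * The clock is read only once every SPIN_CHECK_COUNT polls, so a
 * timeout can be noticed late by up to SPIN_CHECK_COUNT iterations.
 * Returns true if the flag was seen set, false on timeout.
 */
static bool poll_flag_timeout(const volatile int *flag, uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;
	unsigned int spins = 0;

	for (;;) {
		if (*flag)
			return true;

		/* The kernel version would call cpu_relax() here. */
		if (++spins < SPIN_CHECK_COUNT)
			continue;

		/* Rate-limited time check. */
		spins = 0;
		if (now_ns() >= deadline)
			return false;
	}
}
```

The point of the counter is that the comparatively expensive clock
read is amortized over many cheap polls, at the cost of the extra
latency described above.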

> For power saving most arches have special instructions like ARMS
> WFE/WFET. These are then causing more accurate wait times than the looping
> thing?

Definitely true for WFET. WFE can still overshoot, because the
event stream has a period of 100us.
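
As a back-of-the-envelope model of that overshoot (a simplified
sketch, not anything from the patch): assume the waiter is only woken
by event stream ticks, that ticks land on multiples of the period,
and that no other event arrives. The helper names and the constant
below are hypothetical.

```c
#include <stdint.h>

/* Event stream period in microseconds, as discussed above. */
#define EVENT_STREAM_PERIOD_US 100ull

/*
 * Time at which a WFE-based waiter would notice that deadline_us has
 * passed: the first event stream tick at or after the deadline.
 * Assumes ticks occur at multiples of EVENT_STREAM_PERIOD_US.
 */
static uint64_t wfe_wakeup_us(uint64_t deadline_us)
{
	uint64_t ticks = (deadline_us + EVENT_STREAM_PERIOD_US - 1) /
			 EVENT_STREAM_PERIOD_US;

	return ticks * EVENT_STREAM_PERIOD_US;
}

/* Overshoot past the deadline; always below one full period. */
static uint64_t wfe_overshoot_us(uint64_t deadline_us)
{
	return wfe_wakeup_us(deadline_us) - deadline_us;
}
```

Under those assumptions the overshoot is bounded by just under one
period, whereas WFET can be given the deadline directly.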

--
ankur
