lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87tt3wqwt7.fsf@oracle.com>
Date: Mon, 30 Jun 2025 22:55:48 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Ankur Arora <ankur.a.arora@...cle.com>
Cc: "Christoph Lameter (Ampere)" <cl@...two.org>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        bpf@...r.kernel.org, arnd@...db.de, catalin.marinas@....com,
        will@...nel.org, peterz@...radead.org, akpm@...ux-foundation.org,
        mark.rutland@....com, harisokn@...zon.com, ast@...nel.org,
        memxor@...il.com, zhenglifeng1@...wei.com, xueshuai@...ux.alibaba.com,
        joao.m.martins@...cle.com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com
Subject: Re: [PATCH v3 5/5] arm64: barrier: Handle waiting in
 smp_cond_load_relaxed_timewait()


Ankur Arora <ankur.a.arora@...cle.com> writes:

> Christoph Lameter (Ampere) <cl@...two.org> writes:
>
>> On Thu, 26 Jun 2025, Ankur Arora wrote:
>>
>>> @@ -222,6 +223,53 @@ do {									\
>>>  #define __smp_timewait_store(ptr, val)					\
>>>  		__cmpwait_relaxed(ptr, val)
>>>
>>> +/*
>>> + * Redefine ARCH_TIMER_EVT_STREAM_PERIOD_US locally to avoid include hell.
>>> + */
>>> +#define __ARCH_TIMER_EVT_STREAM_PERIOD_US 100UL
>>> +extern bool arch_timer_evtstrm_available(void);
>>> +
>>> +static inline u64 ___smp_cond_spinwait(u64 now, u64 prev, u64 end,
>>> +				       u32 *spin, bool *wait, u64 slack);
>>> +/*
>>> + * To minimize time spent spinning, we want to allow a large overshoot.
>>> + * So, choose a default slack value of the event-stream period.
>>> + */
>>> +#define SMP_TIMEWAIT_DEFAULT_US __ARCH_TIMER_EVT_STREAM_PERIOD_US
>>> +
>>> +static inline u64 ___smp_cond_timewait(u64 now, u64 prev, u64 end,
>>> +				       u32 *spin, bool *wait, u64 slack)
>>> +{
>>> +	bool wfet = alternative_has_cap_unlikely(ARM64_HAS_WFXT);
>>> +	bool wfe, ev = arch_timer_evtstrm_available();
>>
>> An unitialized and initialized variable on the same line. Maybe separate
>> that. Looks confusing and unusual to me.
>
> Yeah, that makes sense. Will fix.
>
>>> +	u64 evt_period = __ARCH_TIMER_EVT_STREAM_PERIOD_US;
>>> +	u64 remaining = end - now;
>>> +
>>> +	if (now >= end)
>>> +		return 0;
>>> +	/*
>>> +	 * Use WFE if there's enough slack to get an event-stream wakeup even
>>> +	 * if we don't come out of the WFE due to natural causes.
>>> +	 */
>>> +	wfe = ev && ((remaining + slack) > evt_period);
>>
>> The line above does not matter for the wfet case and the calculation is
>> ignored. We hope that in the future wfet will be the default case.
>
> My assumption was that the compiler would only evaluate the evt_period
> comparison if the wfet check evaluates false and it does need to check
> if wfe is true or not.
>
> But let me look at the generated code.

So, I was too optimistic. The compiler doesn't do this optimization at
all. I'm guessing that's because of at least two reasons:

The wfet case is unlikely:

   bool wfet = alternative_has_cap_unlikely(ARM64_HAS_WFXT);

And, I'm testing for:

   if (wfe || wfet)

This last check should have been "if (wfet || wfe)"" since the first
clause is pretty much free.

But even with that fixed, I don't think the compiler will do the right thing.

I could condition the computation on arch_timer_evtstrm_available(), but I
would prefer to keep this code straightforward since it will only be
evaluated infrequently.

But, let me see what I can do.

--
ankur

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ