linux-kernel - Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire


Message-ID: <87o6rsr16w.fsf@oracle.com>
Date: Tue, 02 Sep 2025 14:31:35 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        bpf@...r.kernel.org, arnd@...db.de, will@...nel.org,
        peterz@...radead.org, akpm@...ux-foundation.org, mark.rutland@....com,
        harisokn@...zon.com, cl@...two.org, ast@...nel.org, memxor@...il.com,
        zhenglifeng1@...wei.com, xueshuai@...ux.alibaba.com,
        joao.m.martins@...cle.com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com
Subject: Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()


Catalin Marinas <catalin.marinas@....com> writes:

> On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
>> diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
>> index a385603436e9..ce8feadeb9a9 100644
>> --- a/arch/arm64/include/asm/rqspinlock.h
>> +++ b/arch/arm64/include/asm/rqspinlock.h
>> @@ -3,6 +3,9 @@
>>  #define _ASM_RQSPINLOCK_H
>>
>>  #include <asm/barrier.h>
>> +
>> +#define res_smp_cond_load_acquire_waiting() arch_timer_evtstrm_available()
>
> More on this below, I don't think we should define it.
>
>> diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
>> index 5ab354d55d82..8de1395422e8 100644
>> --- a/kernel/bpf/rqspinlock.c
>> +++ b/kernel/bpf/rqspinlock.c
>> @@ -82,6 +82,7 @@ struct rqspinlock_timeout {
>>  	u64 duration;
>>  	u64 cur;
>>  	u16 spin;
>> +	u8  wait;
>>  };
>>
>>  #define RES_TIMEOUT_VAL	2
>> @@ -241,26 +242,20 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
>>  }
>>
>>  /*
>> - * Do not amortize with spins when res_smp_cond_load_acquire is defined,
>> - * as the macro does internal amortization for us.
>> + * Only amortize with spins when we don't have a waiting implementation.
>>   */
>> -#ifndef res_smp_cond_load_acquire
>>  #define RES_CHECK_TIMEOUT(ts, ret, mask)                              \
>>  	({                                                            \
>> -		if (!(ts).spin++)                                     \
>> +		if ((ts).wait || !(ts).spin++)		      \
>>  			(ret) = check_timeout((lock), (mask), &(ts)); \
>>  		(ret);                                                \
>>  	})
>> -#else
>> -#define RES_CHECK_TIMEOUT(ts, ret, mask)			      \
>> -	({ (ret) = check_timeout((lock), (mask), &(ts)); })
>> -#endif
>
> IIUC, RES_CHECK_TIMEOUT in the current res_smp_cond_load_acquire() usage
> doesn't amortise the spins, as the comment suggests, but rather the
> calls to check_timeout(). This is fine, it matches the behaviour of
> smp_cond_load_relaxed_timewait() you introduced in the first patch. The
> only difference is the number of spins - 200 (matching poll_idle) vs 64K
> above. Does 200 work for the above?

Works for me. I had added this because there seemed to be a vast gulf between
64K and 200. Happy to drop this.
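
For reference, the gap between the two amortisation intervals can be
illustrated outside the kernel. This is a minimal user-space sketch, not the
kernel code: demo_timeout, demo_check_timeout() and the two counting helpers
are hypothetical names, and the stand-in check only counts how often it is
called. The first helper mirrors the !(ts).spin++ pattern with a u16 counter;
the second assumes a plain counter with a caller-supplied limit such as 200.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for struct rqspinlock_timeout. */
struct demo_timeout {
	uint16_t spin;        /* wraps to 0 every 65536 increments */
	unsigned long checks; /* how many times the expensive check ran */
};

/* Stand-in for check_timeout(): only records that it was called. */
static int demo_check_timeout(struct demo_timeout *ts)
{
	ts->checks++;
	return 0;
}

/* Same shape as RES_CHECK_TIMEOUT: check only when the u16 counter is 0. */
static unsigned long demo_count_checks_u16(unsigned long iterations)
{
	struct demo_timeout ts = { 0, 0 };
	int ret = 0;

	for (unsigned long i = 0; i < iterations; i++) {
		if (!ts.spin++)
			ret = demo_check_timeout(&ts);
		(void)ret;
	}
	return ts.checks;
}

/* Assumed alternative: check once per 'limit' iterations (e.g. 200). */
static unsigned long demo_count_checks_limit(unsigned long iterations,
					     unsigned int limit)
{
	struct demo_timeout ts = { 0, 0 };
	unsigned int n = 0;

	for (unsigned long i = 0; i < iterations; i++) {
		if (n == 0)
			(void)demo_check_timeout(&ts);
		if (++n == limit)
			n = 0;
	}
	return ts.checks;
}
```

Over 200,000 iterations, demo_count_checks_u16() runs the check 4 times and
demo_count_checks_limit() with a limit of 200 runs it 1000 times.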

>>  /*
>>   * Initialize the 'spin' member.
>>   * Set spin member to 0 to trigger AA/ABBA checks immediately.
>>   */
>> -#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
>> +#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
>
> First of all, I don't really like the smp_cond_load_acquire_waiting(),
> that's an implementation detail of smp_cond_load_*_timewait() that
> shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
> also used outside the smp_cond_load_acquire_timewait() condition. The
> (ts).wait check only makes sense when used together with the WFE
> waiting.
>
> I would leave RES_CHECK_TIMEOUT() as is for the stand-alone cases and
> just use check_timeout() in the smp_cond_load_acquire_timewait()
> scenarios. I would also drop the res_smp_cond_load_acquire() macro since
> you have now defined smp_cond_load_acquire_timewait() generically and it
> can be used directly.

Sounds good.
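
A rough user-space model of that split, assuming the waiting primitive owns
the amortisation so that callers can pass the plain timeout check. All names
here (demo_cond_load_acquire_timewait(), demo_budget, the callbacks) are
hypothetical and do not reflect the kernel macro's actual signature; only the
limit of 200 comes from the discussion above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SPIN_LIMIT 200u /* spin count mentioned in the thread */

typedef bool (*demo_cond_fn)(uint32_t val);
typedef int (*demo_time_check_fn)(void *ctx);

/*
 * Hypothetical timed conditional wait: spin until cond(val) holds or
 * time_check() returns nonzero. The time check is evaluated once per
 * DEMO_SPIN_LIMIT failed iterations, so the caller does not amortise.
 */
static uint32_t demo_cond_load_acquire_timewait(_Atomic uint32_t *ptr,
						demo_cond_fn cond,
						demo_time_check_fn time_check,
						void *ctx)
{
	unsigned int n = 0;
	uint32_t val;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_acquire);
		if (cond(val))
			break;
		if (++n < DEMO_SPIN_LIMIT)
			continue;
		n = 0;
		if (time_check(ctx))
			break;
	}
	return val;
}

/* Stand-in for check_timeout(): reports a timeout after max_calls calls. */
struct demo_budget {
	unsigned int calls;
	unsigned int max_calls;
};

static int demo_check_timeout(void *ctx)
{
	struct demo_budget *b = ctx;

	return ++b->calls >= b->max_calls;
}

/* Example condition: the lock word has been released. */
static bool demo_is_unlocked(uint32_t val)
{
	return val == 0;
}
```

With the lock word held at 1 and max_calls set to 3, the wait returns 1 after
three calls to demo_check_timeout(), that is, after 600 failed polls. With the
lock word at 0 it returns immediately without calling the check at all.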

--
ankur