Message-ID: <875xdnk0ju.fsf@oracle.com>
Date: Fri, 12 Sep 2025 11:06:45 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>,
        Kumar Kartikeya Dwivedi
 <memxor@...il.com>,
        linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, bpf@...r.kernel.org,
        arnd@...db.de, will@...nel.org, peterz@...radead.org,
        akpm@...ux-foundation.org, mark.rutland@....com, harisokn@...zon.com,
        cl@...two.org, ast@...nel.org, zhenglifeng1@...wei.com,
        xueshuai@...ux.alibaba.com, joao.m.martins@...cle.com,
        boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v5 5/5] rqspinlock: Use smp_cond_load_acquire_timeout()

Catalin Marinas <catalin.marinas@....com> writes:
> On Thu, Sep 11, 2025 at 02:58:22PM -0700, Ankur Arora wrote:
>>
>> Kumar Kartikeya Dwivedi <memxor@...il.com> writes:
>>
>> > On Thu, 11 Sept 2025 at 16:32, Catalin Marinas <catalin.marinas@....com> wrote:
>> >>
>> >> On Wed, Sep 10, 2025 at 08:46:55PM -0700, Ankur Arora wrote:
>> >> > Switch out the conditional load interfaces used by rqspinlock
>> >> > to smp_cond_load_acquire_timeout().
>> >> > This interface handles the timeout check explicitly and does any
>> >> > necessary amortization, so use check_timeout() directly.
>> >>
>> >> It's worth mentioning that the default smp_cond_load_acquire_timeout()
>> >> implementation (without hardware support) only spins 200 times instead
>> >> of 16K times in the rqspinlock code. That's probably fine but it would
>> >> be good to have confirmation from Kumar or Alexei.
>> >>
>> >
>> > This looks good, but I would still redefine the spin count from 200 to
>> > 16k for rqspinlock.c, especially because we need to keep
>> > RES_CHECK_TIMEOUT around which still uses 16k spins to amortize
>> > check_timeout.
>>
>> By my count that amounts to ~100us per check_timeout() on x86
>> systems I've tested with cpu_relax(). Which seems quite reasonable.
>>
>> 16k also seems safer on CPUs where cpu_relax() is basically a NOP.
>
> Does this spin count work for poll_idle()? I don't remember where the
> 200 value came from.

It just reuses the value of POLL_IDLE_RELAX_COUNT, which is defined as
200.

For the poll_idle() case I don't think a value of 200 makes sense
for all architectures, so they'll need to redefine it (before defining
ARCH_HAS_OPTIMIZED_POLL, which gates poll_idle()).
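
To make the amortization and the "redefine it" point concrete, here is a
rough userspace sketch. It is not the kernel implementation: SPIN_COUNT,
relax(), now_ns(), time_checks and cond_load_acquire_timeout() are all
hypothetical names made up for illustration, and relax() only stands in
for cpu_relax(). The idea is just that the clock is read once per
SPIN_COUNT iterations, and that a user of the loop can override the count
before the default definition is seen.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#endif

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical knob: iterations between timeout checks. A caller that
 * wants a different amortization (say 16 * 1024) defines SPIN_COUNT
 * before this point; otherwise the default of 200 applies.
 */
#ifndef SPIN_COUNT
#define SPIN_COUNT 200
#endif

/* Stand-in for cpu_relax(): only a compiler barrier in this sketch. */
static inline void relax(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Number of clock reads so far, to make the amortization observable. */
static unsigned long time_checks;

/*
 * Spin until *ptr is non-zero or timeout_ns has elapsed. Returns the
 * last value loaded, so 0 means the wait timed out.
 */
static uint32_t cond_load_acquire_timeout(_Atomic uint32_t *ptr,
					  uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;
	unsigned int spins = 0;
	uint32_t val;

	assert(ptr);

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_acquire);
		if (val)
			break;

		relax();

		/* Only pay for a clock read once every SPIN_COUNT spins. */
		if (++spins < SPIN_COUNT)
			continue;

		spins = 0;
		time_checks++;
		if (now_ns() >= deadline)
			break;
	}

	return val;
}
```

Building this with -DSPIN_COUNT=16384 instead of the default mimics the
16k amortization rqspinlock uses today. The trade-off is the one
discussed above: a larger count means fewer clock reads but a coarser
timeout, and how coarse depends on how long the relax step takes on a
given CPU.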
--
ankur