Message-ID: <bb701616-26b8-41f0-8a19-0f76b2a64deb@suse.cz>
Date: Wed, 23 Apr 2025 10:03:15 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Dave Airlie <airlied@...il.com>, Shakeel Butt <shakeel.butt@...ux.dev>,
Sebastian Sewior <bigeasy@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Alexei Starovoitov <ast@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 6.15-rc3
On 4/23/25 09:14, Vlastimil Babka wrote:
> On 4/23/25 01:37, Alexei Starovoitov wrote:
>> On Tue, Apr 22, 2025 at 4:01 PM Dave Airlie <airlied@...il.com> wrote:
>>>
>>> > Alexei Starovoitov (2):
>>> > locking/local_lock, mm: replace localtry_ helpers with
>>> > local_trylock_t type
>>>
>>> This seems to have upset some phoronix nginx workload
>>> https://www.phoronix.com/review/linux-615-nginx-regression/2
>>
>> 3x regression? wow.
>> Thanks for heads up.
>> I'm staring at the patch and don't see it.
>> Adding more experts.
>
> Incidentally my work on slab sheaves using local_trylock() got to a phase
> yesterday when after rebasing on rc3 and some refactoring I was looking at
> sheaf stats that showed the percpu sheaves were used exactly once per cpu,
> and other attempts failed. Which would be explained by local_trylock()
> failing. In the context of rc3 itself it would mean the memcg stocks aren't
> used at all because they can't be try-locked. Which could make benchmarks
> unhappy of course, although surprising that it would be that much.
>
> What I suspect now is the _Generic() part doesn't work as expected. So consider:
>
> local_trylock() (or _irqsave variant) has no _Generic() part, does the
> "if (READ_ONCE(tl->acquired))" and "WRITE_ONCE(tl->acquired, 1)" directly,
> succeeds the first attempt on each cpu where executed.
>
> local_unlock() goes via __local_lock_release() and since the _Generic() part
> there doesn't work, we don't do WRITE_ONCE(tl->acquired, 0); so it stays 1.
>
> preempt or irq handling is fine so nothing like lockdep, preempt debugging,
> watchdogs gets suspicious, just the cpu can never succeed local_trylock() again
>
> local_lock(_irqsave()) uses __local_lock_acquire() which also has a
> _Generic() part but since it doesn't work, the "lockdep_assert(tl->acquired
> == 0);" there isn't triggered either
>
> In fact I've put BUG() in the _Generic() sections of _acquire() and _release()
> and it didn't trigger, which would prove the code isn't executed. But I don't
> know why _Generic() doesn't recognize the correct type there.
>
> --- a/include/linux/local_lock_internal.h
> +++ b/include/linux/local_lock_internal.h
> @@ -104,6 +104,7 @@ do { \
> _Generic((lock), \
> local_trylock_t *: ({ \
> lockdep_assert(tl->acquired == 0); \
> + BUG(); \
> WRITE_ONCE(tl->acquired, 1); \
> }), \
> default:(void)0); \
> @@ -173,6 +174,7 @@ do { \
> _Generic((lock), \
> local_trylock_t *: ({ \
> lockdep_assert(tl->acquired == 1); \
> + BUG(); \
> WRITE_ONCE(tl->acquired, 0); \
> }), \
> default:(void)0); \
>
Oh I see, replacing the default: with "local_lock_t *:", which is the only
other expected type, forces the compiler to actually tell me what's wrong:
./include/linux/local_lock_internal.h:174:26: error: ‘_Generic’ selector of
type ‘__seg_gs local_lock_t *’ is not compatible with any association
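For anyone who wants to see the same effect outside the kernel, here is a
minimal userspace sketch. It is an analogy, not the kernel code: the struct
layouts and the is_trylock() macro are made up for illustration, and "const"
stands in for the __seg_gs qualifier so that it builds on any C11 compiler.
The point is only that a qualifier on the pointed-to type makes the selector
incompatible with the unqualified "local_trylock_t *" association, so
_Generic() silently takes the default: branch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel types; layouts are illustrative only. */
typedef struct { int dummy; } local_lock_t;
typedef struct { local_lock_t llock; int acquired; } local_trylock_t;

/*
 * Hypothetical helper: evaluates to 1 when the selector's type is exactly
 * "local_trylock_t *", and to 0 when it falls through to default:.
 */
#define is_trylock(ptr) _Generic((ptr), local_trylock_t *: 1, default: 0)

/* Unqualified pointer: matches the local_trylock_t * association. */
int plain_pointer_matches(void)
{
	local_trylock_t tl = { .acquired = 0 };
	local_trylock_t *p = &tl;

	return is_trylock(p);
}

/*
 * Pointer to a qualified type: "const local_trylock_t *" is not compatible
 * with "local_trylock_t *", so default: is selected. "const" is used here
 * as a portable analog of the __seg_gs qualifier from the error message.
 */
int qualified_pointer_matches(void)
{
	local_trylock_t tl = { .acquired = 0 };
	const local_trylock_t *p = &tl;

	return is_trylock(p);
}

/* The same two cases checked at compile time. */
_Static_assert(is_trylock((local_trylock_t *)0) == 1,
	       "unqualified pointer selects the trylock branch");
_Static_assert(is_trylock((const local_trylock_t *)0) == 0,
	       "qualified pointer falls through to default");
```

With default: present the mismatch is silent, which matches the BUG() never
firing above. Dropping default: from the association list turns it into a
compile error, which is how the __seg_gs qualifier showed up.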