Message-ID: <87zfbvwx0n.ffs@tglx>
Date: Tue, 19 Aug 2025 10:12:08 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: "bigeasy@...utronix.de" <bigeasy@...utronix.de>
Cc: Prakash Sangappa <prakash.sangappa@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>, "rostedt@...dmis.org"
<rostedt@...dmis.org>, "mathieu.desnoyers@...icios.com"
<mathieu.desnoyers@...icios.com>, "kprateek.nayak@....com"
<kprateek.nayak@....com>, "vineethr@...ux.ibm.com"
<vineethr@...ux.ibm.com>
Subject: Re: [PATCH V7 02/11] sched: Indicate if thread got rescheduled

On Mon, Aug 18 2025 at 15:16, bigeasy@...utronix.de wrote:
> On 2025-08-13 18:56:16 [+0200], Thomas Gleixner wrote:
>> On Wed, Aug 13 2025 at 18:19, bigeasy@...utronix.de wrote:
>> > I spent some time on the review. I tried to test it but for some reason
>> > userland always segfaults. This is not subject to your changes because
>> > param_test (from tools/testing/selftests/rseq) also segfaults. Also on a
>> > Debian v6.12. So this must be something else and maybe glibc related.
>>
>> Hrm. I did not run the rseq tests. I only used the test I wrote, but
>> that works and the underlying glibc uses rseq too, but I might have
>> screwed up there. As I said it's POC. I'm about to send out the polished
>> version, which survives the selftests nicely :)
>
> It was not your code. Everything exploded here. Am I right to assume that
> you had a recent/current Debian Trixie environment for testing? My guess is
> that glibc or gcc got out of sync.

https://lore.kernel.org/lkml/aKODByTQMYFs3WVN@google.com

:)

>> > gcc has __atomic_fetch_and() and __atomic_fetch_or() provided as
>> > built-ins.
>> > There is atomic_fetch_and_explicit() and atomic_fetch_or_explicit()
>> > provided by <stdatomic.h>. Mostly the same magic.
>> >
>> > If you use this like
>> > | static inline int test_and_clear_bit(unsigned long *ptr, unsigned int bit)
>> > | {
>> > | 	return !!(__atomic_fetch_and(ptr, ~(1UL << bit), __ATOMIC_RELAXED) & (1UL << bit));
>> > | }
>> >
>> > then gcc will emit btr. Sadly the lock prefix will be there, too. On the
>> > plus side you would have logic for every architecture.
>>
>> I know, but the whole point is to avoid the LOCK prefix because it's not
>> necessary in this context and slows things down. The only requirement is
>> CPU local atomicity vs. an interrupt/exception/NMI or whatever the CPU
>> uses to mess things up. You need LOCK if you have cross CPU concurrency,
>> which is not the case here. The LOCK is very measurable when you use
>> this pattern with a high frequency and that's what the people who long
>> for this do :)
>
> Sure. You can keep it on x86 and use the generic one in the else case
> rather than abort with an error.
> Looking at arch___test_and_clear_bit() in the kernel, there is x86 with
> its custom implementation. s390 points to generic___test_and_clear_bit()
> which is a surprise. alpha's and sh's aren't atomic so this does not look
> right. hexagon and m68k might be okay and could be candidates.

Right, I'll look into that after I have sorted out the underlying rseq
mess. See the context of the link above. Having that solved will make the
integration of this timeslice muck way simpler (famous last words).

Thanks,
tglx
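
For illustration only, a minimal sketch of the arrangement discussed above: a BTR without the LOCK prefix on x86-64 and the generic GCC built-in everywhere else. This is not taken from the patch series. The name sketch_test_and_clear_bit() is hypothetical, and the x86 branch assumes a compiler with asm flag outputs (__GCC_ASM_FLAG_OUTPUTS__). The unlocked variant is only atomic against interrupts/exceptions on the same CPU, so it must not be used when another CPU can modify the word concurrently.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (name and layout are assumptions, not the code
 * from the series): test and clear one bit in *ptr, return the old bit.
 */
static inline bool sketch_test_and_clear_bit(unsigned long *ptr, unsigned int bit)
{
#if defined(__x86_64__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
	bool old;

	/*
	 * BTR without LOCK: a single read-modify-write instruction, so it
	 * cannot be torn by an interrupt on this CPU. It is NOT atomic
	 * against other CPUs. The old bit value ends up in the carry flag.
	 */
	__asm__ __volatile__("btrq %[nr], %[addr]"
			     : [addr] "+m" (*ptr), "=@ccc" (old)
			     : [nr] "Ir" ((unsigned long)bit)
			     : "memory");
	return old;
#else
	unsigned long mask = 1UL << bit;

	/*
	 * Generic fallback: fully atomic read-modify-write. Correct on
	 * every architecture, but pays for cross-CPU atomicity.
	 */
	return !!(__atomic_fetch_and(ptr, ~mask, __ATOMIC_RELAXED) & mask);
#endif
}
```

Both branches return the previous state of the bit and leave it cleared, so callers do not need to care which one was compiled in.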