Message-ID: <e26cab96-efa5-4e70-88d7-66a1f3b10750@efficios.com>
Date: Tue, 27 Jan 2026 15:34:02 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: David Matlack <dmatlack@...gle.com>, Thomas Gleixner <tglx@...nel.org>,
Dmitry Vyukov <dvyukov@...gle.com>, Marco Elver <elver@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, LKML
<linux-kernel@...r.kernel.org>, Michael Jeanson <mjeanson@...icios.com>,
Jens Axboe <axboe@...nel.dk>, "Paul E. McKenney" <paulmck@...nel.org>,
X86 ML <x86@...nel.org>, Sean Christopherson <seanjc@...gle.com>,
Wei Liu <wei.liu@...nel.org>
Subject: Re: SIGSEGVs after 39a167560a61 ("rseq: Optimize event setting")

+CC Dmitry and Marco.

On 2026-01-26 17:35, Mathieu Desnoyers wrote:
> On 2026-01-26 17:27, David Matlack wrote:
>> On Mon, Jan 26, 2026 at 1:51 PM Thomas Gleixner <tglx@...nel.org> wrote:
> [...]
>>>> Perhaps this is the nudge Google needs to go fix this.
>>>
>>> The real question is whether the segfault is triggered from the rseq
>>> sanity checks or if the application segfaults because it relies on
>>> something which is not guaranteed by the ABI. As this is
>>> secret sauce, I can't tell.
>>
>> I tried enabling /debug/rseq/debug but many of the daemons on my host
>> started crash-looping so much that I wasn't able to even run my test.
>>
>> Next I tried disabling CONFIG_RSEQ and as expected the issue went
>> away. I will use that for now to unblock my VFIO testing.
>>
>> I have reported the tcmalloc regression internally within Google to
>> figure out what next step they want to take.
>
> Note that I've proposed to help out the tcmalloc people a few
> times in the past years to fix this, but I've been told that
> it was not a priority on their end, and that they would not be
> able to even test whatever I would come up with.
>
> Thanks,
>
> Mathieu
>
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com