Message-ID: <87frw2axv0.ffs@tglx>
Date: Thu, 04 Apr 2024 00:24:19 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: John Stultz <jstultz@...gle.com>
Cc: Oleg Nesterov <oleg@...hat.com>, Marco Elver <elver@...gle.com>, Peter
 Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, "Eric W.
 Biederman" <ebiederm@...ssion.com>, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org, Dmitry Vyukov <dvyukov@...gle.com>,
 kasan-dev@...glegroups.com, Edward Liaw <edliaw@...gle.com>, Carlos Llamas
 <cmllamas@...gle.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH v6 1/2] posix-timers: Prefer delivery of signals to the
 current thread

On Wed, Apr 03 2024 at 12:35, John Stultz wrote:
> On Wed, Apr 3, 2024 at 12:10 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>>
>> On Wed, Apr 03 2024 at 11:16, John Stultz wrote:
>> > On Wed, Apr 3, 2024 at 9:32 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>> > Thanks for this, Thomas!
>> >
>> > Just FYI: testing with 6.1, the test no longer hangs, but I don't see
>> > the SKIP behavior. It just fails:
>> > not ok 6 check signal distribution
>> > # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
>> >
>> > I've not had time yet to dig into what's going on, but let me know if
>> > you need any further details.
>>
>> That's weird. I ran it on my laptop with 6.1.y ...
>>
>> What kind of machine is that?
>
> I was running it in a VM.
>
> Interestingly with 64cpus it sometimes will do the skip behavior, but
> with 4 cpus it seems to always fail.

Duh, yes. The problem is that any thread might grab the signal as it is
process wide.

What was I thinking? Not much obviously.

The distribution mechanism is only targeting the wakeup at signal
queuing time and therefore avoids the wakeup of idle tasks. But it does
not guarantee that the signal is evenly distributed to the threads on
actual signal delivery.

Even with the change to stop the worker threads when they got a signal
it's not guaranteed that the last worker will actually get one within
the timeout simply because the main thread can win the race to collect
the signal every time. I just managed to make the patched test fail in
one out of 100 runs.

IOW, we cannot test this reliably at all with the current approach.
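
To make the race concrete, here is a minimal userspace sketch, not the
selftest itself. It arms a process-wide timer signal and counts which thread
ends up running the handler. The names sample_distribution(), NWORKERS and
hits[] are hypothetical, and the choice of CLOCK_PROCESS_CPUTIME_ID, SIGALRM
and a 1 ms interval is an assumption made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

#define NWORKERS 3		/* hypothetical worker count */
#define NSLOTS   (NWORKERS + 1)	/* slot 0 is the main thread */

static atomic_ulong hits[NSLOTS];	/* handler runs per thread */
static atomic_int stop_workers, started;
static _Thread_local int my_slot;	/* 0 unless set by worker() */

static void on_alarm(int sig)
{
	(void)sig;
	atomic_fetch_add(&hits[my_slot], 1);
}

static void *worker(void *arg)
{
	my_slot = (int)(long)arg;
	atomic_fetch_add(&started, 1);
	while (!atomic_load(&stop_workers))
		;	/* burn CPU so the thread stays runnable */
	return NULL;
}

static unsigned long total_hits(void)
{
	unsigned long sum = 0;

	for (int i = 0; i < NSLOTS; i++)
		sum += atomic_load(&hits[i]);
	return sum;
}

/*
 * Collect up to 'want' signals or give up after roughly 'timeout_sec'
 * seconds. Fills out[] with per-thread counts and returns the total,
 * or -1 on setup failure.
 */
static long sample_distribution(unsigned long want, int timeout_sec,
				unsigned long out[NSLOTS])
{
	pthread_t tid[NWORKERS];
	struct sigaction sa;
	struct sigevent sev;
	struct itimerspec its;
	struct timespec now, end;
	timer_t timer;
	int i, created, ok = 0;

	for (i = 0; i < NSLOTS; i++)
		atomic_store(&hits[i], 0);
	atomic_store(&stop_workers, 0);
	atomic_store(&started, 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL))
		return -1;

	for (created = 0; created < NWORKERS; created++)
		if (pthread_create(&tid[created], NULL, worker,
				   (void *)(long)(created + 1)))
			goto out;
	while (atomic_load(&started) < NWORKERS)
		;	/* wait until every worker has set its slot */

	/* SIGEV_SIGNAL without a target thread: the signal is process wide. */
	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = SIGALRM;
	if (timer_create(CLOCK_PROCESS_CPUTIME_ID, &sev, &timer))
		goto out;

	memset(&its, 0, sizeof(its));
	its.it_value.tv_nsec = 1000000;	/* 1 ms of process CPU time */
	its.it_interval = its.it_value;
	if (timer_settime(timer, 0, &its, NULL) == 0) {
		clock_gettime(CLOCK_MONOTONIC, &end);
		end.tv_sec += timeout_sec;
		do {	/* the main thread competes for the signal too */
			clock_gettime(CLOCK_MONOTONIC, &now);
		} while (total_hits() < want && now.tv_sec <= end.tv_sec);
		ok = 1;
	}
	timer_delete(timer);
	signal(SIGALRM, SIG_IGN);	/* discard anything still pending */
out:
	atomic_store(&stop_workers, 1);
	for (i = 0; i < created; i++)
		pthread_join(tid[i], NULL);
	for (i = 0; i < NSLOTS; i++)
		out[i] = atomic_load(&hits[i]);
	return ok ? (long)total_hits() : -1;
}
```

Calling sample_distribution(100, 5, counts) and printing counts[] shows how
the signals were split between the main thread (slot 0) and the workers in
that run. The only property the sketch can rely on is that the per-thread
counts add up to the returned total. Nothing guarantees that every slot is
non-zero, which is exactly why a check of the form "each thread got at least
one signal" can fail.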

I'll think about it tomorrow again with brain awake.

Thanks,

        tglx

