[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+a1RRx-NK1H-iyuqwEs1kHfUsQBHRU7OsK7zHPmjVHSzw@mail.gmail.com>
Date: Fri, 5 Apr 2024 06:28:02 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, John Stultz <jstultz@...gle.com>,
Marco Elver <elver@...gle.com>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, kasan-dev@...glegroups.com,
Edward Liaw <edliaw@...gle.com>, Carlos Llamas <cmllamas@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH v6 1/2] posix-timers: Prefer delivery of signals to the
current thread
On Thu, 4 Apr 2024 at 15:45, Oleg Nesterov <oleg@...hat.com> wrote:
>
> Perhaps I am totally confused, but.
>
> On 04/04, Dmitry Vyukov wrote:
> >
> > On Wed, 3 Apr 2024 at 17:43, Thomas Gleixner <tglx@...utronix.de> wrote:
> > >
> > > > Why distribution_thread() can't simply exit if got_signal != 0 ?
> > > >
> > > > See https://lore.kernel.org/all/20230128195641.GA14906@redhat.com/
> > >
> > > Indeed. It's too obvious :)
> >
> > This test models the intended use-case that was the motivation for the change:
> > We want to sample execution of a running multi-threaded program, it
> > has multiple active threads (that don't exit), since all threads are
> > running and consuming CPU,
>
> Yes,
>
> > they all should get a signal eventually.
>
> Well, yes and no.
>
> No, in a sense that the motivation was not to ensure that all threads
> get a signal, the motivation was to ensure that cpu_timer_fire() paths
> will use the current task as the default target for signal_wake_up/etc.
> This is just optimization.
>
> But yes, all should get a signal eventually. And this will happen with
> or without the commit bcb7ee79029dca ("posix-timers: Prefer delivery of
> signals to the current thread"). Any thread can dequeue a shared signal,
> say, on return from interrupt.
>
> Just without that commit this "eventually" means A_LOT_OF_TIME statistically.
I agree that any thread can pick the signal, but this A_LOT_OF_TIME
makes it impossible for the test to reliably repeatedly pass w/o the
change in any reasonable testing system.
With the change the test was finishing/passing for me immediately all the time.
Again, if the test causes practical problems (flaky), then I don't
mind relaxing it (flaky tests suck). I was just against giving up on
testing proactively just in case.
> > If threads will exit once they get a signal,
>
> just in case, the main thread should not exit ...
>
> > then the test will pass
> > even if signal delivery is biased towards a single running thread all
> > the time (the previous kernel impl).
>
> See above.
>
> But yes, I agree, if thread exits once it get a signal, then A_LOT_OF_TIME
> will be significantly decreased. But again, this is just statistical issue,
> I do not see how can we test the commit bcb7ee79029dca reliably.
>
> OTOH. If the threads do not exit after they get signal, then _in theory_
> nothing can guarantee that this test-case will ever complete even with
> that commit. It is possible that one of the threads will "never" have a
> chance to run cpu_timer_fire().
>
> In short, I leave this to you and Thomas. I have no idea how to write a
> "good" test for that commit.
>
> Well... perhaps the main thread should just sleep in pause(), and
> distribution_handler() should check that gettid() != getpid() ?
> Something like this maybe... We need to ensure that the main thread
> enters pause before timer_settime().
>
> Oleg.
>
Powered by blists - more mailing lists