Date:   Thu, 2 Feb 2023 08:36:34 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Marco Elver <elver@...gle.com>, oleg@...hat.com,
        linux-kernel@...r.kernel.org,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH v4] posix-timers: Prefer delivery of signals to the
 current thread

On Fri, 27 Jan 2023 at 07:58, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> On Thu, 26 Jan 2023 at 20:57, Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > On Thu, Jan 26 2023 at 18:51, Marco Elver wrote:
> > > On Thu, 26 Jan 2023 at 16:41, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > >>
> > >> Prefer to deliver signals to the current thread if SIGEV_THREAD_ID
> > >> is not set. We used to prefer the main thread, but delivering
> > >> to the current thread is faster, allows sampling the actual thread
> > >> activity for CLOCK_PROCESS_CPUTIME_ID timers, and does not change
> > >> the semantics (since we queue into shared_pending, any thread may
> > >> receive the signal in either case).
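
(For readers following the thread, here is a minimal userspace sketch of
the configuration the commit message is about: a process-wide CPU-time
timer armed with SIGEV_SIGNAL rather than SIGEV_THREAD_ID, so each expiry
is queued to shared_pending and can be handled by any thread. The signal
number and interval are illustrative, not part of the patch.)

/*
 * Minimal sketch of the case the patch affects: a process-wide
 * CPU-time timer armed with SIGEV_SIGNAL (SIGEV_THREAD_ID not set),
 * so each expiration is queued to shared_pending and handled by
 * whichever thread the kernel picks. Signal number and interval are
 * illustrative; may need -lrt depending on the libc.
 */
#include <signal.h>
#include <time.h>

static timer_t sampling_timer;

static int setup_sampling_timer(void)
{
	struct sigevent sev = {
		.sigev_notify = SIGEV_SIGNAL,	/* not SIGEV_THREAD_ID */
		.sigev_signo  = SIGPROF,
	};
	struct itimerspec its = {
		.it_value    = { .tv_sec = 1 },	/* first expiry */
		.it_interval = { .tv_sec = 1 },	/* then every second */
	};

	if (timer_create(CLOCK_PROCESS_CPUTIME_ID, &sev, &sampling_timer))
		return -1;
	return timer_settime(sampling_timer, 0, &its, NULL);
}
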
> > >
> > > Reviewed-by: Marco Elver <elver@...gle.com>
> > >
> > > Nice - and given the test, hopefully this behaviour won't regress in the future.
> >
> > The test does not tell much. It just waits until each thread has
> > received a signal once, which can take quite a while. It says nothing
> > about the distribution of the signals, which can be randomly skewed
> > towards a few threads.
> >
> > Also, for real-world use cases where you have multiple threads with
> > different periods and runtimes per period, I have a hard time
> > understanding how that signal measures anything useful.
> >
> > The most time-consuming thread might actually trigger rarely, while
> > other short threads end up being the ones which cause the timer to fire.
> >
> > What's the usefulness of this information?
> >
> > Thanks,
> >
> >         tglx
>
> Hi Thomas,
>
> Our goal is to sample what threads are doing in production with low
> frequency and low overhead. We did not find any reasonable existing
> way of doing it on Linux today, as outlined in the RFC version of the
> patch (other solutions are either much more complex and/or incur
> higher memory and/or CPU overheads):
> https://lore.kernel.org/all/20221216171807.760147-1-dvyukov@google.com/
>
> This sampling does not need to be as precise as CPU profilers would
> require; high precision generally requires more complexity and
> overhead. The emphasis is on use in production and low overhead.
> Since we sample with an O(seconds) interval, some activities can
> still go unsampled whatever we do here (if they take less than a
> second). But on the other hand, the intention is to use this over
> billions of CPU hours, so on a global scale we still observe
> more-or-less everything.
>
> Currently, practically all signals are delivered to the main thread,
> and the added test does not pass (at least I couldn't wait long enough).
> After this change the test passes quickly (within a second for me).
> Testing the actual distribution without flaky failures is very hard in
> unit tests. After rounds of complaints and deflaking, such tests usually
> transform into roughly what this test does -- checking that all threads
> get at least something.
> If we want to test ultimate fairness, we would need to start with the
> scheduler itself. If threads don't get fair fractions, then signals
> won't be evenly distributed either. I am not sure if there are unit
> tests for the scheduler that ensure this in all configurations (e.g.
> uneven ratio of runnable threads to CPUs, running in VMs, etc).
> I agree this test is not perfect, but as I said, it does not pass now.
> So it is useful and will detect a future regression in this logic. It
> ensures that running threads eventually get signals.
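
To make the shape of that check concrete, a rough userspace sketch (not
the actual selftest; the thread count, signal number, and flag mechanism
are illustrative) could look like this:

/*
 * N busy threads, a SIGPROF handler that marks the thread it ran on,
 * and a main loop that waits until every thread has been hit at least
 * once.
 */
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <unistd.h>

#define NTHREADS 8

static atomic_int got_signal[NTHREADS];
static _Thread_local atomic_int *my_flag;	/* set by each worker */

static void handler(int sig)
{
	if (my_flag)				/* main thread has no slot */
		atomic_store(my_flag, 1);
}

static void *spin(void *arg)
{
	my_flag = &got_signal[(long)arg];
	for (;;)
		;				/* burn CPU time */
}

int main(void)
{
	pthread_t th[NTHREADS];
	long i;

	signal(SIGPROF, handler);
	/* create and arm a CLOCK_PROCESS_CPUTIME_ID timer here,
	 * e.g. as in the setup_sampling_timer() sketch earlier */

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&th[i], NULL, spin, (void *)i);

	for (;;) {
		int all = 1;

		for (i = 0; i < NTHREADS; i++)
			all &= atomic_load(&got_signal[i]);
		if (all)
			return 0;		/* every thread was sampled */
		usleep(10000);
	}
}
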
>
> But regardless of our motivation, this change looks like an
> improvement in general. Consider performance alone: why would we wake
> another thread, potentially send an IPI and evict caches? Sending the
> signal to the thread that overflowed the counter also looks
> reasonable. For some programs it may actually give a good picture.
> Say thread A runs for a prolonged time, and then thread B runs. The
> program will first get signals in thread A and then in thread B
> (instead of getting them on an unrelated thread).



Hi Thomas,

Has this answered your question? Do you have any other concerns?
If not, please take this into some tree for upstreaming.

Thanks
