Message-ID: <CAMOZA0LxPLW2juGq2kZgrTT31L+rTq2LaAD17omGC++VHDOX2w@mail.gmail.com>
Date:   Tue, 20 Apr 2021 16:40:07 +0200
From:   Luigi Rizzo <lrizzo@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>, axboe@...nel.dk,
        paulmck@...nel.org
Subject: Re: [PATCH] smp: add a best_effort version of smp_call_function_many()

On Tue, Apr 20, 2021 at 3:33 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Apr 20, 2021 at 12:41:08PM +0200, Luigi Rizzo wrote:
> > On Tue, Apr 20, 2021 at 11:14 AM Peter Zijlstra <peterz@...radead.org> wrote:
...
> > My case too requires that the request is eventually handled, but with
> > this non-blocking IPI the caller has a better option than blocking:
> > it can either retry the multicast IPI at a later time if conditions allow,
> > or it can post a dedicated CSD (with the advantage that being my
> > requests idempotent, if the CSD is locked there is no need to retry
> > because it means the handler has not started yet).
> >
> > In fact, if we had the option to use dedicated CSDs for multicast IPI,
> > we wouldn't even need to retry because we'd know that the posted CSD
> > is for our call back and not someone else's.
>
> What are you doing that CSD contention is such a problem?

Basically what I said in a previous email: send a targeted interrupt to a
subset of the CPUs (large enough that the multicast IPI makes sense) so
they can start doing some work that has been posted for them.
Not too different from RFS, in a way.

The sender doesn't need (or want, obviously) to block, but occasional
O(100+us) stalls were clearly visible, and trivial to reproduce in tests
(e.g. when the process on the target CPU runs getrusage() and has
a very large number of threads, even if they are all idle).

Even the _cond() version is not sufficient to avoid the stall:
I could in principle use the callback to skip CPUs for which I
have a request posted and not processed yet, but if the csd
is in use by another pending IPI I have no alternative but to spin.
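
To make the distinction concrete, here is a rough userspace sketch of the
"try, don't spin" behaviour I am after. It is not the patch and not kernel
code: csd_sketch, csd_try_post(), csd_try_post_mask() and csd_run() are
hypothetical names, the per-CPU slots are a plain array, and C11 atomics
stand in for the csd lock flag. I am assuming the target releases the slot
before running the handler, so a locked slot means the handler has not
started yet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, just for the model */

/* Hypothetical stand-in for one per-CPU call_single_data slot. */
struct csd_sketch {
	atomic_bool locked;	/* true while a request is posted, not yet picked up */
	void (*func)(void *);
	void *info;
};

static struct csd_sketch csd_slots[NR_CPUS_SKETCH];

/* Sender side: take the slot if it is free, otherwise report busy. Never spins. */
static bool csd_try_post(unsigned int cpu, void (*func)(void *), void *info)
{
	struct csd_sketch *csd;

	assert(cpu < NR_CPUS_SKETCH);
	csd = &csd_slots[cpu];
	if (atomic_exchange(&csd->locked, true))
		return false;	/* slot in use by a pending request */
	csd->func = func;
	csd->info = info;
	return true;
}

/*
 * Best-effort multicast: try every CPU in mask and return the subset that
 * was busy, so the caller can retry later or fall back to something else.
 * Bits at or above NR_CPUS_SKETCH are ignored.
 */
static unsigned long csd_try_post_mask(unsigned long mask,
				       void (*func)(void *), void *info)
{
	unsigned long busy = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(mask & (1UL << cpu)))
			continue;
		if (!csd_try_post(cpu, func, info))
			busy |= 1UL << cpu;
	}
	return busy;
}

/* Target side: pick up a pending request, free the slot, then run the handler. */
static bool csd_run(unsigned int cpu)
{
	struct csd_sketch *csd;
	void (*func)(void *);
	void *info;

	assert(cpu < NR_CPUS_SKETCH);
	csd = &csd_slots[cpu];
	if (!atomic_load(&csd->locked))
		return false;	/* nothing pending */
	func = csd->func;
	info = csd->info;
	atomic_store(&csd->locked, false);	/* slot reusable from here on */
	func(info);
	return true;
}

/* Example handler: counts how many times it ran. */
static void count_hit(void *info)
{
	++*(int *)info;
}
```

With something like this the sender gets back the set of CPUs it could not
reach and decides what to do about them, instead of waiting in the IPI path.
If the busy slot is known to hold the sender's own idempotent request, no
retry is needed at all.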

cheers
luigi