linux-kernel - Re: [RFC PATCH v2 0/4] dynamic indirect call promotion

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b099e061-32fc-4111-7836-b7cdcee88455@solarflare.com>
Date:   Fri, 15 Feb 2019 17:21:17 +0000
From:   Edward Cree <ecree@...arflare.com>
To:     Nadav Amit <namit@...are.com>
CC:     Josh Poimboeuf <jpoimboe@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH v2 0/4] dynamic indirect call promotion

On 05/02/19 08:50, Nadav Amit wrote:
>> In cases where RCU cannot be used (e.g. because some callees need to RCU
>> synchronise), it might be possible to add a variant that uses
>> synchronize_rcu_tasks() when updating, but this series does not attempt this.
> I wonder why.
Mainly because I have yet to convince myself that it's the Right Thing.
Note also the following (from kernel/rcu/update.c):

/* * This is a very specialized primitive, intended only for a few uses in
* tracing and other situations requiring manipulation of function
* preambles and profiling hooks. The synchronize_rcu_tasks() function
* is not (yet) intended for heavy use from multiple CPUs.  */

> This seems like an easy solution, and according to Josh, Steven
> Rostedt and the documentation appears to be valid.
Will it hurt performance, though, if we end up (say) having rcu-tasks-
 based synchronisation for updates on every indirect call in the kernel?
(As would result from a plugin-based opt-out approach.)

> As I stated before, I think that the best solution is to use a GCC plugin,
> [...] Such a solution will not enable the calling code to be
> written in C and would require a plugin for each architecture.
I'm afraid I don't see why.  If we use the static_calls infrastructure,
 but then do a source-level transformation in the compiler plugin to turn
 indirect calls into dynamic_calls, it should be possible to create an
 opt-out system without any arch-specific code in the plugin (the arch-
 specific stuff being all in the static_calls code).
Any reason that can't be done?  (Note: I don't know much about GCC
 internals, maybe there's something obvious that stops a plugin doing
 things like that.)

> Feel free to try my code and give me feedback. I did not get a feedback on my
> last version. Is there a fundamental problem with my plugin? Did you try it 
> and got bad results, or perhaps it did not build?
I didn't test your patches yet, because I was busy trying to get mine
 working and ready to post (and also with unrelated work).  But now that
 that's done, next time I have cycles spare for indirect call stuff I
 guess testing (and reviewing) your approach will be next on my list.

> Why do you prefer an approach
> which requires annotation of the callers, instead of something that is much
> more transparent?
I'm concerned about the overhead (in both time and memory) of running
 learning on every indirect call site (including ones that aren't in a
 hot-path, and ones which have such a wide variety of callees that
 promotion really doesn't help) throughout the whole kernel.  Also, an
 annotating programmer knows the locking/rcu context and can thus tell
 whether a given dynamic_call should use synchronise_rcu_tasks(),
 synchronise_rcu(), or perhaps something else (if e.g. the call always
 happens under a mutex, then the updater work could take that mutex).

The real answer, though, is that I don't so much prefer this approach,
 as think that both should be tried "publicly" and evaluated by more
 developers than just us three.  There's a reason this series is
 marked RFC ;-)


-Ed