[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b099e061-32fc-4111-7836-b7cdcee88455@solarflare.com>
Date: Fri, 15 Feb 2019 17:21:17 +0000
From: Edward Cree <ecree@...arflare.com>
To: Nadav Amit <namit@...are.com>
CC: Josh Poimboeuf <jpoimboe@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH v2 0/4] dynamic indirect call promotion
On 05/02/19 08:50, Nadav Amit wrote:
>> In cases where RCU cannot be used (e.g. because some callees need to RCU
>> synchronise), it might be possible to add a variant that uses
>> synchronize_rcu_tasks() when updating, but this series does not attempt this.
> I wonder why.
Mainly because I have yet to convince myself that it's the Right Thing.
Note also the following (from kernel/rcu/update.c):
/* * This is a very specialized primitive, intended only for a few uses in
* tracing and other situations requiring manipulation of function
* preambles and profiling hooks. The synchronize_rcu_tasks() function
* is not (yet) intended for heavy use from multiple CPUs. */
> This seems like an easy solution, and according to Josh, Steven
> Rostedt and the documentation appears to be valid.
Will it hurt performance, though, if we end up (say) having rcu-tasks-
based synchronisation for updates on every indirect call in the kernel?
(As would result from a plugin-based opt-out approach.)
> As I stated before, I think that the best solution is to use a GCC plugin,
> [...] Such a solution will not enable the calling code to be
> written in C and would require a plugin for each architecture.
I'm afraid I don't see why. If we use the static_calls infrastructure,
but then do a source-level transformation in the compiler plugin to turn
indirect calls into dynamic_calls, it should be possible to create an
opt-out system without any arch-specific code in the plugin (the arch-
specific stuff being all in the static_calls code).
Any reason that can't be done? (Note: I don't know much about GCC
internals, maybe there's something obvious that stops a plugin doing
things like that.)
> Feel free to try my code and give me feedback. I did not get a feedback on my
> last version. Is there a fundamental problem with my plugin? Did you try it
> and got bad results, or perhaps it did not build?
I didn't test your patches yet, because I was busy trying to get mine
working and ready to post (and also with unrelated work). But now that
that's done, next time I have cycles spare for indirect call stuff I
guess testing (and reviewing) your approach will be next on my list.
> Why do you prefer an approach
> which requires annotation of the callers, instead of something that is much
> more transparent?
I'm concerned about the overhead (in both time and memory) of running
learning on every indirect call site (including ones that aren't in a
hot-path, and ones which have such a wide variety of callees that
promotion really doesn't help) throughout the whole kernel. Also, an
annotating programmer knows the locking/rcu context and can thus tell
whether a given dynamic_call should use synchronise_rcu_tasks(),
synchronise_rcu(), or perhaps something else (if e.g. the call always
happens under a mutex, then the updater work could take that mutex).
The real answer, though, is that I don't so much prefer this approach,
as think that both should be tried "publicly" and evaluated by more
developers than just us three. There's a reason this series is
marked RFC ;-)
-Ed
Powered by blists - more mailing lists