Message-ID: <20150304090038.GB3233@pd.tnic>
Date: Wed, 4 Mar 2015 10:00:38 +0100
From: Borislav Petkov <bp@...en8.de>
To: Ingo Molnar <mingo@...nel.org>
Cc: X86 ML <x86@...nel.org>, Andy Lutomirski <luto@...capital.net>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2 07/15] x86/lib/copy_user_64.S: Convert to ALTERNATIVE_2
On Wed, Mar 04, 2015 at 07:25:52AM +0100, Ingo Molnar wrote:
> Btw., as a future optimization, wouldn't it be useful to patch this
> function at its first instruction, i.e. to have three fully functional
> copy_user_generic_ variants and choose to jmp to one of them in the
> first instruction of the original function?
Yeah, we already have callsites doing that:
old insn VA: 0xffffffff8100e8b3, CPU feat: X86_FEATURE_REP_GOOD, size: 5
xfpregs_get:
ffffffff8100e8b3: e8 f8 40 2b 00 callq ffffffff812c29b0
repl insn: 0xffffffff81e1c255, size: 5
ffffffff81e1c255: e8 16 68 4a ff callq ffffffff812c2a70
ffffffff8100e8b3: e8 16 68 4a ff callq ffffffff804b50ce
old insn VA: 0xffffffff8100e8b3, CPU feat: X86_FEATURE_ERMS, size: 5
xfpregs_get:
ffffffff8100e8b3: e8 f8 40 2b 00 callq ffffffff812c29b0
repl insn: 0xffffffff81e1c25a, size: 5
ffffffff81e1c25a: e8 51 68 4a ff callq ffffffff812c2ab0
ffffffff8100e8b3: e8 51 68 4a ff callq ffffffff804b5109
That's copy_user_generic() which does the alternative_call_2().
And yes, it is on the TODO list to optimize those other alternatives
sites.
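
For readers decoding the dump above: each CALL is opcode e8 followed by a little-endian signed 32-bit displacement, taken relative to the address of the next instruction. The sketch below is only an illustration in Python, not kernel code, and the helper names call_target() and relocate_call() are made up for it. It reproduces the three targets printed for the REP_GOOD entry and shows why the replacement bytes, read at the old address, point at ffffffff804b50ce. My reading (an assumption, not something the dump states) is that this third line is printed before the displacement is adjusted for the new location.

```python
import struct

MASK64 = (1 << 64) - 1  # keep results in the 64-bit address space


def call_target(addr, insn):
    """Return the target of a 5-byte 'call rel32' located at addr."""
    assert len(insn) == 5 and insn[0] == 0xE8
    (rel32,) = struct.unpack("<i", insn[1:5])  # signed, little-endian
    return (addr + 5 + rel32) & MASK64         # relative to the next insn


def relocate_call(insn, src, dst):
    """Hypothetical helper: re-encode a call copied from src to dst so
    that it still reaches the same absolute target."""
    assert len(insn) == 5 and insn[0] == 0xE8
    (rel32,) = struct.unpack("<i", insn[1:5])
    return bytes([0xE8]) + struct.pack("<i", rel32 + (src - dst))


OLD = 0xFFFFFFFF8100E8B3   # call site in xfpregs_get (from the dump)
REPL = 0xFFFFFFFF81E1C255  # where the REP_GOOD replacement is stored

old_insn = bytes.fromhex("e8f8402b00")
repl_insn = bytes.fromhex("e816684aff")

print(hex(call_target(OLD, old_insn)))    # original callee
print(hex(call_target(REPL, repl_insn)))  # replacement, at its own address
print(hex(call_target(OLD, repl_insn)))   # same bytes, read at the old address

fixed = relocate_call(repl_insn, REPL, OLD)
print(fixed.hex(), hex(call_target(OLD, fixed)))
```

Adding (src - dst) to the displacement is the whole fixup: the absolute target stays put while the instruction moves.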
> The advantage would be two-fold:
>
> 1) right now: smart microarchitectures that are able to optimize
> jump-after-jump (and jump-after-call) targets in their branch
> target cache can do so in this case, reducing the overhead of the
> patching, possibly close to zero in the cached case.
Yeah, it would still be better to simply do CALL and no unconditional
JMPs in there later. But this is future work, one thing at a time :)
> 2) in the future: we could actually do a (limited) re-link of the
> kernel during bootup, and patch up the original copy_to_user call
> sites directly to one of the three variants. Alternatives patching
> done at the symbol level. Does current tooling allow something
> like this already?
Well, I have a patchset which uses relocs to patch vmlinux at build
time. That was the initial approach here, but you cannot know which
features a CPU supports until it boots, so the feature-dependent sites
still have to be patched at boot time.
BUT(!), you can replace stuff like X86_FEATURE_ALWAYS at build time
already (this is the static_cpu_has_safe() stuff). I'll look into that
later and dust off my relocs pile.
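
To make the build-time versus boot-time split concrete, here is a toy model in Python. It is a sketch under stated assumptions, not the kernel's data structures: AltEntry, split_by_phase(), resolve_at_boot() and the variant names are all hypothetical. It assumes entries for one site are walked in order and a later matching entry overrides an earlier one, which is how I understand the two-alternative case to behave (ERMS wins over REP_GOOD when both are set).

```python
from dataclasses import dataclass

ALWAYS = "X86_FEATURE_ALWAYS"


@dataclass(frozen=True)
class AltEntry:
    """Hypothetical stand-in for one alternatives entry."""
    site: str         # label for the patched location
    feature: str      # CPU feature flag guarding the replacement
    replacement: str  # label for the replacement code


def split_by_phase(entries):
    """Entries guarded by ALWAYS need no CPU probing, so they could be
    applied at build time; everything else must wait for boot."""
    build = [e for e in entries if e.feature == ALWAYS]
    boot = [e for e in entries if e.feature != ALWAYS]
    return build, boot


def resolve_at_boot(default, entries, cpu_features):
    """Walk the entries in order; the last one whose feature is present
    wins (assumed ordering, see the note above)."""
    chosen = default
    for e in entries:
        if e.feature in cpu_features:
            chosen = e.replacement
    return chosen


entries = [
    AltEntry("copy_user_generic", "X86_FEATURE_REP_GOOD", "variant_rep_good"),
    AltEntry("copy_user_generic", "X86_FEATURE_ERMS", "variant_erms"),
    AltEntry("some_other_site", ALWAYS, "variant_always"),
]

build, boot = split_by_phase(entries)
print([e.site for e in build], [e.feature for e in boot])
print(resolve_at_boot("variant_default", boot,
                      {"X86_FEATURE_REP_GOOD", "X86_FEATURE_ERMS"}))
```

Only the ALWAYS entry lands in the build-time list; the copy_user_generic entries depend on what the booted CPU reports.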
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.