[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161202180343.gehqor7lgtmzwqq3@pd.tnic>
Date: Fri, 2 Dec 2016 19:03:43 +0100
From: Borislav Petkov <bp@...nel.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Anvin <hpa@...or.com>,
the arch/x86 maintainers <x86@...nel.org>,
One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
Borislav Petkov <bp@...en8.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Brian Gerst <brgerst@...il.com>,
Matthew Whitehead <tedheadster@...il.com>,
Henrique de Moraes Holschuh <hmh@....eng.br>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core()
implementation
On Fri, Dec 02, 2016 at 09:38:38AM -0800, Andy Lutomirski wrote:
> apply_alternatives, unfortunately. It's performance-critical because
> it's intensely stupid and does sync_core() for every single patch.
> Fixing that would be nice, too.
So I did experiment at the time to batch that sync_core() and interrupts
disabling in text_poke_early() but that amounted to almost nothing.
That's why I didn't bother chasing this further. I can try to dig out
that old thread...
/me goes and searches...
ah, here's something:
SNB:
* before:
[ 0.011707] SMP alternatives: Starting apply_alternatives...
[ 0.011784] SMP alternatives: done.
[ 0.011806] Freeing SMP alternatives: 20k freed
* after:
[ 0.011699] SMP alternatives: Starting apply_alternatives...
[ 0.011721] SMP alternatives: done.
[ 0.011743] Freeing SMP alternatives: 20k freed
-> 63 microseconds speedup
* kvm guest:
* before:
[ 0.017005] SMP alternatives: Starting apply_alternatives...
[ 0.019024] SMP alternatives: done.
[ 0.020119] Freeing SMP alternatives: 20k freed
* after:
[ 0.015008] SMP alternatives: Starting apply_alternatives...
[ 0.016029] SMP alternatives: done.
[ 0.017118] Freeing SMP alternatives: 20k freed
-> ~3 milliseconds speedup
---
I tried something simple like this:
---
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 1850592f4700..f4d5689ea503 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -253,10 +253,13 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
struct alt_instr *end)
{
struct alt_instr *a;
+ unsigned long flags;
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];
DPRINTK("%s: alt table %p -> %p\n", __func__, start, end);
+
+ local_irq_save(flags);
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite a previous scanned alternative code.
@@ -284,8 +287,10 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
add_nops(insnbuf + a->replacementlen,
a->instrlen - a->replacementlen);
- text_poke_early(instr, insnbuf, a->instrlen);
+ memcpy(instr, insnbuf, a->instrlen);
}
+ sync_core();
+ local_irq_restore(flags);
}
#ifdef CONFIG_SMP
---
I could try and redo the measurements again but I doubt it'll be any
different. Unless I haven't made a mistake somewhere...
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Powered by blists - more mailing lists