lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161202180343.gehqor7lgtmzwqq3@pd.tnic>
Date:   Fri, 2 Dec 2016 19:03:43 +0100
From:   Borislav Petkov <bp@...nel.org>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Anvin <hpa@...or.com>,
        the arch/x86 maintainers <x86@...nel.org>,
        One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
        Borislav Petkov <bp@...en8.de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Brian Gerst <brgerst@...il.com>,
        Matthew Whitehead <tedheadster@...il.com>,
        Henrique de Moraes Holschuh <hmh@....eng.br>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core()
 implementation

On Fri, Dec 02, 2016 at 09:38:38AM -0800, Andy Lutomirski wrote:
> apply_alternatives, unfortunately.  It's performance-critical because
> it's intensely stupid and does sync_core() for every single patch.
> Fixing that would be nice, too.

So I did experiment at the time to batch that sync_core() and interrupts
disabling in text_poke_early() but that amounted to almost nothing.
That's why I didn't bother chasing this further. I can try to dig out
that old thread...

/me goes and searches...

ah, here's something:

SNB:
* before:
[    0.011707] SMP alternatives: Starting apply_alternatives...
[    0.011784] SMP alternatives: done.
[    0.011806] Freeing SMP alternatives: 20k freed

* after:
[    0.011699] SMP alternatives: Starting apply_alternatives...
[    0.011721] SMP alternatives: done.
[    0.011743] Freeing SMP alternatives: 20k freed

-> 63 microseconds speedup


* kvm guest:
* before:
[    0.017005] SMP alternatives: Starting apply_alternatives...
[    0.019024] SMP alternatives: done.
[    0.020119] Freeing SMP alternatives: 20k freed

* after:
[    0.015008] SMP alternatives: Starting apply_alternatives...
[    0.016029] SMP alternatives: done.
[    0.017118] Freeing SMP alternatives: 20k freed

-> ~3 milliseconds speedup

---

I tried something simple like this:

---
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 1850592f4700..f4d5689ea503 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -253,10 +253,13 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
                                         struct alt_instr *end)
 {
        struct alt_instr *a;
+       unsigned long flags;
        u8 *instr, *replacement;
        u8 insnbuf[MAX_PATCH_LEN];

        DPRINTK("%s: alt table %p -> %p\n", __func__, start, end);
+
+       local_irq_save(flags);
        /*
         * The scan order should be from start to end. A later scanned
         * alternative code can overwrite a previous scanned alternative code.
@@ -284,8 +287,10 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
                add_nops(insnbuf + a->replacementlen,
                         a->instrlen - a->replacementlen);

-               text_poke_early(instr, insnbuf, a->instrlen);
+               memcpy(instr, insnbuf, a->instrlen);
        }
+       sync_core();
+       local_irq_restore(flags);
 }

 #ifdef CONFIG_SMP
---

I could try and redo the measurements again but I doubt it'll be any
different. Unless I haven't made a mistake somewhere...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ