linux-kernel - Re: [GIT PULL] x86/asm changes for v5.6

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200129183404.GB30979@zn.tnic>
Date:   Wed, 29 Jan 2020 19:34:04 +0100
From:   Borislav Petkov <bp@...e.de>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     "Luck, Tony" <tony.luck@...el.com>, Ingo Molnar <mingo@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] x86/asm changes for v5.6

On Wed, Jan 29, 2020 at 09:40:58AM -0800, Linus Torvalds wrote:
> On Wed, Jan 29, 2020 at 9:07 AM Luck, Tony <tony.luck@...el.com> wrote:
> >
> > This returns "3" ... not what we want. But swapping the ERMS/FSRM order
> > gets the correct version.
> 
> That actually makes sense, and is what I suspected (after I wrote the
> patch) what would happen. It would just be good to have it explicitly
> documented at the macro.

Like this?

---
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 13adca37c99a..d94bad03bcb4 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -164,6 +164,11 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\
 	".popsection\n"
 
+/*
+ * The patching happens in the natural order of first newinstr1 and then
+ * newinstr2, iff the respective feature bits are set. See apply_alternatives()
+ * for details.
+ */
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\
 	OLDINSTR_2(oldinstr, 1, 2)					\
 	".pushsection .altinstructions,\"a\"\n"				\

> That would be bad indeed, but I don't think it should matter
> particularly for this case - it would have been bad before too.
> 
> I suspect there is some other issue going on, like the NOP padding
> logic being confused.

Or the cmp $0x20 test missing in the default case, see below.

> In particular, look here, this is the alt_instruction entries:
> 
>    altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b
>    altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b
> 
> where the arguments are "orig alt feature orig_len alt_len pad_len" in
> that order.
> 
> Notice how "pad_len" in both cases is the padding from the _original_
> instruction (142b-141b).

<snip this which I'll take a look later so that we can sort out the
issue at hand first>

> So I'm just hand-waving. Maybe there was some simpler explanation
> (like me just picking the wrong instructions when I did the rough
> conversion and simply breaking things with some stupid bug).

Looks like it. So I did this:

---
diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 7ff00ea64e4f..a670d01570df 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -39,23 +39,19 @@ SYM_FUNC_START(__memmove)
 	cmp %rdi, %r8
 	jg 2f
 
-	/* FSRM implies ERMS => no length checks, do the copy directly */
-.Lmemmove_begin_forward:
-	ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
-	ALTERNATIVE "", "movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS
-
 	/*
-	 * movsq instruction have many startup latency
-	 * so we handle small size by general register.
-	 */
-	cmp  $680, %rdx
-	jb	3f
-	/*
-	 * movsq instruction is only good for aligned case.
+	 * Three rep-string alternatives:
+	 *  - go to "movsq" for large regions where source and dest are
+	 *    mutually aligned (same in low 8 bits). "label 4"
+	 *  - plain rep-movsb for FSRM
+	 *  - rep-movs for > 32 byte for ERMS.
 	 */
+.Lmemmove_begin_forward:
+	ALTERNATIVE_2 \
+		"cmp $0x20, %rdx; jb 1f; cmp $680, %rdx ; jb 3f ; cmpb %dil, %sil; je 4f", \
+		"cmp $0x20, %rdx; jb 1f; movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS, \
+		"movq %rdx, %rcx ; rep movsb; retq", X86_FEATURE_FSRM
 
-	cmpb %dil, %sil
-	je 4f
 3:
 	sub $0x20, %rdx
 	/*

---

Notice how the *first* option of the alternative, which is the default
one, has that gotten that additional "cmp $0x20, %rdx; jb 1f" test which
sends us down to the less than 32 bytes copy length.

Your original version didn't have it and here's what I saw:

So I stopped the guest just before that code and the call trace looked
like this:

#0  memmove () at arch/x86/lib/memmove_64.S:43
#1  0xffffffff824448c2 in memblock_insert_region (type=0xffffffff824a8298 <memblock+56>, idx=<optimized out>, base=0, 
    size=4096, nid=2, flags=MEMBLOCK_NONE) at mm/memblock.c:553
#2  0xffffffff824454f0 in memblock_add_range (type=0xffffffff824a8298 <memblock+56>, base=0, size=<optimized out>, 
    nid=73400320, flags=<optimized out>) at mm/memblock.c:641
#3  0xffffffff82445627 in memblock_reserve (base=0, size=4096) at mm/memblock.c:830
#4  0xffffffff823ff399 in setup_arch (cmdline_p=0xffffffff82003f28) at arch/x86/kernel/setup.c:798
#5  0xffffffff823f9ae1 in start_kernel () at init/main.c:598
#6  0xffffffff810000d4 in secondary_startup_64 () at arch/x86/kernel/head_64.S:242
#7  0x0000000000000000 in ?? ()

and count in rdx was:

rdx            0x18     24

Without that "cmp $0x20" test above, we do the "cmp $680, %rdx; jb 3f;" test
and we run into the following asm at label 3:

3:
        sub $0x20, %rdx
        /*
         * We gobble 32 bytes forward in each loop.
         */

<--- right here %rdx is:

rdx            0xfffffffffffffff8       -8

and yeeehaaw, we're in the weeds and then end up triplefaulting at some
unmapped source address in %rsi or so.

So now I'm going to play all three variants with pen and paper to make
sure we're still sane.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg