linux-kernel - [PATCH] x86/paravirt: Match paravirt patchlet field definition ordering to initialization ordering

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190425081012.GA115378@gmail.com>
Date:   Thu, 25 Apr 2019 10:10:12 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Juergen Gross <jgross@...e.com>,
        Andi Kleen <ak@...ux.intel.com>
Subject: [PATCH] x86/paravirt: Match paravirt patchlet field definition
 ordering to initialization ordering


* Thomas Gleixner <tglx@...utronix.de> wrote:

> On Thu, 25 Apr 2019, Ingo Molnar wrote:
> > > +# else
> > > +	.irq_restore_fl		= { 0x50, 0x9d },	// push %eax; popf
> > > +	.mmu_write_cr3		= { 0x0f, 0x22, 0xd8 },	// mov %eax, %cr3
> > > +	.cpu_iret		= { 0xcf },		// iret
> > > +# endif
> > 
> > I think these open-coded hexa versions are somewhat fragile as well, how 
> > about putting these into a .S file and controlling the sections in an LTO 
> > safe manner there?
> > 
> > That will also allow us to write proper asm, and global labels can be 
> > used to extract the patchlets and their length?
> 
> We are not changing these any other day and I really don't see a reason to
> have these things global just because.

Firstly, I'm not sure I understand the objection to using symbols: in an 
average distro kernel we have more than 75,000 symbols - as long as the 
namespace is clear they are there to be used, there's nothing wrong with 
them per se.

Secondly, my main point is that putting these code patching sequences 
into the .S it's easy to verify that the instructions and patchlets are 
what we claim them to be.

Right now they are randomly ordered data tables that don't match to the 
source code.

I know, because I tried. :-)

Here's the objdump -D output of the PATCH_XXL data table:

0000000000000010 <patch_data_xxl>:
  10:   fa                      cli    
  11:   fb                      sti    
  12:   57                      push   %rdi
  13:   9d                      popfq  
  14:   9c                      pushfq 
  15:   58                      pop    %rax
  16:   0f 20 d0                mov    %cr2,%rax
  19:   0f 20 d8                mov    %cr3,%rax
  1c:   0f 22 df                mov    %rdi,%cr3
  1f:   0f 09                   wbinvd 
  21:   0f 01 f8                swapgs 
  24:   48 0f 07                sysretq 
  27:   0f 01 f8                swapgs 
  2a:   48 89 f8                mov    %rdi,%rax

Note how this doesn't match up to the source code:

static const struct patch_xxl patch_data_xxl = {
        .irq_irq_disable        = { 0xfa },             // cli
        .irq_irq_enable         = { 0xfb },             // sti
        .irq_save_fl            = { 0x9c, 0x58 },       // pushf; pop %[re]ax
        .mmu_read_cr2           = { 0x0f, 0x20, 0xd0 }, // mov %cr2, %[re]ax
        .mmu_read_cr3           = { 0x0f, 0x20, 0xd8 }, // mov %cr3, %[re]ax
# ifdef CONFIG_X86_64
        .irq_restore_fl         = { 0x57, 0x9d },       // push %rdi; popfq
        .mmu_write_cr3          = { 0x0f, 0x22, 0xdf }, // mov %rdi, %cr3
        .cpu_wbinvd             = { 0x0f, 0x09 },       // wbinvd
        .cpu_usergs_sysret64    = { 0x0f, 0x01, 0xf8,
                                    0x48, 0x0f, 0x07 }, // swapgs; sysretq
        .cpu_swapgs             = { 0x0f, 0x01, 0xf8 }, // swapgs
        .mov64                  = { 0x48, 0x89, 0xf8 }, // mov %rdi, %rax
# else
        .irq_restore_fl         = { 0x50, 0x9d },       // push %eax; popf
        .mmu_write_cr3          = { 0x0f, 0x22, 0xd8 }, // mov %eax, %cr3
        .cpu_iret               = { 0xcf },             // iret
# endif
};

Note how they are reordered: in the generated code .irq_restore_fl comes 
before .irq_save_fl, etc. This is because the field ordering in struct 
patch_xxl does not match the initialization ordering of patch_data_xxl.

Third, beyond readability there's another advantage of my suggested 
approach as well: for example that way we could verify the passed in 
length with the patchlet length. Right now it's completely unverified:

        case PARAVIRT_PATCH(ops.m):                                     \
                return PATCH(data, ops##_##m, ibuf, len)

right now we don't check whether the 'len' passed in by the usage site 
matches the actual structure field length.

Although maybe we could do that with your C space structure as well.

Anyway, no strong feelings and I didn't want to create extra work for you 
- but I think at minimum we should do the patch below, which matches up 
the initialization order with the definition order - this at least makes 
the disassembly reviewable:

0000000000000010 <patch_data_xxl>:
  10:   fa                      cli    
  11:   fb                      sti    
  12:   9c                      pushfq 
  13:   58                      pop    %rax
  14:   0f 20 d0                mov    %cr2,%rax
  17:   0f 20 d8                mov    %cr3,%rax
  1a:   0f 22 df                mov    %rdi,%cr3
  1d:   57                      push   %rdi
  1e:   9d                      popfq  
  1f:   0f 09                   wbinvd 
  21:   0f 01 f8                swapgs 
  24:   48 0f 07                sysretq 
  27:   0f 01 f8                swapgs 
  2a:   48 89 f8                mov    %rdi,%rax

And yes, with that applied it verifies 100%. :-)

Thanks,

	Ingo

Signed-off-by: Ingo Molnar <mingo@...nel.org>

--- tip.orig/arch/x86/kernel/paravirt_patch.c
+++ tip/arch/x86/kernel/paravirt_patch.c
@@ -21,11 +21,11 @@
 struct patch_xxl {
 	const unsigned char	irq_irq_disable[1];
 	const unsigned char	irq_irq_enable[1];
-	const unsigned char	irq_restore_fl[2];
 	const unsigned char	irq_save_fl[2];
 	const unsigned char	mmu_read_cr2[3];
 	const unsigned char	mmu_read_cr3[3];
 	const unsigned char	mmu_write_cr3[3];
+	const unsigned char	irq_restore_fl[2];
 # ifdef CONFIG_X86_64
 	const unsigned char	cpu_wbinvd[2];
 	const unsigned char	cpu_usergs_sysret64[6];
@@ -43,16 +43,16 @@ static const struct patch_xxl patch_data
 	.mmu_read_cr2		= { 0x0f, 0x20, 0xd0 },	// mov %cr2, %[re]ax
 	.mmu_read_cr3		= { 0x0f, 0x20, 0xd8 },	// mov %cr3, %[re]ax
 # ifdef CONFIG_X86_64
-	.irq_restore_fl		= { 0x57, 0x9d },	// push %rdi; popfq
 	.mmu_write_cr3		= { 0x0f, 0x22, 0xdf },	// mov %rdi, %cr3
+	.irq_restore_fl		= { 0x57, 0x9d },	// push %rdi; popfq
 	.cpu_wbinvd		= { 0x0f, 0x09 },	// wbinvd
 	.cpu_usergs_sysret64	= { 0x0f, 0x01, 0xf8,
 				    0x48, 0x0f, 0x07 },	// swapgs; sysretq
 	.cpu_swapgs		= { 0x0f, 0x01, 0xf8 },	// swapgs
 	.mov64			= { 0x48, 0x89, 0xf8 },	// mov %rdi, %rax
 # else
-	.irq_restore_fl		= { 0x50, 0x9d },	// push %eax; popf
 	.mmu_write_cr3		= { 0x0f, 0x22, 0xd8 },	// mov %eax, %cr3
+	.irq_restore_fl		= { 0x50, 0x9d },	// push %eax; popf
 	.cpu_iret		= { 0xcf },		// iret
 # endif
 };