linux-kernel - Re: Linux 6.11-rc1

Message-ID: <20240731091148.GW33588@noisy.programming.kicks-ass.net>
Date: Wed, 31 Jul 2024 11:11:48 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Guenter Roeck <linux@...ck-us.net>, Jens Axboe <axboe@...nel.dk>,
	Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	Peter Anvin <hpa@...or.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	the arch/x86 maintainers <x86@...nel.org>
Subject: Re: Linux 6.11-rc1

On Wed, Jul 31, 2024 at 10:21:11AM +0200, Borislav Petkov wrote:
> On Tue, Jul 30, 2024 at 04:54:43PM -0700, Linus Torvalds wrote:
> > You also seemed to say that it only happened with some CPU selections.
> > Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
> > looking at that new "nested alternatives macros" thing, and the odd
> > games we play with the origin and replacement lengths etc.
> > 
> > That all looks entirely crazy. That file was hard to read before, now
> > it's just incomprehensible to me.
> 
> I'm sorry to hear that. The reason we did it is because it was starting to
> become really unwieldy to add yet another alternative choice N in an
> ALTERNATIVE_N call...
> 
> Anyway, I'll try to reproduce here. In the meantime, can anyone who can
> reproduce - Guenter, Jens - boot that failing kernel with
> 
>   debug-alternative=-1
> 
> and copy dmesg and vmlinux somewhere for me?
> 
> It is a lot of output so make sure to catch it all.

So what I did instead is add nokaslr to CMDLINE and -S -s to qemu, and I
am staring at the failing kernel in gdb.

So far all the alternatives in the affected paths look just fine.

Not that any of it is making sense, notably:

Code: bf 1e c2 e9 23 06 00 00 66 90 8d 76 00 fc 6a 00 68 f0 bd 1e c2 e9 11 06 00 00 8d 76 00 fc 6a 00 68 54 c5 1e c2 e9 01 06 00 00 <8d> 76 00 fc 68 b0 e9 1e c2 e9 f3 05 00 00 66 90 8d 76 00 fc 6a 00

decodes to:

   0:   bf 1e c2 e9 23          mov    $0x23e9c21e,%edi
   5:   06                      (bad)
   6:   00 00                   add    %al,(%rax)
   8:   66 90                   xchg   %ax,%ax
asm_exc_invalid_op:
   a:   8d 76 00                lea    0x0(%rsi),%esi
   d:   fc                      cld
   e:   6a 00                   push   $0x0
  10:   68 f0 bd 1e c2          push   $0xffffffffc21ebdf0
  15:   e9 11 06 00 00          jmp    0x62b
asm_exc_int3:
  1a:   8d 76 00                lea    0x0(%rsi),%esi
  1d:   fc                      cld
  1e:   6a 00                   push   $0x0
  20:   68 54 c5 1e c2          push   $0xffffffffc21ec554
  25:   e9 01 06 00 00          jmp    0x62b
asm_exc_page_fault:
  2a:*  8d 76 00                lea    0x0(%rsi),%esi           <-- trapping instruction
  2d:   fc                      cld
  2e:   68 b0 e9 1e c2          push   $0xffffffffc21ee9b0
  33:   e9 f3 05 00 00          jmp    0x62b
  38:   66 90                   xchg   %ax,%ax
asm_exc_machine_check:
  3a:   8d 76 00                lea    0x0(%rsi),%esi
  3d:   fc                      cld
  3e:   6a 00                   push   $0x0
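
As a quick cross-check of the decode above, here is a minimal Python sketch
that finds the <..>-marked byte in the Code: line and reports its offset.
The function name trapping_offset is made up for illustration; this is not
the kernel's own decoding script, just the same arithmetic done by hand.

```python
import re

# The Code: line from the oops above, copied verbatim.
CODE_LINE = (
    "bf 1e c2 e9 23 06 00 00 66 90 8d 76 00 fc 6a 00 68 f0 bd 1e c2 e9 11 "
    "06 00 00 8d 76 00 fc 6a 00 68 54 c5 1e c2 e9 01 06 00 00 <8d> 76 00 fc "
    "68 b0 e9 1e c2 e9 f3 05 00 00 66 90 8d 76 00 fc 6a 00"
)


def trapping_offset(code_line):
    """Return (offset, byte value) of the <xx>-marked byte (hypothetical helper)."""
    for offset, token in enumerate(code_line.split()):
        match = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if match:
            return offset, int(match.group(1), 16)
    raise ValueError("no <..> marked byte in Code: line")


if __name__ == "__main__":
    offset, value = trapping_offset(CODE_LINE)
    print(f"marked byte {value:#04x} at offset {offset:#x}")
```

The marked byte lands at the offset the disassembly flags as the trapping
instruction, i.e. the first byte of asm_exc_page_fault.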

And that trapping instruction is the CLAC nop (still a nop in the
faulting kernel image):

(gdb) disassemble asm_exc_page_fault
Dump of assembler code for function asm_exc_page_fault:
   0xc2200350 <+0>:     lea    0x0(%esi),%esi
   0xc2200353 <+3>:     cld
   0xc2200354 <+4>:     push   $0xc21ee9b0
   0xc2200359 <+9>:     jmp    0xc2200951 <handle_exception>
End of assembler dump.
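
The two dumps can be tied together by resolving the jmp by hand. A small
sketch, assuming the usual x86 encoding where e9 is followed by a
little-endian signed 32-bit displacement relative to the end of the
instruction; jmp_rel32_target is a name invented here:

```python
def jmp_rel32_target(insn_addr, insn_bytes):
    """Compute the target of a 5-byte 'e9 rel32' jmp (32-bit address space)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE9:
        raise ValueError("expected a 5-byte e9 rel32 jmp")
    # Displacement is signed, little-endian, relative to the next instruction.
    disp = int.from_bytes(insn_bytes[1:], "little", signed=True)
    return (insn_addr + 5 + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    # jmp bytes at asm_exc_page_fault+9, taken from the Code: line.
    target = jmp_rel32_target(0xC2200359, bytes.fromhex("e9f3050000"))
    print(f"{target:#x}")
```

With the bytes from the Code: line and the address gdb prints, this comes out
at the handle_exception address in the gdb dump, and the three jmps in the
decoded listing all resolve to the same 0x62b, so the bytes in the oops and
the live image agree.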

And then we have the endless stream of:

  asm_exc_int3+0x10/0x10

which really is: asm_exc_page_fault+0x0/0x10, but that cannot be,
because then we'd have #DF much sooner.
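
To illustrate why the same address can print under either name, here is a
rough sketch of the symbol lookup. It assumes that backtrace-style printing
resolves the symbol for address - 1 (as far as I know that is what the
kernel's backtrace symbol formatting does, but treat it as an assumption
here). The asm_exc_int3 start address is derived from the 0x10 spacing in
the decoded listing, not read from gdb, and symbolize is a made-up helper.

```python
import bisect

# (start, name, size); asm_exc_page_fault is from gdb, asm_exc_int3 is
# inferred as 0x10 bytes earlier based on the decoded Code: listing.
SYMBOLS = [
    (0xC2200340, "asm_exc_int3", 0x10),
    (0xC2200350, "asm_exc_page_fault", 0x10),
]


def symbolize(addr, backtrace=False):
    """Format addr as name+off/size; backtrace=True looks up addr - 1."""
    lookup = addr - 1 if backtrace else addr
    starts = [start for start, _, _ in SYMBOLS]
    index = bisect.bisect_right(starts, lookup) - 1
    if index < 0:
        raise ValueError("address below first symbol")
    start, name, size = SYMBOLS[index]
    # The offset is still computed from the real address, not addr - 1.
    return f"{name}+{addr - start:#x}/{size:#x}"


if __name__ == "__main__":
    print(symbolize(0xC2200350))
    print(symbolize(0xC2200350, backtrace=True))
```

Under those assumptions the plain lookup gives the asm_exc_page_fault+0x0
form and the backtrace-style lookup gives the asm_exc_int3+0x10 form for the
very same address.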


The restore_all_switch_stack+0x65/0xe6 thing looks like so in the live
kernel image:

(gdb) disassemble restore_all_switch_stack
Dump of assembler code for function entry_INT80_32:
...
   0xc22008c5 <+353>:   mov    %cr3,%eax
   0xc22008c8 <+356>:   or     $0x1000,%eax
   0xc22008cd <+361>:   mov    %eax,%cr3
   0xc22008d0 <+364>:   mov    %esi,%esi		<--- here
   0xc22008d2 <+366>:   testl  $0x2,0x34(%esp)
   0xc22008da <+374>:   je     0xc22008e8 <entry_INT80_32+388>
   0xc22008dc <+376>:   mov    %cr3,%eax
   0xc22008df <+379>:   test   $0x1000,%eax
   0xc22008e4 <+384>:   jne    0xc22008e8 <entry_INT80_32+388>
   0xc22008e6 <+386>:   ud2
   0xc22008e8 <+388>:   pop    %ebx
...

So that is indeed BUG_IF_WRONG_CR3 and the JMP got patched to a NOP2.
Nothing strange there.


So yeah, no clue still.