Message-ID:
 <SN6PR02MB4157C677FDAD6507B443A8C7D40F2@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Wed, 17 Apr 2024 22:34:40 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Michael Schierl <schierlm@....de>, Jean DELVARE <jdelvare@...e.com>, "K.
 Y. Srinivasan" <kys@...rosoft.com>, Haiyang Zhang <haiyangz@...rosoft.com>,
	Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>
CC: "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: Early kernel panic in dmi_decode when running 32-bit kernel on
 Hyper-V on Windows 11

From: Michael Schierl <schierlm@....de> Sent: Wednesday, April 17, 2024 2:08 PM
> 
> > Don't let the type 10 distract you. It is entirely possible that the
> > byte corresponding to type == 10 is already part of the corrupted
> > memory area. Can you check if the DMI table generated by Hyper-V is
> > supposed to contain type 10 records at all?
> 
> How? Hyper-V is not open source :-)

I think that request from Jean is targeted to me or the Microsoft
people on the thread.  :-)

> 
> My best guess to get Linux out of the equation would be to boot my
> trusted MS-DOS 6.2 floppy and use debug.com to dump the DMI:
> 
> > | A:\>debug
> > | -df000:93d0 [to inspect]
> > | -nfromdos.dmi
> > | -rcx
> > | CX 0000
> > | :439B
> > | -w f000:93d0
> > | -q
> 
> 
> The result is byte-for-byte identical to the DMI dump I created from
> sysfs and pasted earlier in this thread. Of course, it does not have to
> be identical to the memory situation while it was parsed.

I've been looking at the details of the DMI blob in a Linux VM on my
local Windows 11 laptop, as well as in a Generation 1 VM in the Azure
public cloud, which uses Hyper-V.  The overall size and layout
of the DMI blob appear to be the same in both cases.  The blob is
corrupted in the VM on the local laptop, but good in the Azure VM.

I was wondering how to check if the Linux bootloaders and grub
were somehow corrupting the DMI blob, but now you've
answered the question by running MS-DOS and dumping the
contents.  Excellent experiment!

I still want to understand why 32-bit Linux is taking an oops during
boot while 64-bit Linux does not.  During boot, I can see that 64-bit
Linux wanders through the corrupted part of the DMI blob and
looks at a lot of bogus entries before it gets back on track again.
But the bogus entries don't cause an oops.  Once I figure out
those details, we still have the corrupted DMI blob, and based on
your MS-DOS experiment, it's looking like Hyper-V created the
corrupted form.   I want to think more about how to debug that.
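
To illustrate why a parser can wander through bogus entries and then
get back on track, here is a minimal Python sketch of walking a raw
DMI table.  It assumes the standard SMBIOS structure layout (a 4-byte
header of type, length and handle, then the formatted area, then a
string set ending in a double NUL).  The name walk_dmi and the
"length < 4" guard are my own for illustration; this is not the
kernel's dmi_decode code.

```python
import struct

END_OF_TABLE = 127  # SMBIOS "end of table" structure type


def walk_dmi(blob, max_entries=1000):
    """Yield (offset, type, length, handle) for each structure in blob.

    Illustrative sketch only: it trusts the length byte of every header,
    so a corrupted header sends it into arbitrary bytes until the next
    double NUL happens to line up with a real structure boundary.
    """
    pos = 0
    count = 0
    while pos + 4 <= len(blob) and count < max_entries:
        stype, length, handle = struct.unpack_from("<BBH", blob, pos)
        if length < 4:
            # A structure cannot be shorter than its own header
            # (guard chosen for this sketch).
            break
        yield pos, stype, length, handle
        count += 1
        # Skip the formatted area, then look for the double NUL that
        # terminates the string set.
        end = blob.find(b"\x00\x00", pos + length)
        if end < 0:
            break
        pos = end + 2
        if stype == END_OF_TABLE:
            break
```

Running something like this over the dump from sysfs or from DOS
(for example the fromdos.dmi file) and comparing the list of offsets
and types against a known-good table should show where the walk
first diverges.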

FWIW, in comparing the Azure VM with my local VM, it looks like
the corrupted entry is the first type 4 entry describing a CPU.

Michael Kelley

> 
> > You should also check the memory map (as displayed early at boot, so
> > near the top of dmesg) and verify that the DMI table is located in a
> > "reserved" memory area, so that area can't be used for memory
> > allocation.
> 
> The e820 memory map was included in the early printk output I posted
> earlier:
> 
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffeffff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000007fff0000-0x000000007fffefff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x000000007ffff000-0x000000007fffffff] ACPI NVS
> 
> And from the dmidecode I pasted earlier:
> 
> > Table at 0x000F93D0.
> 
> The size is 0x0000439B, so the last byte should be at 0x000FD76A, well
> inside the third e820 entry (the second reserved one) - and accessible
> even from DOS without requiring any extra effort.
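
The arithmetic above can be double-checked with a short Python
sketch.  The E820 list is transcribed from the boot log quoted
above; the helper name locate is hypothetical and only for
illustration.

```python
# e820 entries as (first byte, last byte, kind), from the quoted boot log.
E820 = [
    (0x00000000, 0x0009FBFF, "usable"),
    (0x0009FC00, 0x0009FFFF, "reserved"),
    (0x000E0000, 0x000FFFFF, "reserved"),
    (0x00100000, 0x7FFEFFFF, "usable"),
    (0x7FFF0000, 0x7FFFEFFF, "ACPI data"),
    (0x7FFFF000, 0x7FFFFFFF, "ACPI NVS"),
]


def locate(start, size):
    """Return (last_byte, kind) where kind is the e820 type that fully
    contains [start, start + size - 1], or None if no entry does."""
    last = start + size - 1
    for lo, hi, kind in E820:
        if lo <= start and last <= hi:
            return last, kind
    return last, None


last, kind = locate(0x000F93D0, 0x439B)
print(hex(last), kind)
```

With the table address and size from dmidecode, the last byte comes
out at 0xfd76a and the whole table falls inside the second
"reserved" entry.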
> 
> > So the table starts at physical address 0xba135000, which is in the
> > following memory map segment:
> >
> > reserve setup_data: [mem 0x00000000b87b0000-0x00000000bb77dfff] reserved
> 
> Looks like UEFI, and well outside the 1MB range :-)
> 
> > If the whole DMI table IS located in a "reserved" memory area, it can
> > still get corrupted, but only by code which itself operates on data
> > located in a reserved memory area.
> 
> 
> > Both DMI tables are corrupted, but are they corrupted in the exact same
> > way?
> 
> At least the dumped tables are byte-for-byte identical on both OS
> flavors. And (as I tested above) byte-for-byte identical to a version
> dumped from MS-DOS.
> 
> 
> Regards,
> 
> 
> Michael

