linux-kernel - Re: [patch 5/9] x86: Cure per CPU madness on UP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <86796005-f1d9-4c8c-80d8-f1f88ca220ba@roeck-us.net>
Date: Thu, 21 Mar 2024 07:06:53 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
 Linus Torvalds <torvalds@...uxfoundation.org>,
 Uros Bizjak <ubizjak@...il.com>, linux-sparse@...r.kernel.org,
 lkp@...el.com, oe-kbuild-all@...ts.linux.dev
Subject: Re: [patch 5/9] x86: Cure per CPU madness on UP

On 3/21/24 04:14, Thomas Gleixner wrote:
> On Wed, Mar 20 2024 at 08:46, Guenter Roeck wrote:
>> On 3/20/24 01:58, Thomas Gleixner wrote:
>>> On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
>>>> I don't know the code well enough to determine what is wrong.
>>>> Please let me know what I can do to help debugging the problem.
>>>
>>> Could you provide me the config and the qemu command line?
>>>
>>
>> defconfig-CONFIG_SMP and
>>
>> qemu-system-x86_64 -kernel arch/x86/boot/bzImage -cpu Haswell \
>>        --append "console=ttyS0" -nographic -monitor none
>>
>> The cpu doesn't really matter as long as it is an Intel CPU.
>> A root file system isn't needed since the boot doesn't get that far.
> 
> Now it get's interesting because I can't reproduce it with that setup at
> all.
> 
> What's weird is that I saw it exactly once on 64-bit in a VM with a UP
> config two days ago, but when I started to add instrumentation it never
> came back even after backing the instrumentation changes out. I have
> seriously no idea what's going on there.
> 
> Is it fully reproducible on your side?
> 

Yes, always.

> If so can you please provide a full dmesg and then apply the patch below
> and provide the resulting full dmesg too?
> 

You'll find everything at http://server.roeck-us.net/qemu/x86-nosmp/

The crash is gone after applying your patch. The difference is:

+       /*
+        * If there was no APIC registered, then the map check below would
+        * fail. With no APIC this is guaranteed to be an UP system and
+        * therefore all topology levels have only one entry and their
+        * logical ID is obviously 0.
+        */
+       if (topo_info.boot_cpu_apic_id == BAD_APICID) {
+               pr_info("#### topo_info.boot_cpu_apic_id == BAD_APICID\n");
                 ^^^^ I added this
+               return 0;
+       }
+

I see the "#### topo_info.boot_cpu_apic_id == BAD_APICID" message
twice in the log. See patched.log at the page pointed to above.

Hope the helps,
Guenter