Message-ID: <657d2356-126d-452b-ba7f-5c0761f4f832@gmail.com>
Date: Thu, 4 Dec 2025 19:27:29 +0000
From: Usama Arif <usamaarif642@...il.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>,
 Andrew Morton <akpm@...ux-foundation.org>, kas@...nel.org,
 changyuanl@...gle.com, graf@...zon.com, leitao@...ian.org, thevlad@...a.com,
 pratyush@...nel.org, dave.hansen@...ux.intel.com, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH v3 2/2] mm/memblock: only mark/clear KHO scratch memory
 when needed



On 04/12/2025 17:52, Mike Rapoport wrote:
> Hi Usama,
> 
> On Thu, Dec 04, 2025 at 02:51:00PM +0000, Usama Arif wrote:
>>> On Sun, Nov 30, 2025 at 3:52 AM Mike Rapoport <rppt@...nel.org> wrote:
>>>>
>>>> On Fri, Nov 28, 2025 at 05:29:34PM +0000, Usama Arif wrote:
>>>>> The scratch memory for kexec handover is used to bootstrap the
>>>>> kexec'ed kernel. Only the 1st 1MB is used as scratch, and it's a
>>>>> hack to get around limitations with KHO. It is only needed when
>>>>> CONFIG_KEXEC_HANDOVER is enabled and only if it is a KHO boot
>>>>> (both checked by is_kho_boot). Add check to prevent marking a KHO
>>>>> scratch region unless needed.
>>>>
>>>> I'm going to rewrite the changelog and queue this for upstream:
>>>>
>>>> The scratch memory for kexec handover is used to bootstrap the kexec'ed
>>>> kernel and it is only needed when it is a KHO boot, i.e. a kexec boot with
>>>> handover data passed from the previous kernel.
>>>>
>>>> Currently x86 marks the first megabyte of memory as KHO scratch even for
>>>> non-KHO boots if CONFIG_KEXEC_HANDOVER is enabled.
>>>>
>>>> Add a check to prevent marking KHO scratch regions unless they are actually
>>>> needed.
>>>>
>>>>> Fixes: a2daf83e10378 ("x86/e820: temporarily enable KHO scratch for memory below 1M")
>>>>> Reported-by: Vlad Poenaru <thevlad@...a.com>
>>>>> Signed-off-by: Usama Arif <usamaarif642@...il.com>
>>>>> Reviewed-by: Pratyush Yadav <pratyush@...nel.org>
>>>
>>> This patch causes panic with my tests in linux-next.
>>>
>>> [    0.000000] Kernel panic - not syncing: Cannot allocate 17280 bytes
>>> for node 0 data
>>> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted
>>> 6.18.0-next-20251203 #2 PREEMPT(undef)
>>> [    0.000000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
>>> BIOS 0.1 11/11/2019
>>> [    0.000000] Call Trace:
>>> [    0.000000]  <TASK>
>>> [    0.000000]  ? dump_stack_lvl+0x4e/0x70
>>> [    0.000000]  ? vpanic+0xcf/0x2b0
>>> [    0.000000]  ? panic+0x66/0x66
>>> [    0.000000]  ? alloc_node_data+0x32/0x90
>>> [    0.000000]  ? numa_register_nodes+0x82/0x100
>>> [    0.000000]  ? numa_init+0x36/0x120
>>> [    0.000000]  ? setup_arch+0x667/0x7f0
>>> [    0.000000]  ? start_kernel+0x58/0x640
>>> [    0.000000]  ? x86_64_start_reservations+0x24/0x30
>>> [    0.000000]  ? x86_64_start_kernel+0xc5/0xd0
>>> [    0.000000]  ? common_startup_64+0x13e/0x148
>>> [    0.000000]  </TASK>
>>> [    0.000000] ---[ end Kernel panic - not syncing: Cannot allocate
>>> 17280 bytes for node 0 data ]---
>>> PANIC: early exception 0x0d IP 10:ffffffff89007a13 error 763 cr2
>>> 0xffff991090a01000
>>>
>>
>> Thanks for reporting this and sorry for the bug!
>>
>> So the patch was designed to remove the memblock_mark_kho_scratch in e820__memblock_setup if not
>> in KHO boot. But it broke memblock_mark_kho_scratch in kho_populate.
>> Moving kho_in.fdt_phys = fdt_phys to before the memblock_mark_kho_scratch
>> call should fix it. I don't have a setup where I can easily test KHO, but I
>> think the below should fix it?
> 
> This might, but this is too late for v6.19-rc1.
> For now I'm dropping this series from memblock/for-next.
> We can resume working on this after merge window closes.
>  

Yes, makes sense.

How would you like me to proceed with the fix? Should I send just the fix now,
or these 2 patches plus the fix after the merge window closes?

Thanks!


>> TBH using fdt_phys to check if the boot is KHO might be a bit hacky? Is it possible
>> to have a better check for this?
> 
> Presence of KHO FDT is a clear indication that it is a KHO boot.
> The issue is that during early boot ordering is hard and it's not always
> clear in which order features and configuration are detected and used. 
>  

ack
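
For anyone following the thread, the ordering problem can be shown with a
small userspace sketch. This is not the kernel code: kho_in_sim,
is_kho_boot_sim(), mark_kho_scratch_sim() and the two populate_*() helpers
are hypothetical stand-ins, and the sketch assumes that marking becomes a
no-op whenever the "is this a KHO boot" check (based on fdt_phys) says no.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's kho_in state (assumption). */
struct kho_in_sim {
	uint64_t fdt_phys;
};

static struct kho_in_sim kho_in;
static int scratch_regions_marked;

/* Assumed check: a non-zero FDT address means this is a KHO boot. */
static bool is_kho_boot_sim(void)
{
	return kho_in.fdt_phys != 0;
}

/* Assumed behaviour: marking is skipped unless this is a KHO boot. */
static void mark_kho_scratch_sim(void)
{
	if (!is_kho_boot_sim())
		return;
	scratch_regions_marked++;
}

/* Mark first, record the FDT afterwards: the mark is silently skipped. */
static int populate_mark_first(uint64_t fdt_phys)
{
	assert(fdt_phys != 0);
	kho_in.fdt_phys = 0;
	scratch_regions_marked = 0;

	mark_kho_scratch_sim();
	kho_in.fdt_phys = fdt_phys;

	return scratch_regions_marked;
}

/* Record the FDT first, then mark: the mark takes effect. */
static int populate_record_first(uint64_t fdt_phys)
{
	assert(fdt_phys != 0);
	kho_in.fdt_phys = 0;
	scratch_regions_marked = 0;

	kho_in.fdt_phys = fdt_phys;
	mark_kho_scratch_sim();

	return scratch_regions_marked;
}
```

With any non-zero address, populate_mark_first() reports no marked regions
while populate_record_first() reports one, which is the difference the
proposed reordering in kho_populate is meant to address.
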

>> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
>> index 9dc51fab604f1..c331749e6452e 100644
>> --- a/kernel/liveupdate/kexec_handover.c
>> +++ b/kernel/liveupdate/kexec_handover.c
>> @@ -1483,6 +1483,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>>                 goto out;
>>         }
>>  
>> +       kho_in.fdt_phys = fdt_phys;
>>         /*
>>          * We pass a safe contiguous blocks of memory to use for early boot
>>          * purporses from the previous kernel so that we can resize the
>> @@ -1513,7 +1514,6 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>>          */
>>         memblock_set_kho_scratch_only();
>>  
>> -       kho_in.fdt_phys = fdt_phys;
>>         kho_in.scratch_phys = scratch_phys;
>>         kho_scratch_cnt = scratch_cnt;
>>         pr_info("found kexec handover data.\n");
>> @@ -1524,7 +1524,10 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>>         if (scratch)
>>                 early_memunmap(scratch, scratch_len);
>>         if (err)
>> +       {
>> +               kho_in.fdt_phys = 0;
>>                 pr_warn("disabling KHO revival: %d\n", err);
>> +       }
>>  }
> 

