linux-kernel - Re: [PATCH v2] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active

Message-ID: <e179c616-f427-769f-aa5b-058c63040015@redhat.com>
Date:   Mon, 7 Oct 2019 19:53:57 +0800
From:   lijiang <lijiang@...hat.com>
To:     Dave Young <dyoung@...hat.com>
Cc:     linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, hpa@...or.com, x86@...nel.org, bhe@...hat.com,
        jgross@...e.com, dhowells@...hat.com, Thomas.Lendacky@....com,
        ebiederm@...ssion.com, vgoyal@...hat.com, kexec@...ts.infradead.org
Subject: Re: [PATCH v2] x86/kdump: Fix 'kmem -s' reported an invalid
 freepointer when SME was active

On 2019-10-07 17:33, Dave Young wrote:
> Hi Lianbo,
> On 10/07/19 at 03:08pm, Lianbo Jiang wrote:
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> The kdump kernel reuses the first 640k region for several reasons;
>> for example, the trampoline and the conventional PC system BIOS region
>> may need to allocate memory in this area. The kdump kernel will
>> therefore overwrite the first 640k region, so the kernel has to copy
>> the contents of the first 640k area to a backup area, which is done in
>> purgatory(), because the vmcore may need the old memory. When the vmcore
>> is dumped, the kdump kernel reads the old memory from the backup area
>> of the first 640k area.
>>
>> The root cause is that the kernel does not correctly handle the first
>> 640k region when SME is active, so it does not properly copy the old
>> memory to the backup area in purgatory(). As a result, the kdump kernel
>> reads incorrect contents from the backup area when dumping the vmcore.
>> The symptom is as follows:
>>
>> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
>> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
>>
>>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>>         CPUS: 128
>>         DATE: Thu Sep 19 08:31:18 2019
>>       UPTIME: 00:01:21
>> LOAD AVERAGE: 0.16, 0.07, 0.02
>>        TASKS: 1343
>>     NODENAME: amd-ethanol
>>      RELEASE: 5.3.0-rc7+
>>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>>      MACHINE: x86_64  (2195 Mhz)
>>       MEMORY: 127.9 GB
>>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>          PID: 9789
>>      COMMAND: "bash"
>>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>>          CPU: 83
>>        STATE: TASK_RUNNING (PANIC)
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
>>
>> BTW: I also tried to fix the above problem in purgatory(), but there
>> are too many restrictions in the purgatory() context; for example, I
>> can't allocate new memory to create the identity mapping page table
>> for the SME case.
>>
>> Currently, there are two places where the first 640k area is needed:
>> the first is find_trampoline_placement() and the other is
>> reserve_real_mode(), and their content doesn't matter. To avoid the
>> above error, let's reserve the remaining memory of the first 640k region
>> (except for the trampoline and real mode) so that allocated memory
>> does not fall into the first 640k area when SME is active. That way we
>> need not worry about whether the kernel can correctly copy the contents
>> of the first 640k area to a backup region in purgatory().
>>
>> Signed-off-by: Lianbo Jiang <lijiang@...hat.com>
>> ---
>> Changes since v1:
>> 1. Improve patch log
>> 2. Change the checking condition from sme_active() to sme_active()
>>    && strstr(boot_command_line, "crashkernel=")
>>
>>  arch/x86/kernel/setup.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 77ea96b794bd..bdb1a02a84fd 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -1148,6 +1148,9 @@ void __init setup_arch(char **cmdline_p)
>>  
>>  	reserve_real_mode();
>>  
>> +	if (sme_active() && strstr(boot_command_line, "crashkernel="))
>> +		memblock_reserve(0, 640*1024);
>> +
> 
> It seems you missed the comment about "unconditionally do it"; checking
> only the crashkernel param looks better.
> 
If so, it means that copying the first 640k to a backup region is no longer needed, and
I should post a patch series to remove copy_backup_region(). Any thoughts?
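
To make the two checking conditions under discussion concrete, here is a
minimal user-space sketch, not kernel code. The function names are
hypothetical, the sme_is_active flag stands in for sme_active(), and the
cmdline argument stands in for boot_command_line. The second variant is only
my reading of the "only check crashkernel param" suggestion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Condition used in v2 of the patch: reserve the first 640k only when
 * SME is active AND "crashkernel=" appears on the command line.
 * (Hypothetical helper; models the check, does not reserve anything.)
 */
bool reserve_low_640k_v2(bool sme_is_active, const char *cmdline)
{
	assert(cmdline != NULL);
	return sme_is_active && strstr(cmdline, "crashkernel=") != NULL;
}

/*
 * Assumed variant: check only the "crashkernel=" parameter and ignore
 * whether SME is active.
 */
bool reserve_low_640k_crashkernel_only(const char *cmdline)
{
	assert(cmdline != NULL);
	return strstr(cmdline, "crashkernel=") != NULL;
}
```

With the second variant the reservation would also happen on machines
without SME, which is what makes the backup copy of the first 640k look
unnecessary.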

> Also I noticed reserve_crashkernel is called after initmem_init, I'm not
> sure if memblock_reserve is good enough in early code before
> initmem_init. 
>
The first zero page and the real mode area are also reserved before initmem_init(),
and they seem to have worked well so far.
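
The effect I am relying on can be illustrated with a toy user-space model of
a reserved-region table. This is only a sketch under stated assumptions: all
toy_* names are hypothetical and are not the memblock API, and the allocator
here searches bottom-up purely to keep the example short. It shows that once
[0, 640k) is recorded as reserved, a later allocation cannot land inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE   4096ULL
#define TOY_MAX_REGIONS 8

struct toy_region {
	uint64_t base;
	uint64_t size;
};

static struct toy_region toy_reserved[TOY_MAX_REGIONS];
static int toy_nr_reserved;

/* Record [base, base + size) as reserved; false if the table is full. */
bool toy_reserve(uint64_t base, uint64_t size)
{
	if (toy_nr_reserved >= TOY_MAX_REGIONS)
		return false;
	toy_reserved[toy_nr_reserved].base = base;
	toy_reserved[toy_nr_reserved].size = size;
	toy_nr_reserved++;
	return true;
}

/* True if [base, base + size) overlaps any reserved region. */
bool toy_is_reserved(uint64_t base, uint64_t size)
{
	for (int i = 0; i < toy_nr_reserved; i++) {
		uint64_t rb = toy_reserved[i].base;
		uint64_t re = rb + toy_reserved[i].size;

		if (base < re && rb < base + size)
			return true;
	}
	return false;
}

/*
 * Find the lowest page-aligned free range of 'size' bytes that ends at or
 * below 'limit', mark it reserved and return its base.
 * Returns UINT64_MAX if nothing fits or the table is full.
 */
uint64_t toy_alloc_bottom_up(uint64_t size, uint64_t limit)
{
	for (uint64_t addr = 0; addr + size <= limit; addr += TOY_PAGE_SIZE) {
		if (toy_is_reserved(addr, size))
			continue;
		return toy_reserve(addr, size) ? addr : UINT64_MAX;
	}
	return UINT64_MAX;
}
```

In this model, calling toy_reserve(0, 640 * 1024) early is enough to keep
every later toy_alloc_bottom_up() result at or above 640k, regardless of
what else has been reserved below that boundary.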

Thanks.
Lianbo

>>  	trim_platform_memory_ranges();
>>  	trim_low_memory_range();
>>  
>> -- 
>> 2.17.1
>>
> 
> Thanks
> Dave
>