Message-ID: <2d3c7ab8-0b83-4ef5-bb89-0c7c476265b3@redhat.com>
Date: Sun, 11 Aug 2019 10:29:57 +0800
From: lijiang <lijiang@...hat.com>
To: "Lendacky, Thomas" <Thomas.Lendacky@....com>,
Dave Young <dyoung@...hat.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Dave Anderson <anderson@...hat.com>,
"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
"vgoyal@...hat.com" <vgoyal@...hat.com>,
"bhe@...hat.com" <bhe@...hat.com>,
"ebiederm@...ssion.com" <ebiederm@...ssion.com>
Subject: Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab:
ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore
On 2019-08-09 06:37, Lendacky, Thomas wrote:
> On 8/1/19 8:05 PM, Dave Young wrote:
>> Add kexec cc list.
>> On 08/01/19 at 11:02pm, lijiang wrote:
>>> Hi, Tom
>>>
>>> Recently, I ran into a problem with SME and used the crash tool to check the vmcore as follows:
>>>
>>> crash> kmem -s | grep -i invalid
>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>>
>>> The crash tool reported the above error. The likely cause is that the kernel does not
>>> correctly handle the first 640k region when SME is enabled.
>>>
>>> When SME is enabled, the kernel and initramfs images are loaded into decrypted memory, and
>>> the backup area (for the first 640k) is also mapped as decrypted, but the first 640k of data is
>>> copied to the backup area in purgatory(). Please refer to this file: arch/x86/purgatory/purgatory.c
>>> ......
>>> static int copy_backup_region(void)
>>> {
>>> 	if (purgatory_backup_dest) {
>>> 		memcpy((void *)purgatory_backup_dest,
>>> 		       (void *)purgatory_backup_src, purgatory_backup_sz);
>>> 	}
>>> 	return 0;
>>> }
>>> ......
>>>
>>> arch/x86/kernel/machine_kexec_64.c
>>> ......
>>> machine_kexec_prepare()->
>>> arch_update_purgatory()->
>>> .....
>>>
>>> Actually, the first 640k area is encrypted in the first kernel when SME is enabled. Here the kernel
>>> copies the first 640k of data to the backup area in purgatory(). Because the backup area is mapped
>>> as decrypted, with this copy the first 640k of data is expected to be decrypted (decoded) and
>>> saved to the backup area, but the kernel is probably not aware of SME in purgatory(), which causes
>>> it to read out the first 640k incorrectly.
>>>
>>> In addition, I hacked the kernel code as follows:
>>>
>>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>>> index 7bcc92add72c..a51631d36a7a 100644
>>> --- a/fs/proc/vmcore.c
>>> +++ b/fs/proc/vmcore.c
>>> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>>>  					    m->offset + m->size - *fpos,
>>>  					    buflen);
>>>  			start = m->paddr + *fpos - m->offset;
>>> +			if (m->paddr == 0x73f60000) { // the backup area's start address: 0x73f60000
>>> +				tmp = read_from_oldmem(buffer, tsz, &start,
>>> +						       userbuf, false);
>>> +			} else
>>>  			tmp = read_from_oldmem(buffer, tsz, &start,
>>>  					       userbuf, mem_encrypt_active());
>>>  			if (tmp < 0)
>>>
>>> With this change, I used the crash tool to check the vmcore and could see that the backup area
>>> is decrypted, except for dma-kmalloc-512. So I suspect that the kernel did not correctly read out
>>> the first 640k of data to the backup area. Do you happen to know how to deal with the first 640k
>>> area in purgatory() when SME is enabled? Any ideas?
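The hack above hard-codes the backup area's start address. A minimal sketch of the same decision written as a range check follows. `struct backup_area` and `read_oldmem_encrypted()` are hypothetical names, not kernel code, and the 640k size is an assumption taken from the description of the backup region. Like the hack, this is only a debugging aid for inspecting the vmcore, not a fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the backup area; not a kernel structure. */
struct backup_area {
	uint64_t start;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * Decide the value that would be passed as the "encrypted" argument of
 * read_from_oldmem(): ranges inside the backup area are read as
 * decrypted, everything else follows the SME state.
 */
static bool read_oldmem_encrypted(const struct backup_area *b,
				  uint64_t paddr, bool sme_active)
{
	bool in_backup = paddr >= b->start && paddr - b->start < b->size;

	return sme_active && !in_backup;
}
```

With start = 0x73f60000 and size = 640 * 1024, an address inside the backup area yields false, and any other address yields the SME state.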
>
> I'm not all that familiar with kexec and purgatory, etc., but I think
> that you want to set up the page table that is active when purgatory runs
> so that the src and dest both have the SME encryption mask set in their
> respective page table entries. This way, when the copy is performed,
> everything is copied correctly.
Exactly. That's just what I was thinking.
> Remember, encrypted data from one page
> cannot be directly copied as unencrypted data and decrypted properly in
> the new location (e.g. a page of zeroes encrypted at one address will not
> appear the same as a page of zeroes encrypted at a different address).
Yes, that's right. Thank you, Tom.
I'm considering how to solve it, and I guess this problem needs to be handled properly
in purgatory().
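Tom's point about address-dependent encryption can be illustrated with a toy model. This is only a sketch: the XOR "cipher", the key, and the names `toy_tweak()`, `toy_xcrypt()` and `toy_copy_survives()` are made up for illustration and are not the real SME hardware cipher or any kernel interface. It only shows that ciphertext moved unchanged to another physical address no longer decrypts to the original data, while a copy through encrypted mappings on both sides does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NWORDS 8	/* a tiny "page" of 64-bit words, enough for the demo */

/*
 * Hypothetical tweak: an invertible mix of key and physical address, so
 * distinct addresses always give distinct tweak values.
 */
static uint64_t toy_tweak(uint64_t key, uint64_t paddr)
{
	uint64_t x = key ^ paddr;

	x ^= x >> 30;
	x *= 0xBF58476D1CE4E5B9ULL;
	x ^= x >> 27;
	x *= 0x94D049BB133111EBULL;
	x ^= x >> 31;
	return x;
}

/* XOR is its own inverse, so one routine both encrypts and decrypts. */
static void toy_xcrypt(uint64_t key, uint64_t paddr, uint64_t *buf)
{
	for (size_t i = 0; i < NWORDS; i++)
		buf[i] ^= toy_tweak(key, paddr + i * sizeof(uint64_t));
}

/*
 * Copy a page of zeroes from src to dst, then read it back at dst through
 * an encrypted mapping. Returns true if the zeroes survive.
 *   enc_mappings == false: the copy moves the raw ciphertext.
 *   enc_mappings == true:  the load decrypts with src's tweak and the
 *                          store re-encrypts with dst's tweak.
 */
static bool toy_copy_survives(uint64_t key, uint64_t src, uint64_t dst,
			      bool enc_mappings)
{
	const uint64_t zeroes[NWORDS] = { 0 };
	uint64_t page[NWORDS] = { 0 };

	toy_xcrypt(key, src, page);		/* memory contents at src */
	if (enc_mappings) {
		toy_xcrypt(key, src, page);	/* load: decrypt at src */
		toy_xcrypt(key, dst, page);	/* store: encrypt at dst */
	}
	toy_xcrypt(key, dst, page);		/* later read at dst */
	return memcmp(page, zeroes, sizeof(page)) == 0;
}
```

In this model, toy_copy_survives(key, 0x1000, 0x73f60000, false) returns false and the same call with true as the last argument returns true, which matches the idea of setting the encryption mask on both src and dest in the page table used by purgatory.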
Thanks.
Lianbo
>
> Thanks,
> Tom
>
>>>
>>> BTW: I'm curious why the address of dma-kmalloc-512 always falls into the first 640k
>>> region, and I did not see the same issue on another machine.
>>>
>>> Machine:
>>> Serial Number: diesel-sys9079-0001
>>> Model:         AMD Diesel (A0C)
>>> CPU:           AMD EPYC 7601 32-Core Processor
>>>
>>>
>>> Background:
>>> On x86_64, the first 640k region is special for historical reasons. The kdump kernel will
>>> reuse the first 640k region, so the kernel backs up (copies) the first 640k region to a backup
>>> area in purgatory() so that the kdump kernel does not overwrite the old 640k contents. This
>>> ensures that kdump can read out the old memory from the vmcore.
>>>
>>>
>>> Thanks.
>>> Lianbo