linux-kernel - Re: [PATCH 1/1] device-dax: Correct pgoff align in dax_set


Message-ID: <0e574409-cf12-47fd-b107-664e7f1b9cb6@linux.alibaba.com>
Date: Sun, 29 Sep 2024 11:00:17 +0800
From: "Kun(llfl)" <llfl@...ux.alibaba.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, Dan Williams <dan.j.williams@...el.com>,
 Joao Martins <joao.m.martins@...cle.com>
Subject: Re: [PATCH 1/1] device-dax: Correct pgoff align in dax_set_mapping()

This is a subtle situation that can only be observed in
page_mapped_in_vma() after the page fault has been handled by
dev_dax_huge_fault(). Normally there is little chance of
page_mapped_in_vma() being called on a dev-dax page; it happens only
when an error is injected into the dax device to trigger an MCE and the
resulting memory-failure handling. In that case, page_mapped_in_vma() is
called to determine which task is accessing the failing address, and
that task is then killed.


We used a self-developed dax device (with 2M-aligned mappings) to
inject errors at random addresses. It turned out that an error injected
at a non-2M-aligned address caused endless MCEs until the system
panicked, because page_mapped_in_vma() kept returning the wrong address
and the task accessing the failing address was never killed properly:


[ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.049006] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.448042] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.792026] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.162502] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.461116] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.764730] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.042128] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.464293] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.818090] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3787.085297] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page: Recovered

It took us several weeks to pinpoint this problem, but we eventually
used bpftrace to trace the page fault and MCE addresses and
successfully identified the issue.
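
For readers who want to see the arithmetic, here is a minimal userspace
sketch of why rounding up breaks the reverse lookup. It is not kernel
code: struct fake_vma, align_up(), align_down(), page_index() and
index_to_address() are hypothetical stand-ins, the index formula is
modelled on linear_page_index(), and the reverse translation is only a
simplified model of the index-to-address step. It assumes 4K pages and
a 2M fault size.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12                   /* assumption: 4K base pages */
#define PMD_FAULT_SIZE  (UINT64_C(1) << 21)  /* assumption: 2M fault size */

/* Hypothetical stand-in for the two VMA fields the arithmetic needs. */
struct fake_vma {
	uint64_t vm_start;  /* first virtual address of the mapping */
	uint64_t vm_pgoff;  /* page offset of vm_start in the device */
};

/* Round x up to the next multiple of a (a must be a power of two). */
uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to the previous multiple of a (a must be a power of two). */
uint64_t align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/* Virtual address -> page index, modelled on linear_page_index(). */
uint64_t page_index(const struct fake_vma *vma, uint64_t addr)
{
	return ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* Page index -> virtual address, the inverse of page_index(). */
uint64_t index_to_address(const struct fake_vma *vma, uint64_t pgoff)
{
	return vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
}
```

Take a mapping that starts at a 2M-aligned address with vm_pgoff = 0 and
a fault at vm_start + 0x142380. The faulting page is page 322 of its 2M
block. With align_down() the block base index is 0, so that page gets
index 322 and translates back to vm_start + 0x142000. With align_up()
the base index is 512, so the same page gets index 834 and translates
back to an address exactly 2M too high, which is not where the task
actually touched the memory. The two roundings agree only when the
faulting address is already 2M aligned.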

On 9/28/24 1:46 AM, Andrew Morton wrote:
> (cc's added)
>
> On Fri, 27 Sep 2024 15:45:09 +0800 "Kun(llfl)" <llfl@...ux.alibaba.com> wrote:
>
>> pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise,
>> vmf->address not aligned to fault_size will be aligned to the next
>> alignment, that can result in memory failure getting the wrong address.
>>
>> Fixes: b9b5777f09be ("device-dax: use ALIGN() for determining pgoff")
> That's quite an old change.  Can you suggest why it took this long to
> be discovered?  What is your userspace doing to trigger this?
>
>> Signed-off-by: Kun(llfl) <llfl@...ux.alibaba.com>
>> Tested-by: JianXiong Zhao <zhaojianxiong.zjx@...baba-inc.com>
>> ---
>>   drivers/dax/device.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
>> index 9c1a729cd77e..6d74e62bbee0 100644
>> --- a/drivers/dax/device.c
>> +++ b/drivers/dax/device.c
>> @@ -86,7 +86,7 @@ static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
>>   		nr_pages = 1;
>>   
>>   	pgoff = linear_page_index(vmf->vma,
>> -			ALIGN(vmf->address, fault_size));
>> +			ALIGN_DOWN(vmf->address, fault_size));
>>   
>>   	for (i = 0; i < nr_pages; i++) {
>>   		struct page *page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
>> -- 
>> 2.43.0

-- 
Best,
KUN