linux-kernel - Re: [PATCH] mm: Recheck page table entry with page table lock held


Message-Id: <a22a21d6-c872-63e9-77ec-8071bac9bfc9@linux.ibm.com>
Date:   Thu, 20 Sep 2018 16:41:59 +0530
From:   "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
To:     "Kirill A. Shutemov" <kirill@...temov.name>
Cc:     akpm@...ux-foundation.org,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: Recheck page table entry with page table lock held

On 9/20/18 4:35 PM, Kirill A. Shutemov wrote:
> On Thu, Sep 20, 2018 at 02:54:08PM +0530, Aneesh Kumar K.V wrote:
>> We clear the pte temporarily during read/modify/write update of the pte. If we
>> take a page fault while the pte is cleared, the application can get SIGBUS. One
>> such case is with remap_pfn_range without a backing vm_ops->fault callback.
>> do_fault will return SIGBUS in that case.
> 
> It would be nice to show the path that clears pte temporarily.
> 
>> Fix this by taking page table lock and rechecking for pte_none.


We do that in ptep_modify_prot_start/ptep_modify_prot_commit, and also
in hugetlb_change_protection. The hugetlb case may not be relevant
because that cannot be backed by a vma without vma->vm_ops.

What will hit this is an mprotect of a remap_pfn_range address?
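
To make the window described above concrete, here is a minimal userspace
sketch of the clear-then-commit sequence and of the recheck under the lock.
It is only a model, not kernel code: every model_* name and the MODEL_FAULT_*
codes are hypothetical, a C11 atomic_flag spinlock stands in for the ptl, and
a value of 0 stands in for pte_none().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical result codes for this model only; not the kernel's values. */
enum model_fault { MODEL_FAULT_SIGBUS, MODEL_FAULT_NOPAGE };

/* One page table entry guarded by a spinlock standing in for the ptl. */
struct model_pte {
	atomic_flag ptl;
	_Atomic uint64_t val;	/* 0 plays the role of pte_none() */
};

static void model_lock(struct model_pte *p)
{
	while (atomic_flag_test_and_set_explicit(&p->ptl, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void model_unlock(struct model_pte *p)
{
	atomic_flag_clear_explicit(&p->ptl, memory_order_release);
}

/* Start of the R/M/W update: caller holds ptl; entry reads as none afterwards. */
static uint64_t model_modify_prot_start(struct model_pte *p)
{
	return atomic_exchange(&p->val, 0);
}

/* End of the R/M/W update: caller still holds ptl; publish the new entry. */
static void model_modify_prot_commit(struct model_pte *p, uint64_t newval)
{
	atomic_store(&p->val, newval);
}

/* Lockless peek, i.e. what a fault taken inside the window would observe. */
static int model_pte_none_unlocked(struct model_pte *p)
{
	return atomic_load(&p->val) == 0;
}

/*
 * Recheck with ptl held: the updater cannot be between start and commit
 * while we hold the lock, so a zero here is a genuinely empty entry.
 */
static enum model_fault model_fault_recheck(struct model_pte *p)
{
	enum model_fault ret;

	model_lock(p);
	ret = atomic_load(&p->val) == 0 ? MODEL_FAULT_SIGBUS : MODEL_FAULT_NOPAGE;
	model_unlock(p);
	return ret;
}
```

In this model, a lockless read between model_modify_prot_start() and
model_modify_prot_commit() sees an empty entry, while model_fault_recheck()
has to wait for the updater to drop the lock and so only reports
MODEL_FAULT_SIGBUS for an entry that really is empty.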

>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com>
>> ---
>>   mm/memory.c | 31 +++++++++++++++++++++++++++----
>>   1 file changed, 27 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index c467102a5cbc..c2f933184303 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3745,10 +3745,33 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
>>   	struct vm_area_struct *vma = vmf->vma;
>>   	vm_fault_t ret;
>>   
>> -	/* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */
>> -	if (!vma->vm_ops->fault)
>> -		ret = VM_FAULT_SIGBUS;
>> -	else if (!(vmf->flags & FAULT_FLAG_WRITE))
>> +	/*
>> +	 * The VMA was not fully populated on mmap() or missing VM_DONTEXPAND
>> +	 */
>> +	if (!vma->vm_ops->fault) {
>> +
>> +		/*
>> +		 * pmd entries won't be marked none during a R/M/W cycle.
>> +		 */
>> +		if (unlikely(pmd_none(*vmf->pmd)))
>> +			ret = VM_FAULT_SIGBUS;
>> +		else {
>> +			vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
>> +			/*
>> +			 * Make sure this is not a temporary clearing of pte
>> +			 * by holding ptl and checking again. A R/M/W update
>> +			 * of pte involves: take ptl, clearing the pte so that
>> +			 * we don't have concurrent modification by hardware
>> +			 * followed by an update.
>> +			 */
>> +			spin_lock(vmf->ptl);
>> +			if (unlikely(pte_none(*vmf->pte)))
>> +				ret = VM_FAULT_SIGBUS;
>> +			else
>> +				ret = VM_FAULT_NOPAGE;
> 
> We return 0 if we did nothing in fault path.
> 

I didn't get that. If we find the pte not none, we return so that we 
retry the access. Are you suggesting VM_FAULT_NOPAGE is not the right 
return for that?

-aneesh