Message-ID: <9209043d-3240-105b-72a3-b4cd30f1b1f1@oracle.com>
Date:   Wed, 29 Aug 2018 10:24:44 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Jerome Glisse <jglisse@...hat.com>,
        Michal Hocko <mhocko@...nel.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        stable@...r.kernel.org
Subject: Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared
 pages

On 08/27/2018 06:46 AM, Jerome Glisse wrote:
> On Mon, Aug 27, 2018 at 09:46:45AM +0200, Michal Hocko wrote:
>> On Fri 24-08-18 11:08:24, Mike Kravetz wrote:
>>> Here is an updated patch which does as you suggest above.
>> [...]
>>> @@ -1409,6 +1419,32 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>>>  		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
>>>  		address = pvmw.address;
>>>  
>>> +		if (PageHuge(page)) {
>>> +			if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
>>> +				/*
>>> +				 * huge_pmd_unshare unmapped an entire PMD
>>> +				 * page.  There is no way of knowing exactly
>>> +				 * which PMDs may be cached for this mm, so
>>> +				 * we must flush them all.  start/end were
>>> +				 * already adjusted above to cover this range.
>>> +				 */
>>> +				flush_cache_range(vma, start, end);
>>> +				flush_tlb_range(vma, start, end);
>>> +				mmu_notifier_invalidate_range(mm, start, end);
>>> +
>>> +				/*
>>> +				 * The ref count of the PMD page was dropped
>>> +				 * which is part of the way map counting
>>> +				 * is done for shared PMDs.  Return 'true'
>>> +				 * here.  When there is no other sharing,
>>> +				 * huge_pmd_unshare returns false and we will
>>> +				 * unmap the actual page and drop map count
>>> +				 * to zero.
>>> +				 */
>>> +				page_vma_mapped_walk_done(&pvmw);
>>> +				break;
>>> +			}
>>
>> This still calls into the notifier while holding the ptl lock. Either I am
>> missing something or the invalidation is broken in this loop (if not also
>> for other invalidations).
> 
> mmu_notifier_invalidate_range() can be done with the pt lock held; only the
> start and end versions need to happen outside the pt lock.
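
For anyone less familiar with the API, that rule amounts to the calling
pattern below.  This is only a sketch, not the actual try_to_unmap_one()
code; huge_pte_lock() stands in for whatever page table lock the caller
holds:

    mmu_notifier_invalidate_range_start(mm, start, end); /* no ptl held */

    ptl = huge_pte_lock(h, mm, pte);                /* take the ptl */
    /* ... clear/adjust the page table entries ... */
    mmu_notifier_invalidate_range(mm, start, end);  /* ptl held: allowed */
    spin_unlock(ptl);

    mmu_notifier_invalidate_range_end(mm, start, end);   /* no ptl held */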

Hi Jérôme (and anyone else with a good understanding of the mmu notifier API),

Michal and I have been looking at backports to stable releases.  If you look
at the v4.4 version of try_to_unmap_one(), it does not use the
mmu_notifier_invalidate_range_start/end interfaces.  Rather, it calls
mmu_notifier_invalidate_page(), passing in the address of the page it
unmapped.  This is done after releasing the ptl lock.  I'm not even sure
this works for huge pages, as it appears some THP-supporting code was added
to try_to_unmap_one() after v4.4.
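
For reference, the tail of the v4.4 try_to_unmap_one() looks roughly like
this (paraphrased, not an exact quote; the actual condition guarding the
call is more involved):

    pte_unmap_unlock(pte, ptl);           /* ptl released first */
    if (/* the mapping was actually removed */)
        /* one address only, no start/end bracketing */
        mmu_notifier_invalidate_page(mm, address);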

But we were wondering which mmu notifier interface to use in the case where
try_to_unmap_one() unmaps a shared PMD huge page, as addressed in the patch
above.  In that case, a PUD-sized area is effectively unmapped.  In the
code/patch above, the invalidate range (and start/end) is adjusted to take
the PUD-sized area into account.
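
The core of that adjustment is just widening the window to PUD_SIZE
boundaries (a simplified sketch; the real code also checks VM_MAYSHARE
and walks the range one PUD at a time):

    unsigned long a_start = start & PUD_MASK;    /* round down */
    unsigned long a_end   = a_start + PUD_SIZE;  /* round up */

    /* only widen if the vma covers the whole potentially shared area */
    if (a_start >= vma->vm_start && a_end <= vma->vm_end) {
        start = min(start, a_start);
        end = max(end, a_end);
    }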

What would be the best mmu notifier interface to use where there are no
start/end calls?
Or, is the best solution to add the start/end calls as is done in later
versions of the code?  If that is the suggestion, has there been any change
in invalidate start/end semantics that we should take into account?

-- 
Mike Kravetz
