Message-ID: <20230802152847.c3pz5o4pfsmkuv3u@techsingularity.net>
Date: Wed, 2 Aug 2023 16:28:47 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, kvm@...r.kernel.org,
linux-kselftest@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
liubo <liubo254@...wei.com>, Peter Xu <peterx@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Hugh Dickins <hughd@...gle.com>,
Jason Gunthorpe <jgg@...pe.ca>,
John Hubbard <jhubbard@...dia.com>,
Mel Gorman <mgorman@...e.de>, Shuah Khan <shuah@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH v2 4/8] mm/gup: don't implicitly set FOLL_HONOR_NUMA_FAULT

On Tue, Aug 01, 2023 at 02:48:40PM +0200, David Hildenbrand wrote:
> Commit 0b9d705297b2 ("mm: numa: Support NUMA hinting page faults from
> gup/gup_fast") from 2012 documented as the primary reason why we would want
> to handle NUMA hinting faults from GUP:
>
> KVM secondary MMU page faults will trigger the NUMA hinting page
> faults through gup_fast -> get_user_pages -> follow_page ->
> handle_mm_fault.
>
> That is still the case today, and relevant KVM code has been converted to
> manually set FOLL_HONOR_NUMA_FAULT. So let's stop setting
> FOLL_HONOR_NUMA_FAULT for all GUP users and cross our fingers that few
> other users remain that really require such handling for autonuma.
>
> Possible interaction with MMU notifiers:
>
> Assume a driver obtains a page using get_user_pages() to map it into
> a secondary MMU, and uses the MMU notifier framework to get notified on
> changes.
>
> Assume get_user_pages() succeeded on a PROT_NONE-mapped page (because
> FOLL_HONOR_NUMA_FAULT is not set) in an accessible VMA and the page is
> mapped into a secondary MMU. If user space later turns that mapping
> inaccessible using mprotect(PROT_NONE), the actual PTE in the page table
> might not change. If the MMU notifier were smart and optimized for that
> case ("why notify if the PTE didn't change"), that could be problematic.
>
> At least change_pmd_range() with MMU_NOTIFY_PROTECTION_VMA for now does an
> unconditional mmu_notifier_invalidate_range_start() ->
> mmu_notifier_invalidate_range_end() and should be fine.
>
> Note that even if a PTE in an accessible VMA is pte_protnone(), the
> underlying page might be accessed by a secondary MMU that does not set
> FOLL_HONOR_NUMA_FAULT, and test_young() MMU notifiers would return "true".
>
> Signed-off-by: David Hildenbrand <david@...hat.com>

This also seems sane, but a large part of its correctness depends on
patch 3 being correct.
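
For readers following along, the decision the changelog relies on can be
modelled roughly as below. This is only a userspace sketch, not the kernel
code: struct vma_model, can_follow_protnone() and check_model() are
hypothetical names, and the flag value is a placeholder rather than the
kernel's real bit. It assumes that a pte_protnone() PTE forces a fault only
when the caller asked for FOLL_HONOR_NUMA_FAULT and the VMA is accessible.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit for illustration; NOT the kernel's actual flag value. */
#define FOLL_HONOR_NUMA_FAULT 0x1u

/* Hypothetical, stripped-down stand-in for struct vm_area_struct. */
struct vma_model {
	bool accessible;	/* false models a VMA made PROT_NONE via mprotect() */
};

/*
 * Model: may GUP hand out a page mapped by a pte_protnone() PTE without
 * triggering a fault first?
 */
static bool can_follow_protnone(const struct vma_model *vma, unsigned int flags)
{
	/* Caller does not care about NUMA hinting faults: just follow the PTE. */
	if (!(flags & FOLL_HONOR_NUMA_FAULT))
		return true;

	/* NUMA hinting faults only make sense in accessible VMAs. */
	return !vma->accessible;
}

/* Walk through the cases discussed in the changelog. */
static void check_model(void)
{
	struct vma_model accessible = { .accessible = true };
	struct vma_model inaccessible = { .accessible = false };

	/* Generic GUP user after this patch: flag not set, no hinting fault. */
	assert(can_follow_protnone(&accessible, 0));

	/* KVM-style user that sets the flag: must take the hinting fault. */
	assert(!can_follow_protnone(&accessible, FOLL_HONOR_NUMA_FAULT));

	/* Inaccessible VMA: no hinting fault applies, with or without the flag. */
	assert(can_follow_protnone(&inaccessible, FOLL_HONOR_NUMA_FAULT));
	assert(can_follow_protnone(&inaccessible, 0));
}
```

The first case is the one behind the MMU notifier concern: a generic GUP
user can obtain a page through a PROT_NONE-mapped PTE in an accessible VMA,
so a later mprotect(PROT_NONE) may leave the PTE unchanged.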
--
Mel Gorman
SUSE Labs