Message-ID: <ae6c6566-4c9b-0547-c2e4-3df7cb2bed33@redhat.com>
Date: Fri, 17 Jun 2022 11:51:34 +0200
From: David Hildenbrand <david@...hat.com>
To: Alex Sierra <alex.sierra@....com>, jgg@...dia.com
Cc: Felix.Kuehling@....com, linux-mm@...ck.org, rcampbell@...dia.com,
linux-ext4@...r.kernel.org, linux-xfs@...r.kernel.org,
amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
hch@....de, jglisse@...hat.com, apopple@...dia.com,
willy@...radead.org, akpm@...ux-foundation.org
Subject: Re: [PATCH v5 02/13] mm: handling Non-LRU pages returned by
vm_normal_pages
On 31.05.22 22:00, Alex Sierra wrote:
> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
> device-managed anonymous pages that are not LRU pages. Although they
> behave like normal pages for purposes of mapping in CPU page, and for
> COW. They do not support LRU lists, NUMA migration or THP.
>
> We also introduced a FOLL_LRU flag that adds the same behaviour to
> follow_page and related APIs, to allow callers to specify that they
> expect to put pages on an LRU list.
>
> Signed-off-by: Alex Sierra <alex.sierra@....com>
> Acked-by: Felix Kuehling <Felix.Kuehling@....com>
> ---
> fs/proc/task_mmu.c | 2 +-
> include/linux/mm.h | 3 ++-
> mm/gup.c | 6 +++++-
> mm/huge_memory.c | 2 +-
> mm/khugepaged.c | 9 ++++++---
> mm/ksm.c | 6 +++---
> mm/madvise.c | 4 ++--
> mm/memory.c | 9 ++++++++-
> mm/mempolicy.c | 2 +-
> mm/migrate.c | 4 ++--
> mm/mlock.c | 2 +-
> mm/mprotect.c | 2 +-
> 12 files changed, 33 insertions(+), 18 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 2d04e3470d4c..2dd8c8a66924 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1792,7 +1792,7 @@ static struct page *can_gather_numa_stats(pte_t pte, struct vm_area_struct *vma,
> return NULL;
>
> page = vm_normal_page(vma, addr, pte);
> - if (!page)
> + if (!page || is_zone_device_page(page))
> return NULL;
>
> if (PageReserved(page))
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bc8f326be0ce..d3f43908ff8d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -601,7 +601,7 @@ struct vm_operations_struct {
> #endif
> /*
> * Called by vm_normal_page() for special PTEs to find the
> - * page for @addr. This is useful if the default behavior
> + * page for @addr. This is useful if the default behavior
> * (using pte_page()) would not find the correct page.
> */
> struct page *(*find_special_page)(struct vm_area_struct *vma,
> @@ -2934,6 +2934,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
> #define FOLL_NUMA 0x200 /* force NUMA hinting page fault */
> #define FOLL_MIGRATION 0x400 /* wait for page to replace migration entry */
> #define FOLL_TRIED 0x800 /* a retry, previous pass started an IO */
> +#define FOLL_LRU 0x1000 /* return only LRU (anon or page cache) */
Does that statement hold for special pages like the shared zeropage?
Also, this flag is only valid for in-kernel follow_page() but not for
the ordinary GUP interfaces. What are the semantics there? Is it fenced?
I really wonder if you should simply teach the handful of users of
follow_page() to special-case these pages as well ... sounds cleaner
to me than adding flags with unclear semantics. Alternatively,
properly document what that flag is actually doing and where it applies.
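
To illustrate the first option, here is a minimal user-space sketch of a
caller filtering device pages itself instead of passing a flag. It is a
model under stated assumptions, not kernel code: struct page is reduced
to one field, and follow_page_model() and lookup_lru_candidate() are
hypothetical names made up for this example. Only the check itself
(!page || is_zone_device_page(page)) mirrors what the patch already does
for the vm_normal_page() callers.

```c
/*
 * User-space model only. The types and helpers below are simplified
 * stand-ins for illustration, not the kernel's definitions.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page {
	bool zone_device;	/* models what is_zone_device_page() reports */
};

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

/*
 * Hypothetical stand-in for follow_page(): returns whatever page is
 * mapped at index idx, or NULL. It takes no LRU filtering flag.
 */
static struct page *follow_page_model(struct page **mapping, size_t nr,
				      size_t idx)
{
	assert(mapping != NULL);
	return idx < nr ? mapping[idx] : NULL;
}

/*
 * Hypothetical caller that intends to put the page on an LRU list.
 * It rejects device pages itself rather than relying on a flag.
 */
static struct page *lookup_lru_candidate(struct page **mapping, size_t nr,
					 size_t idx)
{
	struct page *page = follow_page_model(mapping, nr, idx);

	if (!page || is_zone_device_page(page))
		return NULL;
	return page;
}
```

With this shape the filtering is visible at each call site, so there is
no flag whose meaning has to be defined for the other GUP interfaces.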
I know, there was discussion on ... sorry for jumping in now, but this
doesn't look clean to me yet.
--
Thanks,
David / dhildenb