Message-ID: <CA+CK2bDo-9dP4JZeVscE65dhkJ9jPKk+0_6v0vQXTCM3m0J1DQ@mail.gmail.com>
Date: Tue, 19 Jan 2021 15:48:11 -0500
From: Pavel Tatashin <pasha.tatashin@...een.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...e.com>,
David Hildenbrand <david@...hat.com>,
Oscar Salvador <osalvador@...e.de>,
Dan Williams <dan.j.williams@...el.com>,
Sasha Levin <sashal@...nel.org>,
Tyler Hicks <tyhicks@...ux.microsoft.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>, mike.kravetz@...cle.com,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Matthew Wilcox <willy@...radead.org>,
David Rientjes <rientjes@...gle.com>,
John Hubbard <jhubbard@...dia.com>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
Ira Weiny <ira.weiny@...el.com>,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v5 08/14] mm/gup: do not allow zero page for pinned pages
> I was thinking about a use case where userland would pin an address
> without FOLL_WRITE, because the PTE for that address is not going to
> be writable, but some device via DMA will write to it. Now, if we got
> a zero page we have a problem... If this usecase is not valid then the
> fix for movable zero page is make the zero page always come from a
> non-movable zone so we do not need to isolate it during migration, and
> so the memory can be offlined later.
I looked into making zero_page non-movable, and I am confused here.
The huge zero page is already not movable:

    get_huge_zero_page()
        zero_page = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE, ...
The base zero page can be in a movable zone, which is a bug: if there are
references to the zero page, that page cannot be migrated, and we won't be
able to hot-remove the memory area where that page is located. On x86, the
zero page should always come from the bottom 4G of physical memory
(ZONE_DMA32). However, I see that sometimes it does not (I reproduced this
in QEMU):
QEMU instance with 16G of memory and kernelcore=5G

    Boot#1:
        zero_pfn 48a8d
        zero_pfn zone: ZONE_DMA32
    Boot#2:
        zero_pfn 20168d
        zero_pfn zone: ZONE_MOVABLE (???)
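As a quick sanity check on the two PFNs above, here is a minimal sketch that
converts a PFN to a physical address and tests it against the 4G boundary.
It assumes 4 KiB base pages (a page shift of 12, as on x86-64). The helper
and macro names are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, as on x86-64. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_4G         (UINT64_C(1) << 32)

/* Hypothetical helper: physical address of the first byte of a page frame. */
static uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
	/* Guard against losing high bits in the shift below. */
	assert(pfn <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Hypothetical helper: true if the whole page lies below the 4G boundary. */
static bool sketch_pfn_below_4g(uint64_t pfn)
{
	return sketch_pfn_to_phys(pfn) + SKETCH_PAGE_SIZE <= SKETCH_4G;
}
```

With these helpers, pfn 0x48a8d maps to physical address 0x48a8d000 (about
1.1G, below 4G), while pfn 0x20168d maps to 0x20168d000 (about 8G, above 4G),
which is consistent with the zones reported for Boot#1 and Boot#2.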
The problem is that the x86 zero page comes from the .bss segment:
https://soleen.com/source/xref/linux/arch/x86/kernel/head_64.S?r=31d85460#583
I thought .bss would always be placed within the first 4G of physical
memory. What is going on here?
Pasha