Message-ID: <f996f570-eed9-509f-553c-280a62dc6d20@redhat.com>
Date: Wed, 17 Mar 2021 15:35:41 +0100
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Anshuman Khandual <anshuman.khandual@....com>,
Vlastimil Babka <vbabka@...e.cz>,
Pavel Tatashin <pasha.tatashin@...een.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/5] mm,memory_hotplug: Allocate memmap from the added
memory range
On 17.03.21 15:08, Oscar Salvador wrote:
> On Tue, Mar 16, 2021 at 06:45:17PM +0100, David Hildenbrand wrote:
>>> I find that cross-reference to the vmemmap code a little hard to digest.
>>> I would have assumed that we don't have to care about PMDs in this
>>> code here at all; the vmemmap population code should handle that.
>>>
>>> I think I already mentioned this somewhere; it should be like this:
>>>
>>> a) vmemmap code should *never* populate more memory than requested for
>>> a single memory section when we are populating from the altmap.
>>> If that cannot be guaranteed for PMDs, then we have to fall back
>>> to populating base pages. Populating PMDs from an altmap with
>>> sizeof(struct page) == 64 is highly dangerous.
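>>>
>>> Something like this untested sketch (the surrounding context is
>>> made up here; vmemmap_populate_basepages() does exist, though):
>>>
>>> 	/*
>>> 	 * If the requested vmemmap range does not cover full PMDs,
>>> 	 * populating PMDs would consume altmap memory beyond the
>>> 	 * request - fall back to base pages instead.
>>> 	 */
>>> 	if (altmap && !IS_ALIGNED(end - start, PMD_SIZE))
>>> 		return vmemmap_populate_basepages(start, end, node, altmap);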
>
> I guess you meant sizeof(struct page) != 64
>
Yes!
> But other use cases of altmap (ZONE_DEVICE stuff) might not care whether
> they have sub-populated PMDs when populating sections from the altmap?
Just assume you have two ranges:

[ ZONE_DEVICE 0 ][ ZONE_DEVICE 1 ]
If the vmemmap of ZONE_DEVICE 1 could be taken from the altmap of
ZONE_DEVICE 0, we could be in trouble, as both parts can be
removed/repurposed independently ...
>
> The current vmemmap code populates a PMD with PMD_SIZE if it is empty, and
> with base pages if there are still holes.
>
>>> Assume we have sizeof(struct page) == 56. A 128 MiB section
>>> spans 32768 pages - we need 32768 * sizeof(struct page) bytes of
>>> space for the vmemmap.
>>> With a 64 byte struct page we *can* use exactly one PMD (2 MiB).
>>> With a 56 byte struct page we need 1.75 MiB - 448 individual
>>> (full!) 4 KiB pages for the vmemmap.
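>>>
>>> To double-check the arithmetic, a throwaway userspace snippet
>>> (assuming 4 KiB base pages and a 2 MiB PMD):
>>>
>>> #include <stdio.h>
>>>
>>> int main(void)
>>> {
>>> 	unsigned long pages = (128UL << 20) / 4096;	/* 32768 */
>>>
>>> 	/* 64 byte struct page: 2097152 bytes == 2 MiB == one PMD */
>>> 	printf("%lu\n", pages * 64);
>>> 	/* 56 byte struct page: 1835008 bytes == 448 full 4 KiB pages */
>>> 	printf("%lu (%lu pages)\n", pages * 56, (pages * 56) / 4096);
>>> 	return 0;
>>> }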
>>>
>>> IOW, we don't care how the vmemmap code will do the mapping;
>>> the vmemmap code has to get it right. IMHO, asserting it in
>>> this code is wrong.
>>>
>>>
>>> b) In this code, we should only care about what the memory
>>> onlining/offlining code can or can't do.
>>> We really only care that:
>>>
>>> 1) size == memory_block_size_bytes()
>>> 2) remaining_size != 0
>>> 3) IS_ALIGNED(remaining_size, pageblock_size);
>
> I agree with the above, but see below:
>
>>> Okay, please document the statement about single sections; that's
>>> important for understanding what's happening.
>>>
>>> My take would be:
>>>
>>> bool mhp_supports_memmap_on_memory(unsigned long size)
>>> {
>>> 	/*
>>> 	 * Note: We calculate for a single memory section. The calculation
>>> 	 * applies equally to each section of a memory block.
>>> 	 */
>>> 	unsigned long nr_vmemmap_pages = SECTION_SIZE / PAGE_SIZE;
>>> 	unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
>>> 	unsigned long remaining_size = size - vmemmap_size;
>>>
>>> 	return size == memory_block_size_bytes() &&
>>> 	       remaining_size &&
>>> 	       IS_ALIGNED(remaining_size, pageblock_size);
>>> }
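>>>
>>> The caller would then gate the altmap setup on that check - a rough
>>> sketch only (hypothetical caller, using this series'
>>> MHP_MEMMAP_ON_MEMORY idea, not necessarily the final patch):
>>>
>>> 	struct vmem_altmap mhp_altmap = {};
>>>
>>> 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
>>> 		if (!mhp_supports_memmap_on_memory(size))
>>> 			return -EINVAL;
>>> 		/* Carve the vmemmap out of the hot-added range itself. */
>>> 		mhp_altmap.base_pfn = PHYS_PFN(start);
>>> 		mhp_altmap.free = PHYS_PFN(size);
>>> 		params.altmap = &mhp_altmap;
>>> 	}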
>
> While it might be true that we need to back off from populating from the
> altmap in case PMDs are not going to be fully populated because of the size
> of the struct page (I am still not sure though; as I said above, other use
> cases might not care at all), I would go __for now__ with placing
> vmemmap_size == PMD_SIZE in the check below as well.
>
> If the check holds, we know that we fully populate PMDs when populating
> sections, so the feature can be used.
>
> Then I commit to having a look at whether we need to back off in the
> vmemmap-populating code in case altmap && !FULLY_POPULATED_PMDS.
>
> What do you think?
If we check for
IS_ALIGNED(vmemmap_size, PMD_SIZE), please add a proper TODO comment
noting that this is most probably the wrong place to take care of this.
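
Something like this (just a sketch of what I mean, inside
mhp_supports_memmap_on_memory() from above):

	/*
	 * TODO: this is most probably the wrong place to take care of
	 * PMD population details; the vmemmap code should guarantee
	 * this instead.
	 */
	if (!IS_ALIGNED(vmemmap_size, PMD_SIZE))
		return false;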
--
Thanks,
David / dhildenb