[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8201bd1d-f617-491b-a10d-1fe689e9eb9b@redhat.com>
Date: Mon, 27 May 2024 09:53:45 +0200
From: David Hildenbrand <david@...hat.com>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Oscar Salvador <osalvador@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>, Mike Rapoport <rppt@...nel.org>,
Michal Hocko <mhocko@...e.com>, Anshuman Khandual
<anshuman.khandual@....com>, Vlastimil Babka <vbabka@...e.cz>,
Pavel Tatashin <pasha.tatashin@...een.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm,memory_hotplug: Remove un-taken lock
Am 24.05.24 um 14:02 schrieb Brendan Jackman:
> On Wed, May 22, 2024 at 05:24:17PM +0200, David Hildenbrand wrote:
>> On 22.05.24 16:27, Brendan Jackman wrote:
>>> On Wed, May 22, 2024 at 04:09:41PM +0200, David Hildenbrand wrote:
>
>>> By the way, some noob questions: am I OK with my assumption that it's
>>> fine for reader code to operate on zone spans that are both stale and
>>> "from the future"? thinking abstractly I guess that seeing a stale
>>> value when racing with offline_pages is roughly the same as seeing a
>>> value "from the future" when racing with online_pages?
>>
>> Right. PFN walkers should be using pfn_to_online_page(), where races are
>> possible but barely seen in practice.
>>
>> zone handlers like mm/compaction.c can likely deal with races, although it
>> might all be cleaner (and safer?) when using start+end. I recall it also
>> recalls on pfn_to_online_page().
>>
>> Regarding page_outside_zone_boundaries(), it should be fine if we can read
>> start+end atomically, that way we would not accidentally report "page
>> outside ..." when changing the start address. I think with your current
>> patch that might happen (although likely extremely hard to trigger) when
>> growing the zone at the start, reducing zone_start_pfn.
>
> Thanks a lot, this is very helpful
>
>>> Also, is it ever possible for pages to get removed and then added back
>>> and end up in a different zone than before?
>>
>> Yes. Changing between MOVABLE and NORMAL is possible and can easily be
>> triggered by offlining+re-onlining memory blocks.
>
> So, even if we make it impossible to see a totally bogus zone span,
> you can observe a stale/futuristic span which currently contains pages
> from a different zone?
Yes. Note that zones/nodes can easily overlap, so a page being spanned by
another zones is common and supported already.
>
> That seems to imply you could look up a page page from a PFN within
> zone A's apparent span, lock zone A and assume you can safely modify
> the freelist the page is on, but actually that page is now in zone B.
That's why we obtain the zone/node always from the page itself (stored in page
flags). This data can only change when offlining+reonlining memory (and
pfn_to_online_page() would refuse to hand out the page while temporarily online).
There were discussions around using RCU to improve pfn_to_online_page() racing
with memory offlining, but the motivation to do that has been rather small: we
barely see such races in practice. Memory offlining+re-onlining simply takes too
long.
>
> So for example:
>
> 1. compact_zone() sets cc->free_pfn based on zone_end_pfn
> 2. isolate_freepages() sets isolate_start_pfn = cc->free_pfn
> 3. isolate_freepages_block() looks up a page based on that PFN
> 3. ... then takes the cc->zone lock
> 4. ... then calls __isolate_free_page which removes the page from
> whatever freelist it's on.
>
> Is anything stopping part 4 from modifying a zone that wasn't locked
> in part 3?
Likely that overlapping zones already exist and are handled accordingly.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists