lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZlCB5bFnafw_zE8Z@google.com>
Date: Fri, 24 May 2024 12:02:45 +0000
From: Brendan Jackman <jackmanb@...gle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Oscar Salvador <osalvador@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mike Rapoport <rppt@...nel.org>, Michal Hocko <mhocko@...e.com>,
	Anshuman Khandual <anshuman.khandual@....com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Pavel Tatashin <pasha.tatashin@...een.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm,memory_hotplug: Remove un-taken lock

On Wed, May 22, 2024 at 05:24:17PM +0200, David Hildenbrand wrote:
> On 22.05.24 16:27, Brendan Jackman wrote:
> > On Wed, May 22, 2024 at 04:09:41PM +0200, David Hildenbrand wrote:

> > By the way, some noob questions: am I OK with my assumption that it's
> > fine for reader code to operate on zone spans that are both stale and
> > "from the future"? thinking abstractly I guess that seeing a stale
> > value when racing with offline_pages is roughly the same as seeing a
> > value "from the future" when racing with online_pages?
> 
> Right. PFN walkers should be using pfn_to_online_page(), where races are
> possible but barely seen in practice.
> 
> zone handlers like mm/compaction.c can likely deal with races, although it
> might all be cleaner (and safer?) when using start+end. I recall it also
> recalls on pfn_to_online_page().
> 
> Regarding page_outside_zone_boundaries(), it should be fine if we can read
> start+end atomically, that way we would not accidentally report "page
> outside ..." when changing the start address. I think with your current
> patch that might happen (although likely extremely hard to trigger) when
> growing the zone at the start, reducing zone_start_pfn.

Thanks a lot, this is very helpful

> > Also, is it ever possible for pages to get removed and then added back
> > and end up in a different zone than before?
> 
> Yes. Changing between MOVABLE and NORMAL is possible and can easily be
> triggered by offlining+re-onlining memory blocks.

So, even if we make it impossible to see a totally bogus zone span,
you can observe a stale/futuristic span which currently contains pages
from a different zone?

That seems to imply you could look up a page page from a PFN within
zone A's apparent span, lock zone A and assume you can safely modify
the freelist the page is on, but actually that page is now in zone B.

So for example:

1. compact_zone() sets cc->free_pfn based on zone_end_pfn
  2. isolate_freepages() sets isolate_start_pfn = cc->free_pfn
    3. isolate_freepages_block() looks up a page based on that PFN
    3. ... then takes the cc->zone lock
    4. ... then calls __isolate_free_page which removes the page from
       whatever freelist it's on.

Is anything stopping part 4 from modifying a zone that wasn't locked
in part 3?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ