lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZEZxKJA/5aOfbZdX@dhcp22.suse.cz>
Date:   Mon, 24 Apr 2023 14:08:08 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc:     akpm@...ux-foundation.org, rppt@...nel.org, ying.huang@...el.com,
        mgorman@...hsingularity.net, vbabka@...e.cz, david@...hat.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] mm/page_alloc: add some comments to explain the
 possible hole in __pageblock_pfn_to_page()

On Mon 24-04-23 19:40:30, Baolin Wang wrote:
> 
> 
> On 4/24/2023 7:34 PM, Michal Hocko wrote:
> > On Mon 24-04-23 19:20:43, Baolin Wang wrote:
> > > 
> > > 
> > > On 4/24/2023 5:54 PM, Michal Hocko wrote:
> > > > On Sun 23-04-23 18:59:11, Baolin Wang wrote:
> > > > > Now the __pageblock_pfn_to_page() is used by set_zone_contiguous(), which
> > > > > checks whether the given zone contains holes, and uses pfn_to_online_page()
> > > > > to validate if the start pfn is online and valid, as well as using pfn_valid()
> > > > > to validate the end pfn.
> > > > > 
> > > > > However, the __pageblock_pfn_to_page() function may return non-NULL even
> > > > > if the end pfn of a pageblock is in a memory hole in some situations. For
> > > > > example, if the pageblock order is MAX_ORDER, which will fall into 2
> > > > > sub-sections, and the end pfn of the pageblock may be hole even though
> > > > > the start pfn is online and valid.
> > > > > 
> > > > > This did not break anything until now, but the zone continuous is fragile
> > > > > in this possible scenario. So as previous discussion[1], it is better to
> > > > > add some comments to explain this possible issue in case there are some
> > > > > future pfn walkers that rely on this.
> > > > > 
> > > > > [1] https://lore.kernel.org/all/87r0sdsmr6.fsf@yhuang6-desk2.ccr.corp.intel.com/
> > > > 
> > > > Do I remember correctly you've had a specific configuration that would
> > > > trigger this case?
> > > 
> > > Yes, I provided an example in previous thread [2] so show the
> > > __pageblock_pfn_to_page() is fragile in some cases.
> > > 
> > > [2] https://lore.kernel.org/all/52dfdd2e-9c99-eac4-233e-59919a24323e@linux.alibaba.com/
> > 
> > Please make it a part of the changelog.
> 
> Sure.
> 
> > > > 
> > > > > Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
> > > > > ---
> > > > > Changes from v1:
> > > > >    - Update the comments per Ying and Mike, thanks.
> > > > > ---
> > > > >    mm/page_alloc.c | 7 +++++++
> > > > >    1 file changed, 7 insertions(+)
> > > > > 
> > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > > index 6457b64fe562..9756d66f471c 100644
> > > > > --- a/mm/page_alloc.c
> > > > > +++ b/mm/page_alloc.c
> > > > > @@ -1502,6 +1502,13 @@ void __free_pages_core(struct page *page, unsigned int order)
> > > > >     * interleaving within a single pageblock. It is therefore sufficient to check
> > > > >     * the first and last page of a pageblock and avoid checking each individual
> > > > >     * page in a pageblock.
> > > > > + *
> > > > > + * Note: the function may return non-NULL even if the end pfn of a pageblock
> > > > > + * is in a memory hole in some situations. For example, if the pageblock
> > > > > + * order is MAX_ORDER, which will fall into 2 sub-sections, and the end pfn
> > > > > + * of the pageblock may be hole even though the start pfn is online and valid.
> > > > > + * This did not break anything until now, but be careful about this possible
> > > > > + * issue when checking whether all pfns of a pageblock are valid.
> > > > 
> > > > It is not really clear what you should be doing (other than to be
> > > > careful which is not helpful much TBH) when you encounter this
> > > > situation. If the reality changes and this would break in the future
> > > > what would breakage look like? What should be done about that?
> > > 
> > > That depends on what the future pfn walkers do, which may access some hole
> > > memory with zero-init page frame. For example, if checking the
> > > __PageMovable() for a zero-init page frame, that will crash the system. But
> > > I can not list all the possible cases.
> > > 
> > > So how about below words?
> > > 
> > >   * Note: the function may return non-NULL even if the end pfn of a pageblock
> > >   * is in a memory hole in some situations. For example, if the pageblock
> > >   * order is MAX_ORDER, which will fall into 2 sub-sections, and the end pfn
> > >   * of the pageblock may be hole even though the start pfn is online and
> > > valid.
> > >   * This did not break anything until now, but be careful about this possible
> > >   * issue when checking whether all pfns of a pageblock are valid, that may
> > >   * lead to accessing empty page frame, and the worst case can crash the
> > > system.
> > >   * So you should use pfn_to_onlie_page() instead of pfn_valid() to valid the
> > >   * pfns in a pageblock if such case happens.
> > 
> > Does that mean that struct page is not initialized and PagePoisoned will
> > trigger or it is just zero-prefilled?
> 
> In the example I provided[2], these page frames of the hole memory are
> zero-prefilled.

OK, so make _that_ explicit in the comment. Essentially you want to say
that there are cases where we have zero-initialized struct pages for
memory holes. In general no pfn walker should touch a physical memory
range for pfn where the struct page doesn't contain any metadata it
recognizes. Zero fill struct pages do not contain any distinguishable
state so that makes it less of a problem.

All that being said I would reformulate the comment as follows:

	* Note: the function may return non-NULL struct page even for a
	* page block which contains a memory hole (i.e. there is no
	* physical memory for a subset of the pfn range). This should be
	* safe most of the time because struct pages are still zero
	* pre-filled and pfn walkers shouldn't touch any physical memory
	* range for which they do not recognize any specific metadata in
	* struct pages.


> [2] https://lore.kernel.org/all/52dfdd2e-9c99-eac4-233e-59919a24323e@linux.alibaba.com/

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ