Message-ID: <YCuDUG89KwQNbsjA@dhcp22.suse.cz>
Date:   Tue, 16 Feb 2021 09:33:20 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Mike Rapoport <rppt@...nel.org>
Cc:     Mel Gorman <mgorman@...e.de>, David Hildenbrand <david@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Baoquan He <bhe@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Łukasz Majczak <lma@...ihalf.com>,
        Mike Rapoport <rppt@...ux.ibm.com>, Qian Cai <cai@....pw>,
        "Sarvela, Tomi P" <tomi.p.sarvela@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, stable@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH v5 1/1] mm: refactor initialization of struct page for
 holes in memory layout

On Mon 15-02-21 23:24:40, Mike Rapoport wrote:
> On Mon, Feb 15, 2021 at 10:00:31AM +0100, Michal Hocko wrote:
> > On Sun 14-02-21 20:00:16, Mike Rapoport wrote:
> > > On Fri, Feb 12, 2021 at 02:18:20PM +0100, Michal Hocko wrote:
> > 
> > > We can correctly set the zone links for the reserved pages for holes in the
> > > middle of a zone based on the architecture constraints, and then only the
> > > holes at the beginning/end of memory will not be spanned by any
> > > node/zone, which in practice does not seem to be a problem as the VM_BUG_ON
> > > in set_pfnblock_flags_mask() never triggered on pfn 0.
> > 
> > I really fail to see what you mean by correct zone/node for a memory
> > range which is not associated with any real node.
> 
> We know architectural zone constraints, so we can always have a 1:1
> match from pfn to zone. Node indeed will be a guess.

That is true only for some zones. Also we do require those to be correct
when the memory is managed by the page allocator. I believe we can live
with incorrect zones when they are in holes.

> > > > I am sorry, I haven't followed previous discussions. Has the removal of
> > > > the VM_BUG_ON been considered as an immediate workaround?
> > > 
> > > It was never discussed, but I'm not sure it's a good idea.
> > > 
> > > Judging by the commit message that introduced the VM_BUG_ON (commit
> > > 86051ca5eaf5 ("mm: fix usemap initialization")) there was yet another
> > > inconsistency in the memory map that required special care.
> > 
> > Can we actually explore that path before adding yet more
> > complexity and potentially a very involved fix for a subtle problem?
> 
> This patch was intended as a fix for inconsistency of the memory map that
> is the root cause for triggering this VM_BUG_ON and other corner case
> problems. 
> 
> The previous version [1] is less involved as it does not extend node/zone
> spans.

I do understand that. And I am not objecting to the patch. I have to
confess I haven't digested it yet. Any changes to early memory
initialization have turned out to be subtle, and corner cases only pop up
later. This is almost impossible to review just by reading the code.
That's why I am asking whether we want to address the specific VM_BUG_ON
first with something much less tricky and actually reviewable. And
that's why I am asking whether dropping the bug_on itself is safe to do
and use as a hot fix which should be easier to backport.

Long term, I definitely support any change which will lead to a
fully initialized state. Whatever that means. One option would be to
simply never allow partial page blocks or even memory sections. This
would waste some memory, but from what I have seen so far this would be
quite a small amount on very rare setups. So it might turn out to be a much
easier and more maintainable way forward.
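
For illustration only, the "no partial page blocks" option could be
sketched roughly as below. This is a userspace sketch, not kernel code
and not part of any posted patch: PAGEBLOCK_NR_PAGES, struct pfn_range
and trim_to_pageblocks() are hypothetical names, and 512 pages per
pageblock is an assumed value. It only shows the arithmetic of trimming a
memory range to whole pageblocks and counting the pages given up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration: 512 pages per pageblock. */
#define PAGEBLOCK_NR_PAGES 512ULL

/* Hypothetical description of a usable memory range in page frame numbers. */
struct pfn_range {
	uint64_t start;	/* first pfn, inclusive */
	uint64_t end;	/* one past the last pfn, exclusive */
};

/* Round pfn up to the next multiple of align (align must be a power of two). */
uint64_t round_up_pfn(uint64_t pfn, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (pfn + align - 1) & ~(align - 1);
}

/* Round pfn down to a multiple of align (align must be a power of two). */
uint64_t round_down_pfn(uint64_t pfn, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return pfn & ~(align - 1);
}

/*
 * Shrink the range so that it covers only whole pageblocks.
 * Returns the number of pages dropped from the original range.
 * If no whole pageblock fits, the range becomes empty.
 */
uint64_t trim_to_pageblocks(struct pfn_range *r)
{
	uint64_t orig = r->end - r->start;
	uint64_t start = round_up_pfn(r->start, PAGEBLOCK_NR_PAGES);
	uint64_t end = round_down_pfn(r->end, PAGEBLOCK_NR_PAGES);

	if (start >= end) {
		/* Nothing pageblock-aligned survives: give up the whole range. */
		r->start = start;
		r->end = start;
		return orig;
	}

	r->start = start;
	r->end = end;
	return orig - (end - start);
}
```

With these assumptions, a range of pfns [1, 1636) would be trimmed to
[512, 1536), so the cost is bounded by less than one pageblock at each
end of a range.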

> [1] https://lore.kernel.org/lkml/20210130221035.4169-3-rppt@kernel.org
> -- 
> Sincerely yours,
> Mike.

-- 
Michal Hocko
SUSE Labs
