linux-kernel - Re: [PATCH 5/7] mm: move zone iterator outside of deferred_init

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200501024539.tnjuybydwe3r4u2x@ca-dmjordan1.us.oracle.com>
Date:   Thu, 30 Apr 2020 22:45:39 -0400
From:   Daniel Jordan <daniel.m.jordan@...cle.com>
To:     Alexander Duyck <alexander.h.duyck@...ux.intel.com>
Cc:     Daniel Jordan <daniel.m.jordan@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        David Hildenbrand <david@...hat.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Jonathan Corbet <corbet@....net>,
        Josh Triplett <josh@...htriplett.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        Michal Hocko <mhocko@...nel.org>, Pavel Machek <pavel@....cz>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Shile Zhang <shile.zhang@...ux.alibaba.com>,
        Tejun Heo <tj@...nel.org>, Zi Yan <ziy@...dia.com>,
        linux-crypto@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/7] mm: move zone iterator outside of
 deferred_init_maxorder()

Hi Alex,

On Thu, Apr 30, 2020 at 02:43:28PM -0700, Alexander Duyck wrote:
> On 4/30/2020 1:11 PM, Daniel Jordan wrote:
> > padata will soon divide up pfn ranges between threads when parallelizing
> > deferred init, and deferred_init_maxorder() complicates that by using an
> > opaque index in addition to start and end pfns.  Move the index outside
> > the function to make splitting the job easier, and simplify the code
> > while at it.
> > 
> > deferred_init_maxorder() now always iterates within a single pfn range
> > instead of potentially multiple ranges, and advances start_pfn to the
> > end of that range instead of the max-order block so partial pfn ranges
> > in the block aren't skipped in a later iteration.  The section alignment
> > check in deferred_grow_zone() is removed as well since this alignment is
> > no longer guaranteed.  It's not clear what value the alignment provided
> > originally.
> > 
> > Signed-off-by: Daniel Jordan <daniel.m.jordan@...cle.com>
> 
> So part of the reason for splitting it up along section aligned boundaries
> was because we already had an existing functionality in deferred_grow_zone
> that was going in and pulling out a section aligned chunk and processing it
> to prepare enough memory for other threads to keep running. I suspect that
> the section alignment was done because normally I believe that is also the
> alignment for memory onlining.

I think Pavel added that functionality, maybe he could confirm.

My impression was that the reason deferred_grow_zone aligned the requested
order up to a section was to make enough memory available to avoid being called
on every allocation.

> With this already breaking things up over multiple threads how does this
> work with deferred_grow_zone? Which thread is it trying to allocate from if
> it needs to allocate some memory for itself?

I may not be following your question, but deferred_grow_zone doesn't allocate
memory during the multithreading in deferred_init_memmap because the latter
sets first_deferred_pfn so that deferred_grow_zone bails early.

> Also what is to prevent a worker from stop deferred_grow_zone from bailing
> out in the middle of a max order page block if there is a hole in the middle
> of the block?

deferred_grow_zone remains singlethreaded.  It could stop in the middle of a
max order block, but it can't run concurrently with deferred_init_memmap, as
per above, so if deferred_init_memmap were to init 'n free the remaining part
of the block, the previous portion would have already been initialized.