lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4gp18-CRADqrqAbR0SnjKBoPaTyL_oaEyyNPJOeLybayg@mail.gmail.com>
Date:   Tue, 16 Jul 2019 19:28:54 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Michal Hocko <mhocko@...e.com>, Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mm,memory_hotplug: Fix shrink_{zone,node}_span

On Mon, Jul 15, 2019 at 2:24 PM Oscar Salvador <osalvador@...e.de> wrote:
>
> On Mon, 2019-07-15 at 21:41 +0530, Aneesh Kumar K.V wrote:
> > Oscar Salvador <osalvador@...e.de> writes:
> >
> > > Since [1], shrink_{zone,node}_span work on PAGES_PER_SUBSECTION
> > > granularity.
> > > The problem is that deactivation of the section occurs later on in
> > > sparse_remove_section, so pfn_valid()->pfn_section_valid() will
> > > always return
> > > true before we deactivate the {sub}section.
> >
> > Can you explain this more? The patch doesn't update section_mem_map
> > update sequence. So what changed? What is the problem in finding
> > pfn_valid() return true there?
>
> I realized that the changelog was quite modest, so a better explanation
>  will follow.
>
> Let us analize what shrink_{zone,node}_span does.
> We have to remember that shrink_zone_span gets called every time a
> section is to be removed.
>
> There can be three possibilites:
>
> 1) section to be removed is the first one of the zone
> 2) section to be removed is the last one of the zone
> 3) section to be removed falls in the middle
>
> For 1) and 2) cases, we will try to find the next section from
> bottom/top, and in the third case we will check whether the section
> contains only holes.
>
> Now, let us take the example where a ZONE contains only 1 section, and
> we remove it.
> The last loop of shrink_zone_span, will check for {start_pfn,end_pfn]
> PAGES_PER_SECTION block the following:
>
> - section is valid
> - pfn relates to the current zone/nid
> - section is not the section to be removed
>
> Since we only got 1 section here, the check "start_pfn == pfn" will make us to continue the loop and then we are done.
>
> Now, what happens after the patch?
>
> We increment pfn on subsection basis, since "start_pfn == pfn", we jump
> to the next sub-section (pfn+512), and call pfn_valid()-
> >pfn_section_valid().
> Since section has not been yet deactivded, pfn_section_valid() will
> return true, and we will repeat this until the end of the loop.
>
> What should happen instead is:
>
> - we deactivate the {sub}-section before calling
> shirnk_{zone,node}_span
> - calls to pfn_valid() will now return false for the sections that have
> been deactivated, and so we will get the pfn from the next activaded
> sub-section, or nothing if the section is empty (section do not contain
> active sub-sections).
>
> The example relates to the last loop in shrink_zone_span, but the same
> applies to find_{smalles,biggest}_section.
>
> Please, note that we could probably do some hack like replacing:
>
> start_pfn == pfn
>
> with
>
> pfn < end_pfn
>
> But the way to fix this is to 1) deactivate {sub}-section and 2) let
> shrink_{node,zone}_span find the next active {sub-section}.
>
> I hope this makes it more clear.

This makes it more clear that the problem is with the "start_pfn ==
pfn" check relative to subsections, but it does not clarify why it
needs to clear pfn_valid() before calling shrink_zone_span().
Sections were not invalidated prior to shrink_zone_span() in the
pre-subsection implementation and it seems all we need is to keep the
same semantic. I.e. skip the range that is currently being removed:

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 37d49579ac15..b69832db442b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -422,8 +422,8 @@ static void shrink_zone_span(struct zone *zone,
unsigned long start_pfn,
                if (page_zone(pfn_to_page(pfn)) != zone)
                        continue;

-                /* If the section is current section, it continues the loop */
-               if (start_pfn == pfn)
+                /* If the sub-section is current span being removed, skip */
+               if (pfn >= start_pfn && pfn < end_pfn)
                        continue;

                /* If we find valid section, we have nothing to do */


I otherwise don't follow why we would need to deactivate prior to
__remove_zone().

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ