lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 20 Dec 2018 17:50:36 +0800
From:   Pingfan Liu <kernelfans@...il.com>
To:     linux-mm@...ck.org
Cc:     Pingfan Liu <kernelfans@...il.com>, linuxppc-dev@...ts.ozlabs.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        David Rientjes <rientjes@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>
Subject: [PATCHv2 0/3] mm: bugfix for NULL reference in mm on all archs

This bug is original reported at https://lore.kernel.org/patchwork/patch/1020838/
In a short word, this bug should affect all archs, where a machine with a
numa-node having no memory, if nr_cpus prevents the instance of nodeA, and the
device on nodeA tries to allocate memory with device->numa_node info.
And node_zonelist(preferred_nid, gfp_mask) will panic due to uninstanced nodeA.

And there are two alternative methods to fix it.
-1st. Fix it in mm system
-2nd. Fix it in all archs independently, by online all possible nodes.

Originaly, I tries to fix it by the 1st method, while Michal suggests the 2nd one.
This series [1-2/3] tries to resolve some defect in v1, pointed out by Michal.
For discussion purpose, I send [3/3] in this thread, which tries to show e.g of
the 2nd method on powerpc platform.
For x86, I still help Michal to verify his patch on my test machine, please see:
https://lore.kernel.org/patchwork/comment/1208479/
https://lore.kernel.org/patchwork/comment/1210452/

It has already cost a little long time to find a solution, cc x86 and ppc mailing list
and hope their maintainers to give some suggestion to speed up the final solution.

Pingfan Liu (3):
  mm/numa: change the topo of build_zonelist_xx()
  mm/numa: build zonelist when alloc for device on offline node
  powerpc/numa: make all possible node be instanced against NULL
    reference in node_zonelist()

 arch/powerpc/mm/numa.c | 13 ++++++--
 include/linux/gfp.h    | 10 +++++-
 mm/page_alloc.c        | 85 ++++++++++++++++++++++++++++++++++++--------------
 3 files changed, 81 insertions(+), 27 deletions(-)

Cc: linuxppc-dev@...ts.ozlabs.org
Cc: x86@...nel.org
Cc: linux-kernel@...r.kernel.org
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Mike Rapoport <rppt@...ux.vnet.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: David Rientjes <rientjes@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc: Paul Mackerras <paulus@...ba.org>
Cc: Michael Ellerman <mpe@...erman.id.au>
-- 
2.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ