Date:   Wed, 8 Dec 2021 09:30:34 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Alexey Makhalov <amakhalov@...are.com>
Cc:     David Hildenbrand <david@...hat.com>,
        Dennis Zhou <dennis@...nel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Oscar Salvador <osalvador@...e.de>, Tejun Heo <tj@...nel.org>,
        Christoph Lameter <cl@...ux.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v3] mm: fix panic in __alloc_pages

On Wed 08-12-21 08:19:16, Alexey Makhalov wrote:
> Hi Michal,
> 
> > On Dec 8, 2021, at 12:04 AM, Michal Hocko <mhocko@...e.com> wrote:
> > 
> > On Tue 07-12-21 17:17:27, Alexey Makhalov wrote:
> >> 
> >> 
> >>> On Dec 7, 2021, at 9:13 AM, David Hildenbrand <david@...hat.com> wrote:
> >>> 
> >>> On 07.12.21 18:02, Alexey Makhalov wrote:
> >>>> 
> >>>> 
> >>>>> On Dec 7, 2021, at 8:36 AM, Michal Hocko <mhocko@...e.com> wrote:
> >>>>> 
> >>>>> On Tue 07-12-21 17:27:29, Michal Hocko wrote:
> >>>>> [...]
> >>>>>> So your proposal is to drop set_node_online from the patch and add it as
> >>>>>> a separate one which handles
> >>>>>> 	- the sysfs part (i.e. do not register a node which doesn't span a
> >>>>>> 	  physical address space)
> >>>>>> 	- the hotplug side (drop the pgdat allocation, register the node
> >>>>>> 	  lazily when its first memblock is registered)
> >>>>> 
> >>>>> In other words, the first stage would be something like this:
> >>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>> index c5952749ad40..f9024ba09c53 100644
> >>>>> --- a/mm/page_alloc.c
> >>>>> +++ b/mm/page_alloc.c
> >>>>> @@ -6382,7 +6382,11 @@ static void __build_all_zonelists(void *data)
> >>>>> 	if (self && !node_online(self->node_id)) {
> >>>>> 		build_zonelists(self);
> >>>>> 	} else {
> >>>>> -		for_each_online_node(nid) {
> >>>>> +		/*
> >>>>> +		 * All possible nodes have their pgdat preallocated
> >>>>> +		 * in free_area_init()
> >>>>> +		 */
> >>>>> +		for_each_node(nid) {
> >>>>> 			pg_data_t *pgdat = NODE_DATA(nid);
> >>>>> 
> >>>>> 			build_zonelists(pgdat);
> >>>> 
> >>>> Will it blow up memory usage for the nodes which might never be onlined?
> >>>> I prefer the idea of init on demand.
> >>>> 
> >>>> There is an existing problem even now.
> >>>> In my experiments, I observed a _huge_ increase in memory consumption when increasing
> >>>> the number of possible NUMA nodes. I'm going to report it in a separate mail thread.
> >>> 
> >>> I already raised that PPC might be problematic in that regard. Which
> >>> architecture / setup do you have in mind that can have a lot of possible
> >>> nodes?
> >>> 
> >> It is an x86_64 VMware VM, not a regular one, but specially configured (1 vCPU per node,
> >> with hot-plug support and 128 possible nodes).
> > 
> > This is slightly tangential, but could you elaborate more on this setup and the
> > reasoning behind it? I was already curious when you mentioned it previously. Why
> > would you want to have so many nodes, with a 1:1 mapping to CPUs? What is the
> > resulting NUMA topology?
> 
> This setup with 128 nodes was used purely for development purposes. That is when the issue
> with hot-adding NUMA nodes was found.

OK, I see.

> The original issue presents itself even with a feasible number of nodes.

Yes, the issue is currently independent of the number of offline nodes.
The number of nodes is only interesting for the amount of memory wasted
if we are to allocate a pgdat for each possible node.
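
For a rough sense of scale, here is a back-of-the-envelope sketch (userspace
only, not from the kernel tree; PGDAT_SIZE is an assumed placeholder, since
the real sizeof(pg_data_t) depends on MAX_NR_ZONES, NR_CPUS and other config
options):

#include <stdio.h>

/*
 * Rough estimate of the memory tied up if every possible node gets a
 * preallocated pgdat.  PGDAT_SIZE is assumed, not measured.
 */
#define PGDAT_SIZE	(128UL * 1024)	/* assumed ~128 KiB per pg_data_t */

int main(void)
{
	unsigned long possible_nodes = 128;	/* the setup mentioned above */
	unsigned long total = possible_nodes * PGDAT_SIZE;

	printf("~%lu MiB for %lu possible nodes\n",
	       total >> 20, possible_nodes);
	return 0;
}

The exact waste would have to be measured on a real config; the point of the
sketch is only that it scales linearly with the number of possible nodes.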

-- 
Michal Hocko
SUSE Labs
