[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56FA7DC8.4000902@suse.cz>
Date: Tue, 29 Mar 2016 15:06:16 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Andrew Morton <akpm@...ux-foundation.org>,
Xishi Qiu <qiuxishi@...wei.com>
Cc: Joonsoo Kim <js1304@...il.com>,
David Rientjes <rientjes@...gle.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Laura Abbott <lauraa@...eaurora.org>, zhuhui@...omi.com,
wangxq10@....edu.cn, Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: fix invalid node in alloc_migrate_target()
On 03/25/2016 08:22 PM, Andrew Morton wrote:
> On Fri, 25 Mar 2016 14:56:04 +0800 Xishi Qiu <qiuxishi@...wei.com> wrote:
>
>> It is incorrect to use next_node to find a target node, it will
>> return MAX_NUMNODES or invalid node. This will lead to crash in
>> buddy system allocation.
>>
>> ...
>>
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -289,11 +289,11 @@ struct page *alloc_migrate_target(struct page *page, unsigned long private,
>> * now as a simple work-around, we use the next node for destination.
>> */
>> if (PageHuge(page)) {
>> - nodemask_t src = nodemask_of_node(page_to_nid(page));
>> - nodemask_t dst;
>> - nodes_complement(dst, src);
>> + int node = next_online_node(page_to_nid(page));
>> + if (node == MAX_NUMNODES)
>> + node = first_online_node;
>> return alloc_huge_page_node(page_hstate(compound_head(page)),
>> - next_node(page_to_nid(page), dst));
>> + node);
>> }
>>
>> if (PageHighMem(page))
>
> Indeed. Can you tell us more about this circumstances under which the
> kernel will crash? I need to decide which kernel version(s) need the
> patch, but the changelog doesn't contain the info needed to make this
> decision (it should).
>
>
>
> next_node() isn't a very useful interface, really. Just about every
> caller does this:
>
>
> node = next_node(node, XXX);
> if (node == MAX_NUMNODES)
> node = first_node(XXX);
>
> so how about we write a function which does that, and stop open-coding
> the same thing everywhere?
Good idea.
> And I think your fix could then use such a function:
>
> int node = that_new_function(page_to_nid(page), node_online_map);
>
>
>
> Also, mm/mempolicy.c:offset_il_node() worries me:
>
> do {
> nid = next_node(nid, pol->v.nodes);
> c++;
> } while (c <= target);
>
> Can't `nid' hit MAX_NUMNODES?
AFAICS it can. interleave_nid() uses this and the nid is then used e.g.
in node_zonelist() where it's used for NODE_DATA(nid). That's quite
scary. It also predates git. Why don't we see crashes or KASAN finding this?
>
> And can someone please explain mem_cgroup_select_victim_node() to me?
> How can we hit the "node = numa_node_id()" path? Only if
> memcg->scan_nodes is empty? is that even valid? The comment seems to
> have not much to do with the code?
I understand the comment that it's valid to be empty and the comment
lists reasons why that can happen (with somewhat broken language). Note
that I didn't verify these reasons:
- we call this when hitting memcg limit, not when adding pages to LRU,
as adding to LRU means it would contain the given LRU's node
- adding to unevictable LRU means it's not added to scan_nodes (probably
because scanning unevictable lru would be useless)
- for other reasons (which?) it might have pages not on LRU and it's so
small there are no other pages that would be on LRU
> mpol_rebind_nodemask() is similar.
>
>
>
> Something like this?
>
>
> From: Andrew Morton <akpm@...ux-foundation.org>
> Subject: include/linux/nodemask.h: create next_node_in() helper
>
> Lots of code does
>
> node = next_node(node, XXX);
> if (node == MAX_NUMNODES)
> node = first_node(XXX);
>
> so create next_node_in() to do this and use it in various places.
>
> Cc: Xishi Qiu <qiuxishi@...wei.com>
> Cc: Vlastimil Babka <vbabka@...e.cz>
Acked-by: Vlastimil Babka <vbabka@...e.cz>
Patch doesn't address offset_il_node() which is good, because if it's
indeed buggy, it's serious and needs a non-cleanup patch.
Powered by blists - more mailing lists