linux-kernel - Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.


Message-ID: <52E72083.4090703@cn.fujitsu.com>
Date:	Tue, 28 Jan 2014 11:14:11 +0800
From:	Tang Chen <tangchen@...fujitsu.com>
To:	Dave Jones <davej@...hat.com>,
	David Rientjes <rientjes@...gle.com>, tglx@...utronix.de,
	mingo@...hat.com, hpa@...or.com, akpm@...ux-foundation.org,
	zhangyanfei@...fujitsu.com, guz.fnst@...fujitsu.com,
	x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting
 kernel nodes to unhotpluggable.

On 01/28/2014 10:55 AM, Dave Jones wrote:
> On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
>   >  On 01/28/2014 08:32 AM, David Rientjes wrote:
>   >  >  On Wed, 22 Jan 2014, David Rientjes wrote:
>   >  >
>   >  >>>    arch/x86/mm/numa.c | 2 +-
>   >  >>>    1 file changed, 1 insertion(+), 1 deletion(-)
>   >  >>>
>   >  >>>  diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
>   >  >>>  index 81b2750..ebefeb7 100644
>   >  >>>  --- a/arch/x86/mm/numa.c
>   >  >>>  +++ b/arch/x86/mm/numa.c
>   >  >>>  @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
>   >  >>>    	}
>   >  >>>    }
>   >  >>>
>   >  >>>  +static nodemask_t numa_kernel_nodes __initdata;
>   >  >>>    static void __init numa_clear_kernel_node_hotplug(void)
>   >  >>>    {
>   >  >>>    	int i, nid;
>   >  >>>  -	nodemask_t numa_kernel_nodes;
>   >  >>>    	unsigned long start, end;
>   >  >>>    	struct memblock_type *type = &memblock.reserved;
>   >  >>>
>   >  >>
>   >  >>  Isn't this also a bugfix since you never initialize numa_kernel_nodes when
>   >  >>  it's allocated on the stack with NODE_MASK_NONE?
>   >  >>
>   >  >
>   >  >  This hasn't been answered and the patch still isn't in linux-kernel yet
>   >  >  Dave tested it as good.  I'm suspicious of the changelog that indicates
>   >  >  this nodemask is the result of a stack overflow itself which only manages
>   >  >  to reproduce itself in the init patch slightly more than 50% of the time.
>   >  >  How is that possible?
>   >  >
>   >  >  I think the changelog should indicate this also fixes an uninitialized
>   >  >  nodemask issue.
>   >
>   >  Hi David,
>   >
>   >  I'm still working on this problem, but unfortunately nothing new for now.
>   >  And the test till now shows no more problem here.
>   >
>   >  I'm digging into it, but need more time.
>   >
>   >  I'll resend a new patch and modify the changelog soon. Before we find the
>   >  root cause, I think we can use this patch as a temporary solution.
>
> Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)
>
> I did a bisect with the patch above applied each step of the way.
> This time I got a plausible looking result....
>
>
> a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
> commit a0acda917284183f9b71e2d08b0aa0aea722b321
> Author: Tang Chen <tangchen@...fujitsu.com>
> Date:   Tue Jan 21 15:49:32 2014 -0800
>
>      acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable
>
>
> Reverting this commit of course removes the whole function from above,
> so we haven't really learned anything new, other than that commit is broken,
> even after the above fix-up.

If we revert this commit, memory hot-remove will not work.
Let's try to fix it before the merge window closes.
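
For reference, David's point about the uninitialized nodemask can be
sketched in user space. This is only an illustration under stated
assumptions, not the kernel code: struct node_mask, MAX_NODES and the
mask_*() helpers are hypothetical stand-ins for nodemask_t, MAX_NUMNODES
and the nodemask helpers. It shows why a mask with static storage starts
out empty, while one on the stack must be cleared explicitly (the
analogue of assigning NODE_MASK_NONE) before any bits are set.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for MAX_NUMNODES and nodemask_t. */
#define MAX_NODES     1024
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define MASK_WORDS    ((MAX_NODES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct node_mask {
	unsigned long bits[MASK_WORDS];
};

/* Analogue of assigning NODE_MASK_NONE: clear every bit. */
static void mask_clear(struct node_mask *m)
{
	memset(m, 0, sizeof(*m));
}

static void mask_set(struct node_mask *m, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	m->bits[(size_t)nid / BITS_PER_WORD] |= 1UL << ((size_t)nid % BITS_PER_WORD);
}

static int mask_test(const struct node_mask *m, int nid)
{
	return (m->bits[(size_t)nid / BITS_PER_WORD] >> ((size_t)nid % BITS_PER_WORD)) & 1UL;
}

/* Count the nodes present in the mask. */
static int mask_weight(const struct node_mask *m)
{
	int nid, count = 0;

	for (nid = 0; nid < MAX_NODES; nid++)
		count += mask_test(m, nid);
	return count;
}

/* Stack variant: the mask is indeterminate until it is cleared. */
int count_kernel_nodes_stack(const int *nids, int n)
{
	struct node_mask local;
	int i;

	mask_clear(&local);	/* omitting this leaves stale bits in the mask */
	for (i = 0; i < n; i++)
		mask_set(&local, nids[i]);
	return mask_weight(&local);
}

/* Static variant: C zero-initializes static storage, and it uses no stack. */
static struct node_mask static_mask;

int count_kernel_nodes_static(const int *nids, int n)
{
	int i;

	for (i = 0; i < n; i++)
		mask_set(&static_mask, nids[i]);
	return mask_weight(&static_mask);
}
```

Both variants report the same node count when called once with the same
node IDs. Moving the mask to static storage therefore also hides the
missing initialization, which is why the changelog should mention it.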

>
> 	Dave
>
>