lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54F6ADD2.3080403@huawei.com>
Date:	Wed, 4 Mar 2015 15:01:38 +0800
From:	Xishi Qiu <qiuxishi@...wei.com>
To:	Gu Zheng <guz.fnst@...fujitsu.com>
CC:	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tang Chen <tangchen@...fujitsu.com>,
	Yinghai Lu <yinghai@...nel.org>, Linux MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Toshi Kani <toshi.kani@...com>, Mel Gorman <mgorman@...e.de>,
	Tejun Heo <tj@...nel.org>, Xiexiuqi <xiexiuqi@...wei.com>,
	Hanjun Guo <guohanjun@...wei.com>
Subject: Re: node-hotplug: is memset 0 safe in try_offline_node()?

On 2015/3/4 11:53, Gu Zheng wrote:

> Hi Xishi,
> 
> On 03/04/2015 10:22 AM, Xishi Qiu wrote:
> 
>> On 2015/3/3 18:20, Gu Zheng wrote:
>>
>>> Hi Xishi,
>>> On 03/03/2015 11:30 AM, Xishi Qiu wrote:
>>>
>>>> When hot-remove a numa node, we will clear pgdat,
>>>> but is memset 0 safe in try_offline_node()?
>>>
>>> It is not safe here. In fact, this is a temporary solution here.
>>> As you know, pgdat is accessed lock-less now, so protection
>>> mechanism (RCU?) is needed to make it completely safe here,
>>> but it seems a bit over-kill.
>>>
>>>>
>>>> process A:			offline node XX:
>>>> for_each_populated_zone()
>>>> find online node XX
>>>> cond_resched()
>>>> 				offline cpu and memory, then try_offline_node()
>>>> 				node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
>>>> access node XX's pgdat
>>>> NULL pointer access error
>>>
>>> It's possible, but I did not meet this condition, did you?
>>>
>>
>> Yes, we test hot-add/hot-remove node with stress, and meet the following
>> call trace several times.
> 
> Thanks.
> 
>>
>> 	next_online_pgdat()
>> 		int nid = next_online_node(pgdat->node_id);  // it's here, pgdat is NULL
> 
> 	memset(pgdat, 0, sizeof(*pgdat));
> This memset just sets the context of pgdat to 0, but it will not free pgdat, so the *pgdat is
> NULL* is strange here.
> But anyway, the bug is real, we must fix it.

next_zone()
	pg_data_t *pgdat = zone->zone_pgdat;  // I think this pgdat is NULL, and NODE_DATA() is not NULL.
	...
	pgdat = next_online_pgdat(pgdat);
		int nid = next_online_node(pgdat->node_id);  // so here is the null pointer access

Thanks for your new patch, I'll test it.

Thanks,
Xishi Qiu

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ