Message-ID: <503DE262.20006@gmail.com>
Date:	Wed, 29 Aug 2012 17:35:30 +0800
From:	qiuxishi <qiuxishi@...il.com>
To:	Minchan Kim <minchan@...nel.org>
CC:	Jiang Liu <jiang.liu@...wei.com>, akpm@...ux-foundation.org,
	lliubbo@...il.com, mgorman@...e.de, kamezawa.hiroyu@...fujitsu.com,
	mhocko@...e.cz, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	qiuxishi@...wei.com, wujianguo@...wei.com, bessel.wang@...wei.com,
	guohanjun@...wei.com, chenkeping@...wei.com, yinghai@...nel.org,
	wency@...fujitsu.com
Subject: [PATCH] memory-hotplug: fix a drain pcp bug when offlining pages

On 2012-8-22 16:37, Minchan Kim wrote:
> Hi Jiang,
> 
> On Wed, Aug 22, 2012 at 04:30:09PM +0800, Jiang Liu wrote:
>> On 2012-8-22 16:14, Minchan Kim wrote:
>>> On Wed, Aug 22, 2012 at 03:57:45PM +0800, qiuxishi wrote:
>>>> On 2012-8-22 11:34, Minchan Kim wrote:
>>>>> Hello Xishi,
>>>>>
>>>>> On Tue, Aug 21, 2012 at 08:12:05PM +0800, qiuxishi wrote:
>>>>>> From: Xishi Qiu <qiuxishi@...wei.com>
>>>>>>
>>>>>> When offlining a section, we first move all the free pages and the pcp pages onto the MIGRATE_ISOLATE list:
>>>>>> start_isolate_page_range()
>>>>>> 	set_migratetype_isolate()
>>>>>> 		drain_all_pages()
>>>>>>
>>>>>> Here is the problem: it is not guaranteed that the pcp pages end up on the MIGRATE_ISOLATE list. They may
>>>>>> be moved onto the MIGRATE_MOVABLE list instead, because page_private() may still be 2 (MIGRATE_MOVABLE). So
>>>>>> when page migration finishes, the free pages that came from the pcp may be allocated again, and check_pages_isolated() fails.
>>>>>> drain_all_pages()
>>>>>> 	drain_local_pages()
>>>>>> 		drain_pages()
>>>>>> 			free_pcppages_bulk()
>>>>>> 				__free_one_page(page, zone, 0, page_private(page));
>>>>>>
>>>>>> Even if we add move_freepages_block() after drain_all_pages(), it is still not guaranteed that all the
>>>>>> pcp pages will be moved onto the MIGRATE_ISOLATE list when the system is under high load. The free
>>>>>> pages that came from the pcp may immediately be allocated again.
>>>>>>
>>>>>> I think a similar bug is described at http://marc.info/?t=134250882300003&r=1&w=2
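
(For reference, a condensed sketch of the path above, loosely based on
mm/page_alloc.c of this period; everything except the relevant lines is
stripped out, so treat it as an illustration rather than the exact code:)

	/* When a page is freed to the per-cpu list, its migratetype is
	 * cached in page_private() and can go stale afterwards. */
	void free_hot_cold_page(struct page *page, int cold)
	{
		int migratetype = get_pageblock_migratetype(page);

		set_page_private(page, migratetype);
		/* ... page is added to this CPU's pcp list ... */
	}

	/* Draining later trusts that cached value, so a pcp page whose
	 * private is still MIGRATE_MOVABLE (2) goes back onto the MOVABLE
	 * free list even though set_migratetype_isolate() has just marked
	 * the whole pageblock MIGRATE_ISOLATE. */
	static void free_pcppages_bulk(struct zone *zone, int count,
				       struct per_cpu_pages *pcp)
	{
		struct page *page;

		/* ... for each page taken off the pcp lists ... */
		__free_one_page(page, zone, 0, page_private(page));
	}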
>>>>>
>>>>> Yes. I reported the problem a few months ago, but it wasn't a real bug hit in practice;
>>>>> I spotted it by eye while reading the code, so I wanted to confirm the problem.
>>>>>
>>>>> Did you hit that problem in real practice, or just during code review?
>>>>>
>>>>
>>>> I used /sys/devices/system/memory/soft_offline_page to offline a lot of pages while the
>>>> system was under high load, and then I found some unknown zero refcount pages, such as:
>>>> get_any_page: 0x650422: unknown zero refcount page type 19400c00000000
>>>> get_any_page: 0x650867: unknown zero refcount page type 19400c00000000
>>>>
>>>> soft_offline_page()
>>>> 	get_any_page()
>>>> 		set_migratetype_isolate()
>>>> 			drain_all_pages()
>>>>
>>>> I think that after drain_all_pages(), the pcp pages are moved onto the MIGRATE_MOVABLE list, which is
>>>> managed by the buddy allocator, but they are allocated and become pcp pages again while the system is
>>>> under high load. This problem does not occur once this patch is applied.
>>>>
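
(A minimal userspace reproducer sketch of the trigger described above;
the PFN value follows the log lines only as an example, and separate
processes are assumed to keep the system under memory load:)

	/* Hypothetical reproducer sketch: soft-offline a single page by
	 * physical address through the sysfs file mentioned above.  Must
	 * be run as root while other tasks generate memory pressure. */
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		unsigned long long pfn = 0x650422ULL;	/* example PFN, as in the log */
		FILE *f = fopen("/sys/devices/system/memory/soft_offline_page", "w");

		if (!f) {
			perror("soft_offline_page");
			return EXIT_FAILURE;
		}
		/* The file expects a physical address, i.e. pfn << PAGE_SHIFT
		 * (4K pages assumed here). */
		fprintf(f, "0x%llx\n", pfn << 12);
		fclose(f);
		return EXIT_SUCCESS;
	}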
>>>>> Anyway, I don't like your approach, which I already considered, because it hurts the hotpath
>>>>> while the race is really unlikely. get_pageblock_migratetype() is never trivial.
>>>>> We should avoid the overhead in the hotpath and move the cost into memory-hotplug itself.
>>>>> Did you see my patch at https://patchwork.kernel.org/patch/1225081/ ?
>>>>
>>>> Yes, you are right; I will try to find another way to fix this problem.
>>>> How about doing this work in set_migratetype_isolate(): find the pcp pages and change their private
>>>> value to get_pageblock_migratetype(page)?
>>>>
>>>
>>> The allocator doesn't take any lock when it allocates a page from the pcp.
>>> How could you prevent the race between the allocator and the memory-hotplug
>>> routine (i.e. set_migratetype_isolate()) without hurting the hotpath?
>> Hi Minchan,
>> 	I have thought about using a jump label in the hotpath, which wouldn't cause a big
>> performance drop, but it seems a little dirty. What are your thoughts?
> 
> I don't know the static_key_false internals well.
> Questions:
> 
> 1. Is it implemented by all archs?
> 2. How does it work? Is its overhead really near zero on all archs?
> 3. Do we really have no solution other than hacking the hotpath
>    (i.e. order-0 page allocation)?
> 4. Please see my solution at the above URL. Does it have any problem?
> 

Hi Minchan,

Yes, your patch does resolve this problem: it reports the failure from
__test_page_isolated_in_pageblock(), so the memory offline will fail.

My patch resolves this problem too: it drains the pcp pages onto the MIGRATE_ISOLATE
list, so the memory offline will succeed, but it causes a big performance drop.

I think Gerry's method looks fine.

Thanks
Xishi Qiu

>>
>> 	migrate_type = page_private(page);
>> 	if (static_key_false(&memory_hotplug_inprogress))
>> 		migrate_type = get_pageblock_migratetype(page);
>> 	__free_one_page(page, zone, 0, migrate_type);
>>
>> 	Regards!
>> 	Gerry
>>
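
(For reference, a minimal sketch of the jump-label pattern being discussed;
the key name follows Gerry's snippet above, but the begin/end call sites are
only my assumptions about where the key would be flipped, not part of any
posted patch:)

	#include <linux/jump_label.h>

	/* Hypothetical: declared somewhere in mm/, default off. */
	static struct static_key memory_hotplug_inprogress = STATIC_KEY_INIT_FALSE;

	/* Assumed call sites in the memory-hotplug path: enable the key only
	 * while an offline/isolation operation is in flight. */
	void memory_offline_begin(void)
	{
		static_key_slow_inc(&memory_hotplug_inprogress);
	}

	void memory_offline_end(void)
	{
		static_key_slow_dec(&memory_hotplug_inprogress);
	}

	/* In the hotpath, static_key_false() compiles to a branch that is
	 * patched at runtime, so get_pageblock_migratetype() is only called
	 * while a hotplug operation is actually running, as in the quoted
	 * snippet above. */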
> 
