lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 16 Jul 2014 13:14:26 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Rik van Riel <riel@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <hannes@...xchg.org>,
	Minchan Kim <minchan@...nel.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Zhang Yanfei <zhangyanfei@...fujitsu.com>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	Tang Chen <tangchen@...fujitsu.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	Wen Congyang <wency@...fujitsu.com>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	Michal Nazarewicz <mina86@...a86.com>,
	Laura Abbott <lauraa@...eaurora.org>,
	Heesub Shin <heesub.shin@...sung.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Ritesh Harjani <ritesh.list@...il.com>,
	t.stanislaws@...sung.com, Gioh Kim <gioh.kim@....com>,
	linux-mm@...ck.org, Lisa Du <cldu@...vell.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/10] fix freepage count problems due to memory isolation

On 07/16/2014 10:43 AM, Joonsoo Kim wrote:
>> I think your plan of multiple parallel CMA allocations (and thus
>> multiple parallel isolations) is also possible. The isolate pcplists
>> can be shared by pages coming from multiple parallel isolations. But
>> the flush operation needs a pfn start/end parameters to only flush
>> pages belonging to the given isolation. That might mean a bit of
>> inefficient list traversing, but I don't think it's a problem.
> 
> I think that special pcplist would cause a problem if we should check
> pfn range. If there are too many pages on this pcplist, move pages from
> this pcplist to isolate freelist takes too long time in irq context and
> system could be broken. This operation cannot be easily stopped because
> it is initiated by IPI on other cpu and starter of this IPI expect that
> all pages on other cpus' pcplist are moved properly when returning
> from on_each_cpu().
> 
> And, if there are so many pages, serious lock contention would happen
> in this case.

Hm I see. So what if it wasn't a special pcplist, but a special "free list"
where the pages would be just linked together as on pcplist, regardless of
order, and would not merge until the CPU that drives the memory isolation
process decides it is safe to flush them away. That would remove the need for
IPI's and provide the same guarantees I think.

> Anyway, my idea's key point is using PageIsolated() to distinguish
> isolated page, instead of using PageBuddy(). If page is PageIsolated(),

Is PageIsolated a completely new page flag? Those are a limited resource so I
would expect some resistance to such approach. Or a new special page->_mapcount
value? That could maybe work.

> it isn't handled as freepage although it is in buddy allocator. During free,
> page with MIGRATETYPE_ISOLATE will be marked as PageIsolated() and
> won't be merged and counted for freepage.

OK. Preventing wrong merging is the key point and this should work.

> When we move pages from normal buddy list to isolate buddy
> list, we check PageBuddy() and subtract number of PageBuddy() pages

Do we really need to check PageBuddy()? Could a page get marked as PageIsolate()
but still go to normal list instead of isolate list?

> from number of freepage. And, change page from PageBuddy() to PageIsolated()
> since it is handled as isolated page at this point. In this way, freepage
> count will be correct.
> 
> Unisolation can be done by similar approach.
> 
> I made prototype of this approach and it isn't intrusive to core
> allocator compared to my previous patchset.
> 
> Make sense?

I think so :)

> Thanks.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ