Message-ID: <20121123044250.GG5121@bbox>
Date:	Fri, 23 Nov 2012 13:42:50 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Marek Szyprowski <m.szyprowski@...sung.com>
Cc:	linux-mm@...ck.org, linaro-mm-sig@...ts.linaro.org,
	linux-kernel@...r.kernel.org,
	Kyungmin Park <kyungmin.park@...sung.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mel@....ul.ie>,
	Michal Nazarewicz <mina86@...a86.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
Subject: Re: [PATCH] mm: cma: allocate pages from CMA if NR_FREE_PAGES
 approaches low water mark

Hi Marek,

On Wed, Nov 21, 2012 at 04:50:45PM +0100, Marek Szyprowski wrote:
> Hello,
> 
> On 11/21/2012 2:05 AM, Minchan Kim wrote:
> >On Tue, Nov 20, 2012 at 03:49:35PM +0100, Marek Szyprowski wrote:
> >> Hello,
> >>
> >> On 11/20/2012 1:01 AM, Minchan Kim wrote:
> >> >Hi Marek,
> >> >
> >> >On Mon, Nov 12, 2012 at 09:59:42AM +0100, Marek Szyprowski wrote:
> >> >> It has been observed that system tends to keep a lot of CMA free pages
> >> >> even in very high memory pressure use cases. The CMA fallback for movable
> >> >
> >> >CMA free pages are just a fallback for movable pages, so if the user
> >> >requires many user pages, the system ends up consuming CMA free pages
> >> >once it runs out of movable pages. What do you mean by saying that the
> >> >system tends to keep free pages even under very high memory pressure?
> >> >> pages is used very rarely, only when system is completely pruned from
> >> >> MOVABLE pages, which usually means that the out-of-memory event will be
> >> >> triggered very soon. To avoid such situation and make better use of CMA
> >> >
> >> >Why is OOM triggered very soon if movable pages are used up while
> >> >there are many CMA pages?
> >> >
> >> >It seems I can't quite understand your point.
> >> >Please state your problem clearly so that I can understand it.
> >>
> >> Right now running out of 'plain' movable pages is the only possibility to
> >> get movable pages allocated from CMA. On the other hand running out of
> >> 'plain' movable pages is very deadly for the system, as movable pageblocks
> >> are also the main fallbacks for reclaimable and non-movable pages.
> >>
> >> Then, once we run out of movable pages and the kernel needs a non-movable
> >> or reclaimable page (which happens quite often), it usually triggers OOM to
> >> satisfy the memory needs. Such an OOM is very strange, especially on a
> >> system with dozens of megabytes of CMA memory, most of it free at the OOM
> >> event. By high memory pressure I mean high memory usage.
> >
> >So your concern is that having too many free pages in MIGRATE_CMA when OOM
> >happens is odd? It's natural considering the CMA design, in which the kernel
> >never falls back to the CMA area for non-movable page allocation. I guess
> >that's not your concern.
> 
> My concern is how to minimize memory waste with CMA.
> 
> >Let's think below extreme cases.
> >
> >= Before =
> >
> >* 1000M DRAM system.
> >* 400M kernel used pages.
> >* 300M movable used pages.
> >* 300M cma freed pages.
> >
> >1. kernel want to request 400M non-movable memory, additionally.
> >2. VM start to reclaim 300M movable pages.
> >3. But it's not enough to meet 400M request.
> >4. go to OOM. (It's natural)
> >
> >= After(with your patch) =
> >
> >* 1000M DRAM system.
> >* 400M kernel used pages.
> >* 300M movable *freed* pages.
> >* 300M cma used pages(by your patch, I simplified your concept)
> >
> >1. kernel want to request 400M non-movable memory.
> >2. 300M movable freed pages isn't enough to meet 400M request.
> >3. Also, there is no point to reclaim CMA pages for non-movable allocation.
> >4. go to OOM. (It's natural)
> >
> >There is no difference between before and after in allocation POV.
> >Let's think another example.
> >
> >= Before =
> >
> >* 1000M DRAM system.
> >* 400M kernel used pages.
> >* 300M movable used pages.
> >* 300M cma freed pages.
> >
> >1. kernel want to request 300M non-movable memory.
> >2. VM start to reclaim 300M movable pages.
> >3. It's enough to meet 300M request.
> >4. happy end
> >
> >= After(with your patch) =
> >
> >* 1000M DRAM system.
> >* 400M kernel used pages.
> >* 300M movable *freed* pages.
> >* 300M cma used pages(by your patch, I simplified your concept)
> >
> >1. kernel want to request 300M non-movable memory.
> >2. 300M movable freed pages is enough to meet 300M request.
> >3. happy end.
> >
> >There is no difference in allocation POV, too.
> 
> Those cases are just theoretical, not real-life examples. In the real world
> kernel allocates (and frees) non-movable memory in small portions while
> system is running. Typically keeping some amount of free 'plain' movable
> pages is enough to make kernel happy about any kind of allocations
> (especially non-movable). This requirement is in complete contrast to the
> current fallback mechanism, which activates only when kernel runs out of
> movable pages completely.
> 
> >So I guess that if you see OOM while there are many movable pages,
> >I think principal problem is VM reclaimer which should try to reclaim
> >best effort if there are freeable movable pages. If VM reclaimer has
> >some problem for your workload, firstly we should try fix it rather than
> >adding such heuristic to hot path. Otherwise, if you see OOM while there
> >are many free CMA pages, it's not odd to me.
> 
> Frankly I don't see how reclaim procedure can ensure that it will be
> always possible to allocate non-movable pages with current fallback
> mechanism, which is used only when kernel runs out of pages of a given
> type. Could you explain how would You like to change the reclaim
> procedure to avoid the above situation?
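
The quoted before/after scenarios can be checked with a small model. This is
only a sketch under the assumptions of those examples: sizes are in megabytes,
every in-use movable page is reclaimable, and non-movable allocations never
fall back to CMA. The names Zone and can_satisfy_nonmovable are hypothetical
and are not kernel APIs.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Page counts in MB for the simplified example system."""
    movable_used: int
    movable_free: int
    cma_used: int
    cma_free: int


def can_satisfy_nonmovable(zone, request_mb):
    # A non-movable request can be served from free MIGRATE_MOVABLE pages
    # plus whatever reclaim can free there. CMA pages never count.
    available = zone.movable_free + zone.movable_used
    return request_mb <= available


# "Before": movable pages are in use, CMA is free.
before = Zone(movable_used=300, movable_free=0, cma_used=0, cma_free=300)
# "After": user pages were placed in CMA first, movable pages stay free.
after = Zone(movable_used=0, movable_free=300, cma_used=300, cma_free=0)

results = {
    req: (can_satisfy_nonmovable(before, req), can_satisfy_nonmovable(after, req))
    for req in (400, 300)
}
```

In this model the 400M request fails and the 300M request succeeds in both
layouts, which is the "no difference in allocation POV" point made in the
quoted examples.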

What I have in mind is as follows.

1. Reclaimer should migrate MIGRATE_MOVABLE pages into MIGRATE_CMA
   if there is free space in MIGRATE_CMA, so the VM could allocate
   non-movable pages with the MIGRATE_MOVABLE fallback.

2. Reclaimer should consider non-movable page allocation.
   I mean the reclaimer can reclaim MIGRATE_CMA pages when memory pressure is
   caused by a request for non-movable pages, but that is useless, and such
   unnecessary reclaim hurts performance. So the reclaimer should reclaim only
   the target pages (i.e., MIGRATE_MOVABLE).

3. If reclaim fails for some reason (e.g., the pages are in the working set),
   we should reclaim MIGRATE_CMA pages and migrate MIGRATE_MOVABLE pages into
   MIGRATE_CMA, so the kernel allocation would succeed.

The above migration scheme is important for embedded systems that don't have
swap, because they are limited in how many anonymous pages in MIGRATE_MOVABLE
they can reclaim.
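
The three steps above can be sketched as a toy model. This is not kernel code
and not an implementation of the proposal: Pools, migrate_to_cma and
alloc_nonmovable are hypothetical names, sizes are in MB, "hot" stands for
working-set pages that cannot be reclaimed (e.g. anonymous pages without
swap), and migrating hot pages before cold ones is my own simplification.

```python
from dataclasses import dataclass


@dataclass
class Pools:
    movable_free: int = 0
    movable_cold: int = 0   # in-use MIGRATE_MOVABLE pages that can be reclaimed
    movable_hot: int = 0    # in-use MIGRATE_MOVABLE pages in the working set
    cma_free: int = 0
    cma_cold: int = 0       # in-use MIGRATE_CMA pages that can be reclaimed
    cma_hot: int = 0        # in-use MIGRATE_CMA pages in the working set


def migrate_to_cma(p, want):
    """Move up to `want` MB of in-use MOVABLE pages into free CMA space."""
    hot = min(want, p.movable_hot, p.cma_free)
    p.movable_hot -= hot
    p.cma_hot += hot
    p.cma_free -= hot
    cold = min(want - hot, p.movable_cold, p.cma_free)
    p.movable_cold -= cold
    p.cma_cold += cold
    p.cma_free -= cold
    p.movable_free += hot + cold


def alloc_nonmovable(p, request):
    """Try to serve a non-movable request from MIGRATE_MOVABLE space.

    Returns True on success. On failure the pools stay in the state that
    migration and reclaim left them in.
    """
    def need():
        return max(0, request - p.movable_free)

    # Step 1: migrate MOVABLE pages into free CMA space.
    migrate_to_cma(p, need())

    # Step 2: reclaim only the target pages (MIGRATE_MOVABLE), not CMA.
    n = min(need(), p.movable_cold)
    p.movable_cold -= n
    p.movable_free += n

    # Step 3: if that was not enough, reclaim CMA pages and migrate the
    # remaining MOVABLE working set into the space this frees.
    n = min(need(), p.cma_cold, p.movable_hot)
    p.cma_cold -= n
    p.cma_free += n
    migrate_to_cma(p, need())

    if need() > 0:
        return False
    p.movable_free -= request
    return True


# No-swap system: the movable area is full of working-set pages, CMA is free.
no_swap = Pools(movable_hot=200, cma_free=300)
ok_no_swap = alloc_nonmovable(no_swap, 100)

# CMA is full of reclaimable pages, so step 3 is needed.
cma_full = Pools(movable_hot=100, movable_cold=50, cma_cold=300)
ok_cma_full = alloc_nonmovable(cma_full, 120)
```

In the first case step 1 alone makes room for the request. In the second case
step 2 frees 50M and step 3 has to reclaim 70M from CMA before migration can
free the rest.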

Will take a look when I have time.
-- 
Kind regards,
Minchan Kim