Date:	Thu, 15 May 2014 11:10:55 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc:	Marek Szyprowski <m.szyprowski@...sung.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	Laura Abbott <lauraa@...eaurora.org>,
	Minchan Kim <minchan@...nel.org>,
	Heesub Shin <heesub.shin@...sung.com>,
	Michal Nazarewicz <mina86@...a86.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Kyungmin Park <kyungmin.park@...sung.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	'Tomasz Stanislawski' <t.stanislaws@...sung.com>
Subject: Re: [RFC PATCH 0/3] Aggressively allocate the pages on cma reserved
 memory

On Wed, May 14, 2014 at 03:14:30PM +0530, Aneesh Kumar K.V wrote:
> Joonsoo Kim <iamjoonsoo.kim@....com> writes:
> 
> > On Fri, May 09, 2014 at 02:39:20PM +0200, Marek Szyprowski wrote:
> >> Hello,
> >> 
> >> On 2014-05-08 02:32, Joonsoo Kim wrote:
> >> >This series tries to improve CMA.
> >> >
> >> >CMA was introduced to provide physically contiguous pages at runtime
> >> >without permanently reserving a memory area. But the current implementation
> >> >behaves like a reserved-memory approach, because allocation from the cma
> >> >reserved region only happens as a fallback for MIGRATE_MOVABLE allocations,
> >> >i.e. when there are no other movable pages left. In that situation kswapd
> >> >is woken easily, since unmovable and reclaimable allocations treat
> >> >(free pages - free CMA pages) as the free memory on the system, and that
> >> >value may be below the high watermark. Once kswapd starts reclaiming
> >> >memory, the fallback allocation rarely happens.
> >> >
> >> >In my experiment, I found that on a system with 1024 MB of memory and
> >> >512 MB reserved for CMA, kswapd is mostly invoked around the 512 MB
> >> >free memory boundary. The invoked kswapd keeps reclaiming until
> >> >(free pages - free CMA pages) is higher than the high watermark, so the
> >> >free memory reported in meminfo consistently hovers around the 512 MB boundary.
> >> >
> >> >To fix this problem, we should allocate pages from cma reserved memory
> >> >more aggressively and intelligently. Patch 2 implements the solution.
> >> >Patch 1 is a simple optimization that removes a useless retry, and patch 3
> >> >removes a useless alloc flag, so these two are less important.
> >> >See patch 2 for a more detailed description.
> >> >
> >> >This patchset is based on v3.15-rc4.
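
(For reference, the behaviour described above comes from the watermark check
in __zone_watermark_ok() in mm/page_alloc.c: unmovable and reclaimable
allocations do not carry ALLOC_CMA, so the free CMA pages are subtracted
before comparing against the watermark. Roughly, in the v3.15-era code:

#ifdef CONFIG_CMA
	/* If allocation can't use CMA areas don't use free CMA pages */
	if (!(alloc_flags & ALLOC_CMA))
		free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
#endif

	if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
		return false;

With 512 MB of a zone's free memory sitting in CMA pageblocks, this check can
fail even though plenty of memory is actually free, which is what keeps
waking kswapd.)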
> >> 
> >> Thanks for posting those patches. It basically reminds me of the
> >> following discussion:
> >> http://thread.gmane.org/gmane.linux.kernel/1391989/focus=1399524
> >> 
> >> Your approach is basically the same. I hope that your patches can be
> >> improved in such a way that they will be accepted by the mm maintainers.
> >> I only wonder if the third patch is really necessary. Without it, kswapd
> >> wakeup might still be avoided in some cases.
> >
> > Hello,
> >
> > Oh... I didn't know about that patch and discussion, because I had no
> > interest in CMA at that time. Your approach looks similar to my #1
> > approach and could have the same problem as #1, which I mentioned in
> > patch 2/3. Please refer to that patch description. :)
> 
> IIUC that patch also interleaves, right?
> 
> +#ifdef CONFIG_CMA
> +	unsigned long nr_free = zone_page_state(zone, NR_FREE_PAGES);
> +	unsigned long nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);
> +
> +	if (migratetype == MIGRATE_MOVABLE && nr_cma_free &&
> +	    nr_free - nr_cma_free < 2 * low_wmark_pages(zone))
> +		migratetype = MIGRATE_CMA;
> +#endif /* CONFIG_CMA */

Hello,

This is not interleaving from my point of view. This logic will allocate
free movable pages until hitting 2 * low_wmark, and only then allocate free
cma pages. The interleaving I mean is something like a round-robin policy
with no constraint like the one above.
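
A rough sketch of the round-robin idea, using a hypothetical per-zone toggle
(this is only to illustrate what I mean by interleaving, not the actual
patch code):

static struct page *rmqueue_round_robin(struct zone *zone, unsigned int order)
{
	struct page *page = NULL;

	if (zone->alloc_cma_next) {		/* hypothetical toggle flag */
		/* this turn goes to the CMA free lists */
		page = __rmqueue_smallest(zone, order, MIGRATE_CMA);
		zone->alloc_cma_next = false;
	}
	if (!page) {
		/* other turns (or an empty CMA area) use the movable lists */
		page = __rmqueue(zone, order, MIGRATE_MOVABLE);
		zone->alloc_cma_next = true;
	}
	return page;
}

There is no watermark condition at all; the two kinds of free lists are
simply used in turn.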

> 
> That doesn't always prefer the CMA region. It would be nice to
> understand why grouping in pageblock_nr_pages units is beneficial. Also, in
> your patch you decrement nr_try_cma by one for every allocation regardless
> of its 'order'. Why?

pageblock_nr_pages is just a magic value with no particular rationale. :)
But we do need grouping, because without it we can't get physically
contiguous pages. When we allocate pages for the page cache, the readahead
logic will try to allocate 32 pages. Without grouping, the disk I/O for
these pages can't be handled by a single I/O request on some devices.
I'm not familiar with I/O devices, so please correct me if I'm wrong.

And, yes, I will take the allocation 'order' into account when
incrementing/decrementing nr_try_cma.
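
Something like the following is what I have in mind for the grouping and the
per-order accounting (illustrative only, not the exact patch code):

	/*
	 * Hand out CMA pages in pageblock-sized batches; in the real logic
	 * the budget is refilled only after the movable side has had its
	 * turn, which is omitted here.
	 */
	if (!zone->nr_try_cma)
		zone->nr_try_cma = pageblock_nr_pages;

and, inside __rmqueue_cma(), account for the request size instead of
decrementing by one:

	if (zone->nr_try_cma >= (1UL << order))
		zone->nr_try_cma -= 1UL << order;
	else
		zone->nr_try_cma = 0;

That way a burst of order-0 allocations, such as a 32-page readahead window,
tends to be served from one contiguous CMA pageblock before we switch back
to the normal movable free lists.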

> 
> +	if (zone->nr_try_cma) {
> +		/* Okay. Now, we can try to allocate the page from cma region */
> +		zone->nr_try_cma--;
> +		page = __rmqueue_smallest(zone, order, MIGRATE_CMA);
> +
> +		/* CMA pages can vanish through CMA allocation */
> +		if (unlikely(!page && order == 0))
> +			zone->nr_try_cma = 0;
> +
> +		return page;
> +	}
> 
> 
> If the MIGRATE_CMA allocation above fails, should we return failure? Why
> not try a MOVABLE allocation on failure (i.e. fall through the code path)?

This patch does use fallthrough logic. If __rmqueue_cma() fails, the
allocation falls back to __rmqueue() as usual.
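
Roughly, the caller side looks like this (simplified; the patch has a few
more details around it):

	struct page *page = NULL;

#ifdef CONFIG_CMA
	if (migratetype == MIGRATE_MOVABLE)
		page = __rmqueue_cma(zone, order);	/* may return NULL */
#endif
	/* fall through to the normal buddy path on failure */
	if (!page)
		page = __rmqueue(zone, order, migratetype);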

Thanks.

