Date:	Mon, 9 Jun 2014 02:06:17 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Vlastimil Babka <vbabka@...e.cz>
cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan@...nel.org>, Mel Gorman <mgorman@...e.de>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Michal Nazarewicz <mina86@...a86.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH 6/6] mm, compaction: don't migrate in blocks that
 cannot be fully compacted in async direct compaction

On Fri, 6 Jun 2014, Vlastimil Babka wrote:

> > Agreed.  I was thinking higher than 1GB would be possible once we have 
> > your series that does the pageblock skip for thp; I think the expense 
> > would be constant because we won't needlessly migrate pages unless 
> > there's a good chance of succeeding.
> 
> Looks like a counter of iterations actually done in the scanners, maintained
> in compact_control, would work better than any memory-size-based limit? It
> could better reflect the actual work done, and thus the latency. Maybe also
> increase the counter for migrations, with a higher cost than for a scanner
> iteration.
> 

I'm not sure we can expose that to be configurable by userspace in any 
meaningful way.  We'll want to be able to tune this depending on the size 
of the machine if we are to truly remove the need_resched() heuristic and 
give it a sane default.  I was thinking it would be similar to 
khugepaged's pages_to_scan value that it uses on each wakeup.
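To make the idea concrete, here is a minimal userspace sketch of the work counter being discussed (this is not the actual kernel code; the field names, the MIGRATE_COST weight, and the budget tunable are all assumptions for illustration). Scanner iterations cost 1, migrations cost more, and compaction gives up once the budget, which could default based on machine size much like khugepaged's pages_to_scan, is exhausted:

```c
#include <assert.h>
#include <stdbool.h>

#define MIGRATE_COST 16          /* assumed relative cost of one migration */

/* Hypothetical fields added to compact_control for illustration. */
struct compact_control {
	unsigned long work_done;     /* scanner iterations + weighted migrations */
	unsigned long work_budget;   /* tunable cap, cf. khugepaged's pages_to_scan */
};

/* Each page examined by the migrate/free scanners counts as one unit. */
static void account_scan(struct compact_control *cc, unsigned long pages)
{
	cc->work_done += pages;
}

/* Migrating a page is weighted higher than merely scanning it. */
static void account_migrate(struct compact_control *cc, unsigned long pages)
{
	cc->work_done += pages * MIGRATE_COST;
}

/* Abort once the latency budget has been spent. */
static bool compact_should_abort(const struct compact_control *cc)
{
	return cc->work_done >= cc->work_budget;
}
```

A default budget could then be derived from zone size at boot rather than hard-coding a byte limit, which keeps the cap proportional to how much work a given machine can reasonably absorb per compaction attempt.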

> > This does beg the question about parallel direct compactors, though, that 
> > will be contending on the same coarse zone->lru_lock locks and immediately 
> > aborting and falling back to PAGE_SIZE pages for thp faults that will be 
> > more likely if your patch to grab the high-order page and return it to the 
> > page allocator is merged.
> 
> Hm can you explain how the page capturing makes this worse? I don't see it.
> 

I was expecting that your patch to capture the high-order page made a 
difference because passing the zone watermark check doesn't guarantee the 
high-order page will still be allocatable once we return to the page 
allocator; in that case, we terminated compaction prematurely.  If that's 
true, then on a fragmented machine it seems no parallel thp allocator will 
be able to allocate memory that another direct compactor has freed without 
entering compaction itself, which would increase zone->lru_lock contention 
whenever there's migratable memory.

Having 32 cpus fault thp memory, all entering compaction and contending on 
zone->lru_lock (and, currently, aborting because of that contention), is a 
really bad situation.
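The abort-on-contention behavior can be sketched in userspace like this (a hypothetical illustration, not the kernel implementation: a pthread mutex stands in for the zone->lru_lock spinlock, and the function name is made up). Each direct compactor trylocks the lock and, on failure, records the contention and bails out, which is what forces the fault path back to order-0 pages:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for struct zone; only the lock matters for this sketch. */
struct zone {
	pthread_mutex_t lru_lock;
};

/*
 * Returns true if the lock was acquired.  On contention it sets
 * *contended and returns false, and the caller is expected to abort
 * compaction and fall back to a PAGE_SIZE allocation.
 */
static bool compact_trylock(struct zone *z, bool *contended)
{
	if (pthread_mutex_trylock(&z->lru_lock) == 0)
		return true;
	*contended = true;
	return false;
}
```

With 32 cpus all faulting thp at once, most of these trylocks fail, so most compactors abort immediately rather than queueing on the lock, hence the fallback storm described above.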