Message-ID: <20110530143109.GH19505@random.random>
Date:	Mon, 30 May 2011 16:31:09 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Mel Gorman <mel@....ul.ie>
Cc:	akpm@...ux-foundation.org, Ury Stankevich <urykhy@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, stable@...nel.org
Subject: Re: [PATCH] mm: compaction: Abort compaction if too many pages are
 isolated and caller is asynchronous

Hi Mel and everyone,

On Mon, May 30, 2011 at 02:13:00PM +0100, Mel Gorman wrote:
> Asynchronous compaction is used when promoting to huge pages. This is
> all very nice but if there are a number of processes in compacting
> memory, a large number of pages can be isolated. An "asynchronous"
> process can stall for long periods of time as a result with a user
> reporting that firefox can stall for 10s of seconds. This patch aborts
> asynchronous compaction if too many pages are isolated as it's better to
> fail a hugepage promotion than stall a process.
> 
> If accepted, this should also be considered for 2.6.39-stable. It should
> also be considered for 2.6.38-stable but ideally [11bc82d6: mm:
> compaction: Use async migration for __GFP_NO_KSWAPD and enforce no
> writeback] would be applied to 2.6.38 before consideration.

Is this supposed to fix the stall with khugepaged in D state and other
processes in D state?

zoneinfo showed nr_isolated_file = -1. I don't think that meant
compaction really had 4G pages isolated, considering the value only
moves between -1, 0 and 1. So I'm unsure this fix can be right if the
problem is the reported hang with khugepaged in D state. So far that
looked more like a bug in the vmstat accounting of nr_isolated_file
with PREEMPT=y, one that trips too_many_isolated in both vmscan.c and
compaction.c. Or are you fixing a different problem?

Or how do you explain this -1 value in nr_isolated_file? Clearly when
that value goes to -1, compaction.c:too_many_isolated will hang. I
think we should fix the -1 value before worrying about the rest...

grep nr_isolated_file zoneinfo-khugepaged 
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
    nr_isolated_file 1
    nr_isolated_file 4294967295
    nr_isolated_file 0
