Date:	Tue, 15 Nov 2011 10:13:13 +0000
From:	Mel Gorman <mgorman@...e.de>
To:	Dave Jones <davej@...hat.com>,
	Johannes Weiner <jweiner@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Jan Kara <jack@...e.cz>, Andy Isaacson <adi@...apodia.org>,
	linux-kernel@...r.kernel.org, linux-mm@...r.kernel.org,
	kernel-team@...oraproject.org
Subject: Re: long sleep_on_page delays writing to slow storage

On Mon, Nov 14, 2011 at 01:47:17PM -0500, Dave Jones wrote:
> On Thu, Nov 10, 2011 at 10:34:42AM +0100, Johannes Weiner wrote:
>  
>  > > I wonder if a change like this would be enough?
>  > > 
>  > >        sync_migration = !(gfp_mask & __GFP_NO_KSWAPD);
>  > > 
>  > > But even if hidden in a new function, the main downside overall is the
>  > > fact we'll pass one more var through the stack of fast paths.
>  > > 
>  > > Johannes I recall you reported this too and Mel suggested the above
>  > > change, did it help in the end?
>  > 
>  > Yes, it completely fixed the latency problem.
>  > 
>  > That said, I haven't looked at the impact on the THP success rate, but
>  > a regression there is probably less severe than half-minute-stalls in
>  > interactive applications.
> 
> FWIW, we've had a few reports from Fedora users since we moved to 3.x kernels
> about similar problems, so whatever the fix is for this should probably
> go to stable too.
> 

Agreed. I made note of that when I sent a smaller patch to Andrew so
that it would be picked up by distros.

> I could push an update for Fedora users to test the change above if
> that would be helpful ?
> 

It would be helpful if you could pick up the patch at
https://lkml.org/lkml/2011/11/10/173, as this is what I expect will
reach -stable eventually. It would be even better if one of the bug
reporters could test before and after that patch and report whether
it fixes their problem.

If they are still experiencing major stalls, I have an experimental
script that may be able to capture stack traces of processes stalled
for more than 1 second. I've had some success with it locally so
maybe they could try it out to identify if it's THP or something else.
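
For readers who want to try something similar, the idea of catching
long stalls can be approximated as below. This is a minimal sketch and
not the script mentioned above. It assumes that stalled tasks show up
in uninterruptible sleep (state D in /proc/<pid>/stat) and that
/proc/<pid>/stack is readable, which normally requires root and a
kernel built with stack trace support. The names parse_state,
StallTracker, read_states, read_stack and main, as well as the 0.1
second polling period, are made up for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: report tasks that stay in state D for too long."""
import os
import time

STALL_SECONDS = 1.0   # threshold taken from the message above
POLL_INTERVAL = 0.1   # assumed polling period, not from the original script


def parse_state(stat_text):
    """Return the one-letter task state from the text of /proc/<pid>/stat."""
    # The comm field is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before reading the state.
    return stat_text.rsplit(")", 1)[1].split()[0]


class StallTracker:
    """Remember when each pid was first observed in state D."""

    def __init__(self, threshold=STALL_SECONDS):
        self.threshold = threshold
        self.first_seen = {}    # pid -> time it was first seen in D
        self.reported = set()   # pids already reported for this stall

    def update(self, states, now):
        """states maps pid -> state letter; return pids newly over threshold."""
        # Forget tasks that left state D or exited since the last poll.
        for pid in list(self.first_seen):
            if states.get(pid) != "D":
                del self.first_seen[pid]
                self.reported.discard(pid)
        stalled = []
        for pid, state in states.items():
            if state != "D":
                continue
            start = self.first_seen.setdefault(pid, now)
            if now - start >= self.threshold and pid not in self.reported:
                self.reported.add(pid)
                stalled.append(pid)
        return stalled


def read_states():
    """Snapshot the state of every process listed in /proc."""
    states = {}
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            with open(f"/proc/{name}/stat") as f:
                states[int(name)] = parse_state(f.read())
        except (OSError, IndexError):
            continue  # the task exited while we were reading it
    return states


def read_stack(pid):
    """Return the kernel stack of pid, or a note if it cannot be read."""
    try:
        with open(f"/proc/{pid}/stack") as f:
            return f.read()
    except OSError as err:
        return f"<stack unavailable: {err}>\n"


def main():
    """Poll forever and print the kernel stack of each newly stalled task."""
    tracker = StallTracker()
    while True:
        for pid in tracker.update(read_states(), time.monotonic()):
            print(f"pid {pid} in state D for >= {tracker.threshold}s")
            print(read_stack(pid))
        time.sleep(POLL_INTERVAL)
```

Calling main() as root starts the polling loop. Because this only
samples task state, it cannot tell one long sleep from several short
ones that happen to line up with the polls, and it only looks at
thread group leaders, so the output should be treated as a hint about
where to look rather than as a measurement.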

-- 
Mel Gorman
SUSE Labs