Message-ID: <alpine.LRH.2.02.1608131323550.3291@file01.intranet.prod.int.rdu2.redhat.com>
Date:	Sat, 13 Aug 2016 13:34:29 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Michal Hocko <mhocko@...nel.org>
cc:	Mel Gorman <mgorman@...e.de>, NeilBrown <neilb@...e.com>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
	"dm-devel@...hat.com David Rientjes" <rientjes@...gle.com>,
	Ondrej Kozina <okozina@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [dm-devel] [RFC PATCH 2/2] mm, mempool: do not throttle
 PF_LESS_THROTTLE tasks



On Fri, 12 Aug 2016, Michal Hocko wrote:

> On Thu 04-08-16 14:49:41, Mikulas Patocka wrote:
> 
> > On Wed, 3 Aug 2016, Michal Hocko wrote:
> > 
> > > But the device congestion is not the only condition required for the
> > > throttling. The pgdat also has to be marked congested which means that the
> > > LRU page scanner bumped into dirty/writeback/pg_reclaim pages at the
> > > tail of the LRU. That should only happen if we are rotating LRUs too
> > > quickly. AFAIU the reclaim shouldn't allow free ticket scanning in that
> > > situation.
> > 
> > The obvious problem here is that mempool allocations should sleep in 
> > mempool_alloc() on &pool->wait (until someone returns some entries into 
> > the mempool), they should not sleep inside the page allocator.
> 
> I agree that mempool_alloc should _primarily_ sleep on its own
> throttling mechanism. I am not questioning that. I am just saying that
> the page allocator has its own throttling which it relies on and that
> cannot be just ignored because that might have other undesirable side
> effects. So if the right approach is really to never throttle certain
> requests then we have to bail out from congested nodes/zones as soon
> as the congestion is detected.
> 
> Now, I would like to see that something like that is _really_ necessary.

Currently, it is not a problem - device mapper reports the device as 
congested only if the underlying physical disks are congested.

But once we change it so that device mapper reports congested state on its 
own (when it has too many bios in progress), this starts being a problem.

I would add PF_NO_THROTTLE or __GFP_NO_THROTTLE to mempool_alloc.

Or - we can prevent the memory reclaim from throttling if we see both 
__GFP_NOMEMALLOC and __GFP_NORETRY - that would be sufficient to detect 
mempool_alloc usage and it wouldn't hurt other __GFP_NORETRY users.
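
As a rough sketch of that second option (plain userspace C, not a patch): 
the bit values and the names sketch_gfp_is_mempool() and 
sketch_self_check() below are made up for illustration, the real flag 
definitions live in include/linux/gfp.h. It only shows the test itself: 
skip throttling when both bits are set, and keep throttling callers that 
set __GFP_NORETRY alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in type and bit values; assumptions for this sketch only. */
typedef unsigned int sketch_gfp_t;

#define SKETCH_GFP_NORETRY    0x1u  /* stands in for __GFP_NORETRY */
#define SKETCH_GFP_NOMEMALLOC 0x2u  /* stands in for __GFP_NOMEMALLOC */
#define SKETCH_GFP_NOWARN     0x4u  /* stands in for __GFP_NOWARN */

/*
 * Hypothetical helper: true only when BOTH bits are set, which is the
 * combination mempool_alloc() adds to its gfp mask.
 */
bool sketch_gfp_is_mempool(sketch_gfp_t gfp_mask)
{
	const sketch_gfp_t both = SKETCH_GFP_NOMEMALLOC | SKETCH_GFP_NORETRY;

	return (gfp_mask & both) == both;
}

/* Small self-check showing which callers would skip throttling. */
void sketch_self_check(void)
{
	/* mempool-style mask: both bits set, so no throttling */
	assert(sketch_gfp_is_mempool(SKETCH_GFP_NOMEMALLOC |
				     SKETCH_GFP_NORETRY |
				     SKETCH_GFP_NOWARN));
	/* ordinary __GFP_NORETRY user: still throttled */
	assert(!sketch_gfp_is_mempool(SKETCH_GFP_NORETRY));
	/* __GFP_NOMEMALLOC alone: still throttled */
	assert(!sketch_gfp_is_mempool(SKETCH_GFP_NOMEMALLOC));
}
```

Where exactly such a check would sit in the reclaim path is left open 
here.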

Mikulas

> I believe that we should simply start with the easier part and get rid of
> throttle_vm_writeout because that seems like a leftover from the past.
> If that turns out unsatisfactory and we have a clear picture when the
> throttling is harmful/suboptimal then we can move on with a more complex
> solution. Does this sound like a way forward?
> 
> -- 
> Michal Hocko
> SUSE Labs
