Message-Id: <20101202093337.1573.A69D9226@jp.fujitsu.com>
Date:	Thu,  2 Dec 2010 11:44:54 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	kosaki.motohiro@...fujitsu.com, Simon Kirby <sim@...tway.ca>,
	Mel Gorman <mel@....ul.ie>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: Free memory never fully used, swapping

> On Wed, 1 Dec 2010, KOSAKI Motohiro wrote:
> 
> > > Specifying a parameter to temporarily override to see if this has the
> > > effect is ok. But this has worked for years now. There must be something
> > > else going on with reclaim that causes these issues now.
> >
> > I don't think this has worked. Simon has found the corner case recently,
> > but it is not new.
> 
> What has worked? If the reduction of the maximum allocation order did not
> have the expected effect of fixing things here then the issue is not
> related to the higher allocations from slub.
> 
> Higher order allocations are not only a slub issue but a general issue for
> various subsystems that require higher order pages. This ranges from jumbo
> frames, to particular needs for certain device drivers, to huge pages.

Sure, yes. However, there is one big difference. Other users certainly need
such high orders, but slub uses high orders only for performance, and that
strategy often shoots us in the foot: it often performs worse than low order.
IOW, slub doesn't always win against slab.


> > So I hope you realize that high order allocation is no free lunch. __GFP_NORETRY
> > makes no sense really. Even though we have compaction, high order reclaim is still
> > a costly operation.
> 
> Sure. There is a tradeoff between reclaim effort and the benefit of higher
> allocations. The costliness of reclaim may have increased with the recent
> changes to the reclaim logic. In fact reclaim gets more and more complex
> over time and there may be subtle bugs in there given the recent flurry of
> changes.

I can't deny that reclaim is really complex. So maybe one of the problems is
that reclaim currently can't tell whether a request is strictly necessary or
just an optimistic try. Also, allocation failures often lead to disaster, so
we have been working on fixing that. But increasing the success rate of high
order allocations can sadly make slub unhappy. Hmm..

So I think we have multiple options:

1) reduce slub_max_order so that slub only uses safe orders
2) slub doesn't invoke reclaim for its high order trial allocations
   (i.e. turn off GFP_WAIT and turn on GFP_NOKSWAPD)
3) slub passes a new hint to reclaim, and reclaim doesn't work so aggressively
   when such a hint is passed.
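
To make (2) concrete, here is a rough userspace sketch of the mask handling I
have in mind. It is not the actual slub code: the type, the flag names and
the bit values below are all made up for illustration (the real definitions
live in include/linux/gfp.h and differ between kernel versions), and the two
helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp_t and flag bits. */
typedef unsigned int gfp_t;

#define GFP_WAIT_BIT     0x10u      /* caller may sleep and enter direct reclaim */
#define GFP_NOWARN_BIT   0x200u     /* do not print a failure warning */
#define GFP_NORETRY_BIT  0x1000u    /* give up instead of retrying */
#define GFP_NOKSWAPD_BIT 0x400000u  /* do not wake kswapd for this request */

/*
 * Mask for the speculative high order attempt described in option (2):
 * never block in reclaim, never wake kswapd, fail quickly and quietly.
 */
static gfp_t opportunistic_gfp(gfp_t flags)
{
	gfp_t gfp = flags | GFP_NOWARN_BIT | GFP_NORETRY_BIT | GFP_NOKSWAPD_BIT;

	gfp &= ~GFP_WAIT_BIT;

	/* Postcondition: the speculative attempt can never sleep. */
	assert(!(gfp & GFP_WAIT_BIT));
	return gfp;
}

/*
 * Choose the mask for one slab page allocation. The minimum order fallback
 * keeps the caller's flags unchanged, so it may still reclaim if allowed.
 */
static gfp_t slab_alloc_gfp(gfp_t flags, bool high_order_try)
{
	return high_order_try ? opportunistic_gfp(flags) : flags;
}
```

With this shape, only the fallback allocation at the minimum order would pay
the reclaim cost, and only when the caller allowed waiting in the first place.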


So I have one question. I thought (2) is the most natural, but slub currently
doesn't do that. Why not? As far as I know, reclaim has never been a
lightweight operation since Linux was born. I'm curious what cost threshold
you assume for slub's high order allocations.



> > I don't think SLUB trying high order allocations is a bad idea, but now
> > the attempt has become more costly. That's bad. I'm also worried that SLUB
> > assumes too high-end a machine. Both SLES and RHEL have now decided not to
> > use SLUB and to use SLAB instead. Now the linux community is fragmented.
> > If you are still interested in SL*B unification, can you please consider
> > joining the corner case smashing activity?
> 
> The problems with higher order reclaim get more difficult with small
> memory sizes yes. We could reduce the maximum order automatically if memory
> is too tight. There is nothing hindering us from tuning the max order
> behavior of slub in a similar way that we now tune the thresholds of the
> vm statistics.

Sounds like a really good idea. :)



> But for that to be done we first need to have some feedback if the changes
> to max order have indeed the desired effect in this corner case.