Message-Id: <1188983603.26438.55.camel@ymzhang>
Date:	Wed, 05 Sep 2007 17:13:23 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Christoph Lameter <clameter@....com>
Cc:	LKML <linux-kernel@...r.kernel.org>, mingo@...e.hu
Subject: Re: tbench regression - Why process scheduler has impact on tbench
	and why small per-cpu slab (SLUB) cache creates the scenario?

On Tue, 2007-09-04 at 23:58 -0700, Christoph Lameter wrote:
> On Wed, 5 Sep 2007, Zhang, Yanmin wrote:
> 
> > On Tue, 2007-09-04 at 20:59 -0700, Christoph Lameter wrote:
> > > On Wed, 5 Sep 2007, Zhang, Yanmin wrote:
> > > 
> > > > 8) kmalloc-4096 order is 1, which means one slab consists of 2 objects. So a
> > > 
> > > You can change that by booting with slub_max_order=0. Then we can also use 
> > > the per-cpu queues to get these order-0 objects, which may speed up 
> > > allocations because we do not have to take zone locks on slab allocation.
> > > 
> > > Note also that Andrew's tree has a page-allocator pass-through in SLUB 
> > > for 4k kmallocs, bypassing slab completely. That may also address the 
> > > issue.
> > > 
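The pass-through idea, as I understand it: a page-sized request gains nothing from slab metadata, so it can go straight to the page allocator. Below is a minimal user-space sketch of that dispatch; my_kmalloc and slab_alloc_stub are illustrative names, not the kernel's API, and the real patch works inside SLUB rather than on top of malloc():

#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL

/* Stand-in for the slab path; the real kernel would use its slab caches. */
static void *slab_alloc_stub(size_t size)
{
	printf("slab path: %zu bytes\n", size);
	return malloc(size);
}

/* Pass-through: page-sized or larger requests skip slab entirely and
 * take whole, page-aligned pages straight from the "page allocator". */
static void *my_kmalloc(size_t size)
{
	if (size >= PAGE_SIZE) {
		void *p;
		/* Round up to whole pages, as the kernel would via get_order(). */
		size_t pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;

		printf("page allocator path: %zu page(s)\n", pages);
		if (posix_memalign(&p, PAGE_SIZE, pages * PAGE_SIZE))
			return NULL;
		return p;
	}
	return slab_alloc_stub(size);
}

int main(void)
{
	free(my_kmalloc(256));	/* takes the slab path */
	free(my_kmalloc(4096));	/* takes the page allocator path */
	return 0;
}
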
> > > If you want SLUB to handle more objects in the 4k kmalloc cache 
> > > without going to the page allocator, then you can boot, for example, with
> > > 
> > > slub_max_order=3 slub_min_objects=8
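To make the order arithmetic concrete: a slab of order n spans PAGE_SIZE << n bytes, so for 4096-byte objects order 0 holds 1 object, order 1 holds 2 (the case in point 8 above), and order 3 holds 8, which is exactly what slub_max_order=3 slub_min_objects=8 permits. A stand-alone sketch (objs_per_slab is illustrative, not a kernel function):

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Objects that fit in one slab of 2^order pages. */
static unsigned long objs_per_slab(unsigned int order, unsigned long objsize)
{
	return (PAGE_SIZE << order) / objsize;
}

int main(void)
{
	unsigned int order;

	/* kmalloc-4096: order 0 holds 1 object, order 1 holds 2 (the
	 * default discussed here), and order 3 holds 8 -- why
	 * slub_max_order=3 slub_min_objects=8 gives the 4k cache
	 * bigger per-slab batches. */
	for (order = 0; order <= 3; order++)
		printf("order %u: %lu-byte slab, %lu objects\n", order,
		       PAGE_SIZE << order, objs_per_slab(order, 4096));
	return 0;
}
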
> > I tried this approach. The test result shows 2.6.23-rc4 is about
> > 2.5% better than 2.6.22. It really resolves the issue.
> > 
> > However, that approach applies the same policy to all slabs. Could we
> > implement a per-slab-specific approach like b)?
> 
> I am not sure what you mean by same policy. Same configuration for all 
> slabs?
Yes.
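For illustration, a per-slab policy could look roughly like the sketch below, where each cache carries its own order cap instead of the single global slub_max_order; struct cache_policy and pick_order are hypothetical names for this example, not SLUB internals:

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Hypothetical per-cache tunable; SLUB in 2.6.23 only has the global
 * slub_max_order/slub_min_objects boot parameters. */
struct cache_policy {
	const char *name;
	unsigned long objsize;
	unsigned int max_order;	/* per-cache cap, not global */
};

/* Smallest order whose slab holds at least min_objects, capped at the
 * cache's own max_order. */
static unsigned int pick_order(const struct cache_policy *c,
			       unsigned int min_objects)
{
	unsigned int order;

	for (order = 0; order < c->max_order; order++)
		if ((PAGE_SIZE << order) / c->objsize >= min_objects)
			break;
	return order;
}

int main(void)
{
	/* Give only the hot 4k cache a big order; leave others small. */
	struct cache_policy caches[] = {
		{ "kmalloc-64",   64,   1 },
		{ "kmalloc-4096", 4096, 3 },
	};
	unsigned int i;

	for (i = 0; i < 2; i++)
		printf("%s: order %u\n", caches[i].name,
		       pick_order(&caches[i], 8));
	return 0;
}

That way only the cache that benefits (here kmalloc-4096) pays the cost of higher-order page allocations.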

> 
> > > Try the ways to address the issue that I mentioned above.
> > I really appreciate your kind comments!
> 
> Would it be possible to try the two other approaches that I suggested? I 
> think both of those may also solve the issue. Try booting with
> slub_max_order=0
1) I tried slub_max_order=0 and the regression became 12.5%. It's still not good.

2) I applied the patch slub-direct-pass-through-of-page-size-or-higher-kmalloc.patch
to kernel 2.6.23-rc4. The new test result is much better: only 1% below
2.6.22.

So the best solution is booting the kernel with "slub_max_order=3 slub_min_objects=8".

>  and see what effect it has. The queues of the page 
> allocator can be much larger than what slab has for 4k pages. There is 
> really not much point in using a slab allocator for page-sized 
> allocations.
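To illustrate the point about queue sizes: order-0 pages can be handed out from a large per-cpu batch, touching the shared (lock-protected) pool only on refill. A toy model of that batching, not kernel code (alloc_page_toy and BATCH are invented for the example):

#include <stdio.h>

/* Toy per-cpu free-page queue: pages are handed out from a local batch
 * and the shared pool is touched only on refill -- the page allocator's
 * per-cpu queues can be much larger than what slab keeps for 4k
 * objects, which is the point made above. */
#define BATCH 16

static int queue[BATCH];
static int count;
static int next_page = 1000;	/* stand-in for the global free list */
static int lock_trips;

static int alloc_page_toy(void)
{
	if (count == 0) {
		/* Refill: this is where the zone lock would be taken,
		 * once per BATCH pages instead of once per page. */
		int i;

		lock_trips++;
		for (i = 0; i < BATCH; i++)
			queue[i] = next_page++;
		count = BATCH;
	}
	return queue[--count];
}

int main(void)
{
	int i;

	for (i = 0; i < 64; i++)
		alloc_page_toy();
	printf("64 allocations, %d lock acquisitions\n", lock_trips);
	return 0;
}
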
