linux-kernel - Re: Latest vfs scalability patch

Message-ID: <20091015114119.GE3127@wotan.suse.de>
Date:	Thu, 15 Oct 2009 13:41:19 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Anton Blanchard <anton@...ba.org>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org,
	Ravikiran G Thirumalai <kiran@...lex86.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jens Axboe <axboe@...nel.dk>
Subject: Re: Latest vfs scalability patch

On Thu, Oct 15, 2009 at 10:23:29PM +1100, Anton Blanchard wrote:
>  
> Hi Nick,
> 
> > I wonder what other good performance tests you can add to your test
> > framework? creat/unlink is another easy one. And for each case, putting
> > threads in their own cwd versus a common cwd are the variants.
> 
> I did try the two combinations of creat/unlink but haven't had a chance to
> digest the profiles yet. I've attached them (taken at 64 cores, ie worst
> case :)
> 
> In both cases performance was significantly better than mainline.
> 
> > BTW. for these cases in your tests it will be nice if you can run on
> > ramfs because that will isolate purely the vfs. Perhaps also include
> > other filesystems as you get time, but I think ramfs is the most
> > useful for us to start with.
> 
> Good point. I'll add that into the setup scripts.
> 
> Anton

> # Samples: 82617
> #
> # Overhead          Command                      Shared Object  Symbol
> # ........  ...............  .................................  ......
> #
>     99.16%  unlink1_process  [kernel]                           [k] ._spin_lock
>                 |          
>                 |--99.98%-- ._spin_lock
>                 |          |          
>                 |          |--49.80%-- .path_get
>                 |          |--49.58%-- .dput

Hmm, both your profiles look like they are hammering on a common cwd
here. The lock-free path walk can probably be extended to help a bit,
but you would still end up hitting locks on the parent dentry/inode
when doing the create/destroy. My 64-way numbers look like this:


create-unlink 1 processes seperate-cwd 105306.58 ops/s
create-unlink 2 processes seperate-cwd 103004.20 ops/s
create-unlink 4 processes seperate-cwd 92438.69 ops/s
create-unlink 8 processes seperate-cwd 91138.93 ops/s
create-unlink 16 processes seperate-cwd 91025.36 ops/s
create-unlink 32 processes seperate-cwd 83757.75 ops/s
create-unlink 64 processes seperate-cwd 81718.29 ops/s

create-unlink 1 processes same-cwd 110139.61 ops/s
create-unlink 2 processes same-cwd 26611.69 ops/s
create-unlink 4 processes same-cwd 13819.46 ops/s
create-unlink 8 processes same-cwd 4724.83 ops/s
create-unlink 16 processes same-cwd 1368.99 ops/s
create-unlink 32 processes same-cwd 335.08 ops/s
create-unlink 64 processes same-cwd 114.88 ops/s

If your seperate-cwd numbers aren't scaling reasonably well, I will
have to get to the bottom of it.
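
For readers who want to reproduce the shape of this test, here is a
minimal sketch of a create-unlink loop with the two cwd variants. It is
not the test framework discussed in this thread: the function names, the
iteration count and the use of a temporary directory are all assumptions
made for illustration, and there is no start barrier, so workers do not
begin at exactly the same time. Pass a directory on a ramfs mount as
base_dir to isolate the VFS as suggested above.

```python
import os
import tempfile
import time


def create_unlink_loop(name, iterations):
    """Create and unlink `name` in the current directory; return ops/s."""
    start = time.perf_counter()
    for _ in range(iterations):
        fd = os.open(name, os.O_CREAT | os.O_WRONLY, 0o600)
        os.close(fd)
        os.unlink(name)
    return iterations / (time.perf_counter() - start)


def run_variant(nproc, same_cwd, iterations=2000, base_dir=None):
    """Fork `nproc` workers and return their mean ops/s (ops/s/cpu).

    same_cwd=True puts every worker in one directory, so they contend
    on the same parent dentry/inode; same_cwd=False gives each worker
    its own directory. Hypothetical helper, POSIX only (uses os.fork).
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as base:
        workers = []
        for i in range(nproc):
            workdir = base if same_cwd else os.path.join(base, f"dir{i}")
            os.makedirs(workdir, exist_ok=True)
            r, w = os.pipe()
            pid = os.fork()
            if pid == 0:  # child: run the loop, report the rate, exit
                try:
                    os.close(r)
                    os.chdir(workdir)
                    rate = create_unlink_loop(f"file{i}", iterations)
                    os.write(w, repr(rate).encode())
                finally:
                    os._exit(0)
            os.close(w)
            workers.append((pid, r))

        rates = []
        for pid, r in workers:
            rates.append(float(os.read(r, 64).decode()))
            os.close(r)
            os.waitpid(pid, 0)
        return sum(rates) / len(rates)


if __name__ == "__main__":
    for same in (False, True):
        label = "same-cwd" if same else "separate-cwd"
        for n in (1, 2, 4):
            print(f"create-unlink {n} processes {label} "
                  f"{run_variant(n, same):.2f} ops/s")
```

Each worker uses a distinct file name, so in the same-cwd case the only
thing shared between workers is the parent directory.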

BTW, these numbers are ops/s/cpu, which in some cases I like better.
At a quick glance it is very easy to see if scalability is linear,
and also when you graph it then there is less tendency to drown
out the low end numbers (although it could go the other way and drown
the high end numbers for non-scalable cases). Maybe you could add an
option to output either.
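
To make the two reporting styles concrete, the sketch below converts
the ops/s/cpu figures from the tables in this message into aggregate
ops/s and a scaling efficiency relative to the 1-process run. The
helper names are made up for illustration; only the numbers come from
this message.

```python
# Per-process throughput (ops/s/cpu), copied from the tables above.
SEPARATE_CWD = {1: 105306.58, 2: 103004.20, 4: 92438.69, 8: 91138.93,
                16: 91025.36, 32: 83757.75, 64: 81718.29}
SAME_CWD = {1: 110139.61, 2: 26611.69, 4: 13819.46, 8: 4724.83,
            16: 1368.99, 32: 335.08, 64: 114.88}


def aggregate(per_cpu, nproc):
    """Total ops/s across all processes."""
    return per_cpu * nproc


def efficiency(results, nproc):
    """Per-process rate relative to the 1-process rate (1.0 = linear)."""
    return results[nproc] / results[1]


def report(label, results):
    """Print both views side by side for one variant."""
    for nproc in sorted(results):
        per_cpu = results[nproc]
        print(f"{label:13s} {nproc:3d} procs  "
              f"{per_cpu:10.2f} ops/s/cpu  "
              f"{aggregate(per_cpu, nproc):11.2f} ops/s total  "
              f"efficiency {efficiency(results, nproc):.3f}")


if __name__ == "__main__":
    report("separate-cwd", SEPARATE_CWD)
    report("same-cwd", SAME_CWD)
```

With linear scaling the ops/s/cpu column stays flat, so any drop is
visible immediately; the aggregate column shows that the same-cwd case
at 64 processes does less total work than a single process.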

Thanks,
Nick