Message-ID: <20150930060359.070dba46@tlielax.poochiereds.net>
Date:	Wed, 30 Sep 2015 06:03:59 -0400
From:	Jeff Layton <jeff.layton@...marydata.com>
To:	"Huang\, Ying" <ying.huang@...ux.intel.com>
Cc:	kernel test robot <ying.huang@...el.com>, <lkp@...org>,
	LKML <linux-kernel@...r.kernel.org>, bfields@...ldses.org
Subject: Re: [lkp] [nfsd] 4aac1bf05b: -2.9% fsmark.files_per_sec

On Wed, 30 Sep 2015 16:35:58 +0800
"Huang\, Ying" <ying.huang@...ux.intel.com> wrote:

> Jeff Layton <jeff.layton@...marydata.com> writes:
> 
> > On Wed, 30 Sep 2015 07:27:54 +0800
> > "Huang\, Ying" <ying.huang@...ux.intel.com> wrote:
> >
> >> Jeff Layton <jeff.layton@...marydata.com> writes:
> >> 
> >> > On Mon, 28 Sep 2015 14:49:32 +0800
> >> > kernel test robot <ying.huang@...el.com> wrote:
> >> >
> >> >> FYI, we noticed the below changes on
> >> >> 
> >> >> =========================================================================================
> >> >> tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/iterations/nr_threads/disk/fs/fs2/filesize/test_size/sync_method/nr_directories/nr_files_per_directory:
> >> >>   lkp-ne04/fsmark/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/1x/32t/1HDD/xfs/nfsv4/5K/400M/fsyncBeforeClose/16d/256fpd
> >> >> 
> >> >> commit: 
> >> >>   cd2d35ff27c4fda9ba73b0aa84313e8e20ce4d2c
> >> >>   4aac1bf05b053a201a4b392dd9a684fb2b7e6103
> >> >> 
> >> >
> >> > A question...
> >> >
> >> > I think my tree should now contain a fix for this, but with a
> >> > performance regression like this it's difficult to know for sure.
> >> >
> >> > Is there some (automated) way to request that the KTR redo this test?
> >> > If not, will I get a note saying "problem seems to now be fixed" or do
> >> > I just take a lack of further emails from the KTR about this as a sign
> >> > that it's resolved?
> >> 
> >> Can you provide the branch name and commit ID for your tree with fix?  I
> >> can confirm whether it is fixed for you.
> >> 
> > Sure:
> >
> > git://git.samba.org/jlayton/linux.git nfsd-4.4
> >
> > The tip commit is ed3d7c1e01a76f5ecc7444067704a82af4c2f76e.
> >
> 
> It seems that the regression is fixed at that commit.  Thanks!
> 
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/iterations/nr_threads/disk/fs/fs2/filesize/test_size/sync_method/nr_directories/nr_files_per_directory:
>   lkp-ne04/fsmark/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/1x/32t/1HDD/xfs/nfsv4/5K/400M/fsyncBeforeClose/16d/256fpd
> 
> commit: 
>   cd2d35ff27c4fda9ba73b0aa84313e8e20ce4d2c
>   4aac1bf05b053a201a4b392dd9a684fb2b7e6103
>   ed3d7c1e01a76f5ecc7444067704a82af4c2f76e
> 
> cd2d35ff27c4fda9 4aac1bf05b053a201a4b392dd9 ed3d7c1e01a76f5ecc74440677 
> ---------------- -------------------------- -------------------------- 
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \  
>   14415356 ±  0%      +2.6%   14788625 ±  1%      +4.1%   15008301 ±  0%  fsmark.app_overhead
>     441.60 ±  0%      -2.9%     428.80 ±  0%      -0.4%     439.68 ±  0%  fsmark.files_per_sec
>     185.78 ±  0%      +2.9%     191.26 ±  0%      +0.3%     186.37 ±  0%  fsmark.time.elapsed_time
>     185.78 ±  0%      +2.9%     191.26 ±  0%      +0.3%     186.37 ±  0%  fsmark.time.elapsed_time.max
>      97472 ±  0%      -2.8%      94713 ±  0%      -0.8%      96657 ±  0%  fsmark.time.involuntary_context_switches
> 
> Best Regards,
> Huang, Ying

Thanks for testing it and catching the problem in the first place!

FWIW, the problem seems to have been a bad hash distribution generated by
hash_ptr on struct inode pointers. When the cache held ~10000 entries in
total, one of the hash chains had almost 2000 of them. When I switched to
hashing on inode->i_ino, the distribution was much better.

I'm not sure if it was just rotten luck or there is something about
inode pointers that makes hash_ptr generate a lot of duplicates. That
really could use more investigation...
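
A chain-length check like the one described above can be approximated in
user space. The following is only a rough sketch under stated assumptions:
hash_mul is a generic 64-bit multiplicative hash with a golden-ratio
constant, NOT the kernel's hash_ptr implementation, and longest_chain and
its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Illustrative multiplicative hash: multiply by a 64-bit golden-ratio
 * constant and keep the top `bits` bits. This is an assumption made for
 * the sketch, not a copy of the kernel's hash_ptr().
 */
static unsigned int hash_mul(uint64_t key, unsigned int bits)
{
    assert(bits > 0 && bits <= 32);
    return (unsigned int)((key * 0x9E3779B97F4A7C15ULL) >> (64 - bits));
}

/*
 * Hash n keys into 2^bits buckets and return the length of the longest
 * chain. Returns 0 if n is 0 or if the bucket array cannot be allocated.
 */
static unsigned int longest_chain(const uint64_t *keys, size_t n,
                                  unsigned int bits)
{
    size_t nbuckets = (size_t)1 << bits;
    unsigned int *counts = calloc(nbuckets, sizeof(*counts));
    unsigned int max = 0;

    if (!counts)
        return 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int b = hash_mul(keys[i], bits);

        /* Track the fullest bucket seen so far. */
        if (++counts[b] > max)
            max = counts[b];
    }

    free(counts);
    return max;
}
```

Feeding it one array of dumped pointer values and another of the matching
i_ino values, with the same number of bits, would allow the two results to
be compared. For n keys the average chain length is n / 2^bits, so a
longest chain far above that figure points to a skewed distribution.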

-- 
Jeff Layton <jeff.layton@...marydata.com>