Message-ID: <20151001073503.26a871cc@tlielax.poochiereds.net>
Date: Thu, 1 Oct 2015 07:35:03 -0400
From: Jeff Layton <jeff.layton@...marydata.com>
To: Dave Chinner <david@...morbit.com>
Cc: "Huang, Ying" <ying.huang@...ux.intel.com>,
kernel test robot <ying.huang@...el.com>, lkp@...org,
LKML <linux-kernel@...r.kernel.org>, bfields@...ldses.org
Subject: Re: [lkp] [nfsd] 4aac1bf05b: -2.9% fsmark.files_per_sec
On Thu, 1 Oct 2015 09:17:42 +1000
Dave Chinner <david@...morbit.com> wrote:
> On Wed, Sep 30, 2015 at 06:03:59AM -0400, Jeff Layton wrote:
> > Thanks for testing it and catching the problem in the first place!
> >
> > FWIW, the problem seems to have been bad hash distribution generated by
> > hash_ptr on struct inode pointers. When the cache had ~10000 entries in
> > it total, one of the hash chains had almost 2000 entries. When I
> > switched to hashing on inode->i_ino, the distribution was much better.
> >
> > I'm not sure if it was just rotten luck or there is something about
> > inode pointers that makes hash_ptr generate a lot of duplicates. That
> > really could use more investigation...
>
> Inode pointers have no entropy in the lower 9-10 bits because of
> their size, and being allocated from a slab they are all going to
> have the same set of values in the next 3-4 bits (i.e. offset into
> the slab page which is defined by sizeof(inode)). Pointers also
> have very similar upper bits, because they are all in kernel
> memory.
>
> hash_64 tries to fold all the entropy from the lower bits into
> the upper bits and then takes the result from the upper bits. Hence
> if there is no entropy in either the lower or upper bits to start
> with, then the hash may not end up with much entropy in it at all...
>
> FWIW, see fs/inode.c::hash() to see how the fs code hashes inode
> numbers (called from insert_inode_hash()). It's very different
> because inode numbers have the majority of their entropy in
> the lower bits and (usually) none in the upper bits...
>
Thanks for the explanation, Dave. That makes sense.
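To make the point about pointer entropy concrete, here is a rough
userspace sketch. Everything in it is an assumption for illustration:
the sketch_* names are made up, the object size and base address passed
in are hypothetical stand-ins for slab-allocated inodes, and
sketch_hash_64() only mimics the general multiply-and-take-the-top-bits
shape of hash_64 (the multiplier is the 64-bit golden ratio prime as I
remember it, so check include/linux/hash.h before trusting it).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: multiplier recalled from the kernel's hash.h of that era. */
#define SKETCH_GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001ULL

/* Multiply, then keep only the top `bits` bits of the product. */
static inline uint64_t sketch_hash_64(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 63);
	return (val * SKETCH_GOLDEN_RATIO_PRIME_64) >> (64 - bits);
}

/* Hypothetical slab layout: objects of obj_size bytes packed from base. */
static inline uint64_t sketch_obj_addr(uint64_t base, uint64_t obj_size,
				       uint64_t idx)
{
	return base + idx * obj_size;
}

/* Mask of the address bits that differ between any two simulated objects. */
static uint64_t sketch_varying_bits(uint64_t base, uint64_t obj_size,
				    unsigned int nr_objs)
{
	uint64_t first = sketch_obj_addr(base, obj_size, 0);
	uint64_t mask = 0;
	uint64_t i;

	for (i = 1; i < nr_objs; i++)
		mask |= first ^ sketch_obj_addr(base, obj_size, i);
	return mask;
}

/*
 * Hash every simulated address into counts[] (which must have 1 << bits
 * entries) and return the length of the longest chain.
 */
static unsigned int sketch_fill_buckets(uint64_t base, uint64_t obj_size,
					unsigned int nr_objs,
					unsigned int bits,
					unsigned int *counts)
{
	unsigned int longest = 0;
	uint64_t i;

	for (i = 0; i < (1ULL << bits); i++)
		counts[i] = 0;

	for (i = 0; i < nr_objs; i++) {
		uint64_t b = sketch_hash_64(sketch_obj_addr(base, obj_size, i),
					    bits);

		if (++counts[b] > longest)
			longest = counts[b];
	}
	return longest;
}
```

With a base of 0xffff880000000000, 1024-byte objects and 10000 of them,
sketch_varying_bits() comes back as 0xfffc00: only bits 10-23 differ
between objects. Real slab addresses are not perfectly strided like
this, so treat the chain lengths from sketch_fill_buckets() as a rough
indication only.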
In hindsight I should have looked at how the vfs code hashes inodes in
its hashtable. Given that we're basically creating "shadow" inode
structures here, that would probably work fairly well.
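For reference, a userspace approximation of the kind of hash the vfs
applies to inode numbers might look like the sketch below. It is written
from memory of fs/inode.c::hash() and is not a copy of it: the
sketch_inode_hash name, the 64-byte cache line size, the multiplier
value and the uint64_t standing in for the superblock pointer are all
assumptions. The point is only that the inode number, which carries its
entropy in the low bits, is mixed and then masked down to the low bits
rather than taken from the top of a product.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: both constants are recalled, not taken from the tree. */
#define SKETCH_GOLDEN_RATIO_PRIME 0x9e37fffffffc0001ULL
#define SKETCH_L1_CACHE_BYTES     64u

/*
 * sb_addr stands in for the struct super_block pointer, ino for the
 * inode number, shift for log2 of the number of hash buckets.
 */
static inline uint64_t sketch_inode_hash(uint64_t sb_addr, uint64_t ino,
					 unsigned int shift)
{
	uint64_t mask;
	uint64_t tmp;

	assert(shift >= 1 && shift <= 32);
	mask = (1ULL << shift) - 1;

	/* Mix the inode number with the per-superblock value. */
	tmp = (ino * sb_addr) ^
	      ((SKETCH_GOLDEN_RATIO_PRIME + ino) / SKETCH_L1_CACHE_BYTES);
	/* Fold some higher bits down before masking. */
	tmp ^= (tmp ^ SKETCH_GOLDEN_RATIO_PRIME) >> shift;
	/* The bucket index comes from the low bits. */
	return tmp & mask;
}
```

Something along these lines, keyed on inode->i_ino, is roughly what I
have in mind. The real function in fs/inode.c should be treated as the
authority.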
--
Jeff Layton <jeff.layton@...marydata.com>