lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 9 Nov 2017 11:52:48 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     NeilBrown <neilb@...e.com>
Cc:     Al Viro <viro@...iv.linux.org.uk>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/3] Improve fairness when locking the per-superblock
 s_anon list

On Wed, Nov 8, 2017 at 7:22 PM, NeilBrown <neilb@...e.com> wrote:
> bit-spin-locks, as used for dcache hash chains, are not fair.
> This is not a problem for the dcache hash table as different CPUs are
> likely to access different entries in the hash table so high contention
> is not expected.
> However anonymous dentryies (created by NFSD) all live on a single hash
> chain "s_anon" and the bitlock on this can be highly contended, resulting
> in soft-lockup warnings.

This really seems idiotic.

Let me rephrase this commit message so that you can see why I think it's wrong:

 "NFSd uses a single hash chain for all dentries, which can cause
horrible lock contention, in ways that the normal hashing does not.

  This horrendous contention can cause the machine to have bad latency
spikes, which can cause soft lockup warnings.

  Instead of just fixing the bad design decision that causes this
horrible contention, we'll just special-case this code and use an
additional lock that hides the problem by serializing the actual
contending access".

Honestly, looking at the code, the whole s_anon thing seems entirely
broken. There doesn't even seem to be much reason for it. In pretty
much all cases, we could just hash the damn dentry,

The only reason for actually having s_anon seems to be that we want
some per-superblock list of these unconnected dentries for
shrink_dcache_for_umount().

Everything else would actually be *much* happier with just having the
dentry on the regular hash table. It would entirely get rid of this
stupid performance problem, and it would actually simplify all the
code elsewhere, because it would remove special cases like this

                if (unlikely(IS_ROOT(dentry)))
                        b = &dentry->d_sb->s_anon;
                else
                        b = d_hash(dentry->d_name.hash);

and just turn them into

                b = d_hash(dentry->d_name.hash);

so I really wonder if we could just get rid of s_anon entirely.

Yes, getting rid of s_anon might involve crazy things like "let's just
walk all the dentries at umount time", but honestly, that sounds
preferable. Especially if we can just then do something like

 - set a special flag in the superblock if we ever use __d_obtain_alias()

 - only scan all the dentries on umount if that flag is set.

Hmm?

The other alternative would be to just try to make the bitlocks
fairer. I would find that much less distasteful than this nasty hack.

I could imagine, for example, just turning the bitlocks into "spinlock
on hashed array". That's basically what we did with the page lock,
which _used_ to be basically a blocking bitlock, and where there was
no possible way to not have contention on some pages.

We could afford to have two bits for the bitlock (lock and contention)
to make something like that more viable.

But I really get the feeling that the problem here is not the locking
primitive, but that whole s_anon decision. And my gut feeling is that
it really should be fixable.

Because wouldn't it be much nicer if the horrible nfsd contention just
went away rather than be worked around?

But maybe I'm missing some _other_ reason why s_anon has to be its own
separate list?

          Linus

Powered by blists - more mailing lists