linux-kernel - Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAC2o3DKO_weLt2n6hOwU=hJ9J4fc3Qa3mUHP7rMzksJVuGnsJA@mail.gmail.com>
Date:   Fri, 18 Dec 2020 21:20:33 +0800
From:   Fox Chen <foxhlchen@...il.com>
To:     Ian Kent <raven@...maw.net>
Cc:     Tejun Heo <tj@...nel.org>, Greg KH <gregkh@...uxfoundation.org>,
        akpm@...ux-foundation.org, dhowells@...hat.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        miklos@...redi.hu, ricklind@...ux.vnet.ibm.com,
        sfr@...b.auug.org.au, viro@...iv.linux.org.uk
Subject: Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

On Fri, Dec 18, 2020 at 7:21 PM Ian Kent <raven@...maw.net> wrote:
>
> On Fri, 2020-12-18 at 16:01 +0800, Fox Chen wrote:
> > On Fri, Dec 18, 2020 at 3:36 PM Ian Kent <raven@...maw.net> wrote:
> > > On Thu, 2020-12-17 at 10:14 -0500, Tejun Heo wrote:
> > > > Hello,
> > > >
> > > > On Thu, Dec 17, 2020 at 07:48:49PM +0800, Ian Kent wrote:
> > > > > > What could be done is to make the kernfs node attr_mutex
> > > > > > a pointer and dynamically allocate it but even that is too
> > > > > > costly a size addition to the kernfs node structure as
> > > > > > Tejun has said.
> > > > >
> > > > > I guess the question to ask is, is there really a need to
> > > > > call kernfs_refresh_inode() from functions that are usually
> > > > > reading/checking functions.
> > > > >
> > > > > Would it be sufficient to refresh the inode in the write/set
> > > > > operations in (if there's any) places where things like
> > > > > setattr_copy() is not already called?
> > > > >
> > > > > Perhaps GKH or Tejun could comment on this?
> > > >
> > > > My memory is a bit hazy but invalidations on reads is how sysfs
> > > > namespace is
> > > > implemented, so I don't think there's an easy around that. The
> > > > only
> > > > thing I
> > > > can think of is embedding the lock into attrs and doing xchg
> > > > dance
> > > > when
> > > > attaching it.
> > >
> > > Sounds like your saying it would be ok to add a lock to the
> > > attrs structure, am I correct?
> > >
> > > Assuming it is then, to keep things simple, use two locks.
> > >
> > > One global lock for the allocation and an attrs lock for all the
> > > attrs field updates including the kernfs_refresh_inode() update.
> > >
> > > The critical section for the global lock could be reduced and it
> > > changed to a spin lock.
> > >
> > > In __kernfs_iattrs() we would have something like:
> > >
> > > take the allocation lock
> > > do the allocated checks
> > >   assign if existing attrs
> > >   release the allocation lock
> > >   return existing if found
> > > othewise
> > >   release the allocation lock
> > >
> > > allocate and initialize attrs
> > >
> > > take the allocation lock
> > > check if someone beat us to it
> > >   free and grab exiting attrs
> > > otherwise
> > >   assign the new attrs
> > > release the allocation lock
> > > return attrs
> > >
> > > Add a spinlock to the attrs struct and use it everywhere for
> > > field updates.
> > >
> > > Am I on the right track or can you see problems with this?
> > >
> > > Ian
> > >
> >
> > umm, we update the inode in kernfs_refresh_inode, right??  So I guess
> > the problem is how can we protect the inode when kernfs_refresh_inode
> > is called, not the attrs??
>
> But the attrs (which is what's copied from) were protected by the
> mutex lock (IIUC) so dealing with the inode attributes implies
> dealing with the kernfs node attrs too.
>
> For example in kernfs_iop_setattr() the call to setattr_copy() copies
> the node attrs to the inode under the same mutex lock. So, if a read
> lock is used the copy in kernfs_refresh_inode() is no longer protected,
> it needs to be protected in a different way.
>

Ok, I'm actually wondering why the VFS holds exclusive i_rwsem for .setattr but
 no lock for .getattr (misdocumented?? sometimes they have as you've found out)?
What does it protect against?? Because .permission does a similar thing
here -- updating inode attributes, the goal is to provide the same
protection level
for .permission as for .setattr, am I right???


thanks,
fox