[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20161206170719.GA2347@quack2.suse.cz>
Date: Tue, 6 Dec 2016 18:07:19 +0100
From: Jan Kara <jack@...e.cz>
To: Miklos Szeredi <miklos@...redi.hu>
Cc: Jan Kara <jack@...e.cz>, Amir Goldstein <amir73il@...il.com>,
Eric Paris <eparis@...hat.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: fsnotify_mark_srcu wtf?
On Fri 02-12-16 12:02:33, Miklos Szeredi wrote:
> On Fri, Dec 2, 2016 at 11:48 AM, Jan Kara <jack@...e.cz> wrote:
> > On Fri 02-12-16 09:26:51, Miklos Szeredi wrote:
> >> On Thu, Nov 10, 2016 at 8:46 PM, Jan Kara <jack@...e.cz> wrote:
> >> > On Wed 09-11-16 20:26:16, Amir Goldstein wrote:
> >> >> On Wed, Nov 9, 2016 at 1:10 PM, Jan Kara <jack@...e.cz> wrote:
> >>
> >> >> > And this does not work as well... Fanotify must notify groups by their
> >> >> > priority so you cannot arbitrarily reorder ordering in which groups get
> >> >> > notified. I'm currently pondering on using mark refcount to pin it when
> >> >> > processing permission event but there are still some details to check.
> >> >> >
> >> >>
> >> >> All right, mark refcount sound like the proper solution.
> >> >
> >> > Except it doesn't quite work. We can pin the current marks by a refcount
> >> > but they can still be removed from the list so after we regain srcu lock,
> >> > we are not sure their ->next pointers still point to still allocated marks
> >> > :-| Sadly I realized this only after implementing all this.
> >>
> >> Hmm, how about this: when removing mark from inode, drop refcount. If
> >> refcount is zero can remove from list. Otherwise mark the mark "dead"
> >> and leave it on the list.
> >>
> >> And fsnotify can just skip dead marks.
> >
> > I had this idea as well and when trying to implement this, I've stumbled
> > over some problems. I think the biggest problem was that destruction of a
> > notification mark is relatively complex operation (doing iput() for
> > example) and quite a few places dropping mark references are in a context
> > where this can cause problems. Also I don't want to defer iput() to a
> > workqueue as that will have unexpected consequences such as unlinked
> > watched inode lingering in the system (possibly colliding with umount
> > etc.).
>
> Okay, but all we need from the deleted mark is the ->next pointer, no?
> So everything else related to destruction can be done.
Yes, all we need for continuing the iteration itself is a ->next pointer and
possibly a ->group pointer. However the list itself is starting at the
inode->i_fsnotify_marks and the removal of the mark from the list needs to
be protected by inode->i_lock so we do need inode reference while the mark
is still attached to the inode list. So more plausible looks to wait while
detaching mark from the object for all responses for that mark.
Another possibility would be to create separate object for the head of the
list of fsnotify marks (containing also the lock for the list). That would
allow us to detach list of marks from the object and thus we should not
have issues with conflicting lifetime needs... And the additional memory
overhead (about 3 longs) is not that big compared to the size of fsnotify
mark (currently 9 longs). We'll see.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists