[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOQ4uxgG=E+3CwpQAE__YGt7vdW77n0nTe6cExPTERBNUN0a0g@mail.gmail.com>
Date: Wed, 2 Nov 2022 10:55:01 +0200
From: Amir Goldstein <amir73il@...il.com>
To: Stephen Brennan <stephen.s.brennan@...cle.com>
Cc: Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs
On Tue, Nov 1, 2022 at 10:49 PM Stephen Brennan
<stephen.s.brennan@...cle.com> wrote:
>
> Hi Jan,
>
> Jan Kara <jack@...e.cz> writes:
> > Hi Stephen!
> >
> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote:
> >> Here is v3 of the patch series. I've taken all of the feedback,
> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each
> >> patch.
> >>
> >> I caught an obvious and silly dentry reference leak: d_find_any_alias()
> >> returns a reference, which I never called dput() on. With that change, I
> >> no longer see the rpc_pipefs issue, but I do think I need more testing
> >> and thinking through the third patch. Al, I'd love your feedback on that
> >> one especially.
> >>
> >> Thanks,
> >> Stephen
> >>
> >> Stephen Brennan (3):
> >> fsnotify: Use d_find_any_alias to get dentry associated with inode
> >> fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem
> >> fsnotify: allow sleepable child flag update
> >
> > Thanks for the patches Stephen and I'm sorry for replying somewhat late.
>
> Absolutely no worries, these things take time. Thanks for taking a look!
>
> > The first patch is a nobrainer. The other two patches ... complicate things
> > somewhat more complicated than I'd like. I guess I can live with them if we
> > don't find a better solution but I'd like to discuss a bit more about
> > alternatives.
>
> Understood!
>
> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in
> > __fsnotify_parent() for the dentry which triggered the event and does not
> > have watched parent anymore and never bother with full children walk? I
> > suppose your contention problems will be gone, we'll just pay the price of
> > dget_parent() + fsnotify_inode_watches_children() for each child that
> > falsely triggers instead of for only one. Maybe that's not too bad? After
> > all any event upto this moment triggered this overhead as well...
>
> This is an interesting idea. It came across my mind but I don't think I
> considered it seriously because I assumed that it was too big a change.
> But I suppose in the process I created an even bigger change :P
>
> The false positive dget_parent() + fsnotify_inode_watches_children()
> shouldn't be too bad. I could see a situation where there's a lot of
> random accesses within a directory, where the dget_parent() could cause
> some contention over the parent dentry. But to be fair, the performance
> would have been the same or worse while fsnotify was active in that
> case, and the contention would go away as most of the dentries get their
> flags cleared. So I don't think this is a problem.
>
I also don't think that is a problem.
> > Am I missing something?
>
> I think there's one thing missed here. I understand you'd like to get
> rid of the extra flag in the connector. But the advantage of the flag is
> avoiding duplicate work by saving a bit of state. Suppose that a mark is
> added to a connector, which causes fsnotify_inode_watches_children() to
> become true. Then, any subsequent call to fsnotify_recalc_mask() must
> call __fsnotify_update_child_dentry_flags(), even though the child
> dentry flags don't need to be updated: they're already set. For (very)
> large directories, this can take a few seconds, which means that we're
> doing a few extra seconds of work each time a new mark is added to or
> removed from a connector in that case. I can't imagine that's a super
> common workload though, and I don't know if my customers do that (my
> guess would be no).
>
> For an example of a test workload where this would make a difference:
> one of my test cases is to create a directory with millions of negative
> dentries, and then run "for i in {1..20}; do inotifywait $DIR & done".
> With the series as-is, only the first task needs to do the child flag
> update. With your proposed alternative, each task would re-do the
> update.
>
> If we were to manage to get __fsnotify_update_child_dentry_flags() to
> work correctly, and safely, with the ability to drop the d_lock and
> sleep, then I think that wouldn't be too much of a problem, because then
> other spinlock users of the directory will get a chance to grab it, and
> so we don't risk softlockups. Without the sleepable iterations, it would
> be marginally worse than the current solution, but I can't really
> comment _how_ much worse because like I said, it doesn't sound like a
> frequent usage pattern.
>
> I think I have a _slight_ preference for the current design, but I see
> the appeal of simpler code, and your design would still improve things a
> lot! Maybe Amir has an opinion too, but of course I'll defer to what you
> want.
>
IIUC, patches #1+#3 fix a reproducible softlock, so if Al approves them,
I see no reason to withhold.
With patches #1+#3 applied, the penalty is restricted to adding or
removing/destroying multiple marks on very large dirs or dirs with
many negative dentries.
I think that fixing the synthetic test case of multiple added marks
is rather easy even without DCACHE_FSNOTIFY_PARENT_WATCHED.
Something like the attached patch is what Jan had suggested in the
first place.
The synthetic test case of concurrent add/remove watch on the same
dir may still result in multiple children iterations, but that will usually
not be avoided even with DCACHE_FSNOTIFY_PARENT_WATCHED
and probably not worth optimizing for.
Thoughts?
Thanks,
Amir.
View attachment "fsnotify-clear-PARENT_WATCHED-flags-lazily.patch" of type "text/x-patch" (6791 bytes)
Powered by blists - more mailing lists