Date:   Wed, 2 Nov 2022 18:52:24 +0100
From:   Jan Kara <jack@...e.cz>
To:     Stephen Brennan <stephen.s.brennan@...cle.com>
Cc:     Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, Amir Goldstein <amir73il@...il.com>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs

On Tue 01-11-22 13:48:54, Stephen Brennan wrote:
> Jan Kara <jack@...e.cz> writes:
> > Hi Stephen!
> >
> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote:
> >> Here is v3 of the patch series. I've taken all of the feedback,
> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each
> >> patch.
> >> 
> >> I caught an obvious and silly dentry reference leak: d_find_any_alias()
> >> returns a reference, which I never called dput() on. With that change, I
> >> no longer see the rpc_pipefs issue, but I do think I need more testing
> >> and thinking through the third patch. Al, I'd love your feedback on that
> >> one especially.
> >> 
> >> Thanks,
> >> Stephen
> >> 
> >> Stephen Brennan (3):
> >>   fsnotify: Use d_find_any_alias to get dentry associated with inode
> >>   fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem
> >>   fsnotify: allow sleepable child flag update
> >
> > Thanks for the patches Stephen and I'm sorry for replying somewhat late.
> 
> Absolutely no worries, these things take time. Thanks for taking a look!
> 
> > The first patch is a no-brainer. The other two patches ... complicate things
> > somewhat more than I'd like. I guess I can live with them if we
> > don't find a better solution but I'd like to discuss a bit more about
> > alternatives.
> 
> Understood!
> 
> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in
> > __fsnotify_parent() for the dentry which triggered the event and does not
> > have watched parent anymore and never bother with full children walk? I
> > suppose your contention problems will be gone, we'll just pay the price of
> > dget_parent() + fsnotify_inode_watches_children() for each child that
> > falsely triggers instead of for only one. Maybe that's not too bad? After
> > all any event up to this moment triggered this overhead as well...
> 
> This is an interesting idea. It crossed my mind but I don't think I
> considered it seriously because I assumed that it was too big a change.
> But I suppose in the process I created an even bigger change :P
> 
> The false positive dget_parent() + fsnotify_inode_watches_children()
> shouldn't be too bad. I could see a situation where there's a lot of
> random accesses within a directory, where the dget_parent() could cause
> some contention over the parent dentry. But to be fair, the performance
> would have been the same or worse while fsnotify was active in that
> case, and the contention would go away as most of the dentries get their
> flags cleared. So I don't think this is a problem.
> 
> > Am I missing something?
> 
> I think there's one thing missed here. I understand you'd like to get
> rid of the extra flag in the connector. But the advantage of the flag is
> avoiding duplicate work by saving a bit of state. Suppose that a mark is
> added to a connector, which causes fsnotify_inode_watches_children() to
> become true. Then, any subsequent call to fsnotify_recalc_mask() must
> call __fsnotify_update_child_dentry_flags(), even though the child
> dentry flags don't need to be updated: they're already set. For (very)
> large directories, this can take a few seconds, which means that we're
> doing a few extra seconds of work each time a new mark is added to or
> removed from a connector in that case. I can't imagine that's a super
> common workload though, and I don't know if my customers do that (my
> guess would be no).

I understand. This basically matters for fsnotify_recalc_mask(). As a side
note, I've realized that the changes in patch 2/3, which make
fsnotify_recalc_mask() acquire inode->i_rwsem to update dentry flags, are
problematic for dnotify because dnotify calls fsnotify_recalc_mask() under
a spinlock. Furthermore, it is also somewhat worrying for inotify &
fanotify because it nests inode->i_rwsem inside fsnotify_group->lock, and
I'm not 100% sure that nothing forces the ordering the other way around
(e.g. the removal of a oneshot mark during modify event generation). Did
you run tests with lockdep enabled?

Anyway, if the lock ordering issues can be solved, I suppose we can
optimize fsnotify_recalc_mask() like:

	inode_lock(inode);
	spin_lock(&conn->lock);
	oldmask = inode->i_fsnotify_mask;
	__fsnotify_recalc_mask(conn);
	newmask = inode->i_fsnotify_mask;
	spin_unlock(&conn->lock);
	if (watching_children_changed(oldmask, newmask))
		__fsnotify_update_child_dentry_flags(...)
	inode_unlock(inode);

And because everything is serialized by inode_lock, we don't have to worry
about inode->i_fsnotify_mask and dentry flags getting out of sync or some
mark addition returning before all children are marked for reporting
events. No need for the connector flag AFAICT.
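
To make the idea above concrete, here is a minimal userspace sketch of the
same pattern. It is only a model, not kernel code: pthread mutexes stand in
for inode_lock() and conn->lock, a single hypothetical bit stands in for
"this inode watches its children", and every struct and function name
(model_inode, model_recalc_mask(), ...) is invented for illustration. The
child_walks counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this models the locking pattern only. */
#define MODEL_EVENT_ON_CHILD 0x1u	/* stand-in for "watches children" */
#define MODEL_MAX_MARKS 8

struct model_inode {
	pthread_mutex_t lock;		/* stands in for inode_lock() */
	pthread_mutex_t conn_lock;	/* stands in for conn->lock */
	uint32_t mask;			/* stands in for i_fsnotify_mask */
	uint32_t marks[MODEL_MAX_MARKS];
	size_t nr_marks;
	bool *child_watched;		/* one flag per child dentry */
	size_t nr_children;
	unsigned long child_walks;	/* number of full child walks done */
};

static bool watches_children(uint32_t mask)
{
	return (mask & MODEL_EVENT_ON_CHILD) != 0;
}

/* The expensive part: touch every child once. */
static void update_child_flags(struct model_inode *inode, bool watched)
{
	for (size_t i = 0; i < inode->nr_children; i++)
		inode->child_watched[i] = watched;
	inode->child_walks++;
}

/* Recompute the mask; walk the children only if "watching" flipped. */
void model_recalc_mask(struct model_inode *inode)
{
	uint32_t oldmask, newmask = 0;

	assert(inode != NULL);
	pthread_mutex_lock(&inode->lock);
	pthread_mutex_lock(&inode->conn_lock);
	oldmask = inode->mask;
	for (size_t i = 0; i < inode->nr_marks; i++)
		newmask |= inode->marks[i];
	inode->mask = newmask;
	pthread_mutex_unlock(&inode->conn_lock);

	/* Still under the outer lock, so mask and child flags stay in sync. */
	if (watches_children(oldmask) != watches_children(newmask))
		update_child_flags(inode, watches_children(newmask));
	pthread_mutex_unlock(&inode->lock);
}

/* Returns 0 on success, -1 if the (fixed-size) mark table is full. */
int model_add_mark(struct model_inode *inode, uint32_t mark_mask)
{
	assert(inode != NULL);
	pthread_mutex_lock(&inode->conn_lock);
	if (inode->nr_marks == MODEL_MAX_MARKS) {
		pthread_mutex_unlock(&inode->conn_lock);
		return -1;
	}
	inode->marks[inode->nr_marks++] = mark_mask;
	pthread_mutex_unlock(&inode->conn_lock);
	model_recalc_mask(inode);
	return 0;
}

void model_remove_all_marks(struct model_inode *inode)
{
	assert(inode != NULL);
	pthread_mutex_lock(&inode->conn_lock);
	inode->nr_marks = 0;
	pthread_mutex_unlock(&inode->conn_lock);
	model_recalc_mask(inode);
}
```

In this model, adding a second mark to an inode that already watches its
children does not repeat the child walk, because the old and new masks
agree on the "watches children" bit. The real
fsnotify_inode_watches_children() check is more involved than a single
bit test, so treat this purely as an illustration of the locking shape.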

But the locking issue needs to be resolved first in any case. I need to
think some more...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
