lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221107115624.j2ilaotaigrqiheu@quack3>
Date:   Mon, 7 Nov 2022 12:56:24 +0100
From:   Jan Kara <jack@...e.cz>
To:     Stephen Brennan <stephen.s.brennan@...cle.com>
Cc:     Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, Amir Goldstein <amir73il@...il.com>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs

On Fri 04-11-22 16:33:18, Stephen Brennan wrote:
> Jan Kara <jack@...e.cz> writes:
> > On Tue 01-11-22 13:48:54, Stephen Brennan wrote:
> >> Jan Kara <jack@...e.cz> writes:
> >> > Hi Stephen!
> >> >
> >> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote:
> >> >> Here is v3 of the patch series. I've taken all of the feedback,
> >> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each
> >> >> patch.
> >> >> 
> >> >> I caught an obvious and silly dentry reference leak: d_find_any_alias()
> >> >> returns a reference, which I never called dput() on. With that change, I
> >> >> no longer see the rpc_pipefs issue, but I do think I need more testing
> >> >> and thinking through the third patch. Al, I'd love your feedback on that
> >> >> one especially.
> >> >> 
> >> >> Thanks,
> >> >> Stephen
> >> >> 
> >> >> Stephen Brennan (3):
> >> >>   fsnotify: Use d_find_any_alias to get dentry associated with inode
> >> >>   fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem
> >> >>   fsnotify: allow sleepable child flag update
> >> >
> >> > Thanks for the patches Stephen and I'm sorry for replying somewhat late.
> >> 
> >> Absolutely no worries, these things take time. Thanks for taking a look!
> >> 
> >> > The first patch is a nobrainer. The other two patches ... complicate things
> >> > somewhat more complicated than I'd like. I guess I can live with them if we
> >> > don't find a better solution but I'd like to discuss a bit more about
> >> > alternatives.
> >> 
> >> Understood!
> >> 
> >> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in
> >> > __fsnotify_parent() for the dentry which triggered the event and does not
> >> > have watched parent anymore and never bother with full children walk? I
> >> > suppose your contention problems will be gone, we'll just pay the price of
> >> > dget_parent() + fsnotify_inode_watches_children() for each child that
> >> > falsely triggers instead of for only one. Maybe that's not too bad? After
> >> > all any event upto this moment triggered this overhead as well...
> >> 
> >> This is an interesting idea. It came across my mind but I don't think I
> >> considered it seriously because I assumed that it was too big a change.
> >> But I suppose in the process I created an even bigger change :P
> >> 
> >> The false positive dget_parent() + fsnotify_inode_watches_children()
> >> shouldn't be too bad. I could see a situation where there's a lot of
> >> random accesses within a directory, where the dget_parent() could cause
> >> some contention over the parent dentry. But to be fair, the performance
> >> would have been the same or worse while fsnotify was active in that
> >> case, and the contention would go away as most of the dentries get their
> >> flags cleared. So I don't think this is a problem.
> >> 
> >> > Am I missing something?
> >> 
> >> I think there's one thing missed here. I understand you'd like to get
> >> rid of the extra flag in the connector. But the advantage of the flag is
> >> avoiding duplicate work by saving a bit of state. Suppose that a mark is
> >> added to a connector, which causes fsnotify_inode_watches_children() to
> >> become true. Then, any subsequent call to fsnotify_recalc_mask() must
> >> call __fsnotify_update_child_dentry_flags(), even though the child
> >> dentry flags don't need to be updated: they're already set. For (very)
> >> large directories, this can take a few seconds, which means that we're
> >> doing a few extra seconds of work each time a new mark is added to or
> >> removed from a connector in that case. I can't imagine that's a super
> >> common workload though, and I don't know if my customers do that (my
> >> guess would be no).
> >
> > I understand. This basically matters for fsnotify_recalc_mask(). As a side
> > note I've realized that your changes to fsnotify_recalc_mask() acquiring
> > inode->i_rwsem for updating dentry flags in patch 2/3 are problematic for
> > dnotify because that calls fsnotify_recalc_mask() under a spinlock.
> > Furthermore it is somewhat worrying also for inotify & fanotify because it
> > nests inode->i_rwsem inside fsnotify_group->lock however I'm not 100% sure
> > something doesn't force the ordering the other way around (e.g. the removal
> > of oneshot mark during modify event generation). Did you run tests with
> > lockdep enabled?
> 
> No I didn't. I'll be sure to get the LTP tests running with lockdep
> early next week and try this series out, we'll probably get an error
> like you say.
> 
> I'll also take a look at the dnotify use case and see if there's
> anything to do there. Hopefully there's something we can do to salvage
> it :D

Yeah, dnotify should be solvable. I'm even willing to accept somewhat
unusual methods for dnotify ;). It is ancient and rarely used API these
days so I don't want it to block fixes for newer APIs. From a quick look it
should be enough to move fsnotify_recalc_mask() call in
dnotify_recalc_inode_mask() from under the mark->lock out into callers and
perhaps rename dnotify_recalc_inode_mask() to dnotify_recalc_mark_mask()
to make the name somewhat less confusing.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ