lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOQ4uxh8c2vbv50p8+rNnoV0H=L=+XRGuFP1dmGrrCrt6EjFYQ@mail.gmail.com>
Date:   Fri, 28 Oct 2022 12:32:16 +0300
From:   Amir Goldstein <amir73il@...il.com>
To:     Stephen Brennan <stephen.s.brennan@...cle.com>
Cc:     Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH v3 3/3] fsnotify: allow sleepable child flag update

On Fri, Oct 28, 2022 at 3:10 AM Stephen Brennan
<stephen.s.brennan@...cle.com> wrote:
>
> With very large d_subdirs lists, iteration can take a long time. Since
> iteration needs to hold parent->d_lock, this can trigger soft lockups.
> It would be best to make this iteration sleepable. Since we have the
> inode locked exclusive, we can drop the parent->d_lock and sleep,
> holding a reference to a child dentry, and continue iteration once we
> wake.
>
> Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
> ---
>

Reviewed-by: Amir Goldstein <amir73il@...il.com>

some comment nits and one fortify suggestion

> Notes:
>     v3:
>     - removed if statements around dput()
>     v2:
>     - added a check for child->d_parent != alias and retry logic
>
>  fs/notify/fsnotify.c | 36 ++++++++++++++++++++++++++++++++----
>  1 file changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index ccb8a3a6c522..34e5d18235a7 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -102,10 +102,12 @@ void fsnotify_sb_delete(struct super_block *sb)
>   * on a child we run all of our children and set a dentry flag saying that the
>   * parent cares.  Thus when an event happens on a child it can quickly tell
>   * if there is a need to find a parent and send the event to the parent.
> + *
> + * Context: inode locked exclusive
>   */
>  static bool __fsnotify_update_children_dentry_flags(struct inode *inode)
>  {
> -       struct dentry *alias, *child;
> +       struct dentry *child, *alias, *last_ref = NULL;
>         int watched;
>
>         if (!S_ISDIR(inode->i_mode))
> @@ -120,12 +122,37 @@ static bool __fsnotify_update_children_dentry_flags(struct inode *inode)
>         alias = d_find_any_alias(inode);
>
>         /*
> -        * run all of the children of the original inode and fix their
> -        * d_flags to indicate parental interest (their parent is the
> -        * original inode)
> +        * These lists can get very long, so we may need to sleep during
> +        * iteration. Normally this would be impossible without a cursor,
> +        * but since we have the inode locked exclusive, we're guaranteed
> +        * that the directory won't be modified, so whichever dentry we
> +        * pick to sleep on won't get moved. So, start a manual iteration
> +        * over d_subdirs which will allow us to sleep.
>          */
>         spin_lock(&alias->d_lock);
> +retry:
>         list_for_each_entry(child, &alias->d_subdirs, d_child) {
> +               if (need_resched()) {
> +                       /*
> +                        * We need to hold a reference while we sleep. But when
> +                        * we wake, dput() could free the dentry, invalidating
> +                        * the list pointers. We can't look at the list pointers
> +                        * until we re-lock the parent, and we can't dput() once
> +                        * we have the parent locked. So the solution is to hold
> +                        * onto our reference and free it the *next* time we drop
> +                        * alias->d_lock: either at the end of the function, or
> +                        * at the time of the next sleep.
> +                        */

My personal preference would be to move this above if (needed_reschd())
it is not any less clear when this comment is above the condition
and less indented will read nicer.

> +                       dget(child);
> +                       spin_unlock(&alias->d_lock);
> +                       dput(last_ref);
> +                       last_ref = child;
> +                       cond_resched();
> +                       spin_lock(&alias->d_lock);
> +                       if (child->d_parent != alias)
> +                               goto retry;

Is this expected? If not, then we need a WARN_ON_ONCE().
Also, I wonder if it would be better to break out and leave
dentry flags as they are instead of risking some endless
or very long retry loop?

And how about asserting on unexpected !list_empty(&child->d_child)
to avoid an endless loop in list_for_each_entry()?

Thanks,
Amir.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ