lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 5 Nov 2016 22:43:45 +0100
From:   Jan Kara <jack@...e.cz>
To:     Amir Goldstein <amir73il@...il.com>
Cc:     Miklos Szeredi <miklos@...redi.hu>, Jan Kara <jack@...e.cz>,
        Eric Paris <eparis@...hat.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: fsnotify_mark_srcu wtf?

On Thu 03-11-16 12:25:13, Amir Goldstein wrote:
> On Thu, Nov 3, 2016 at 12:09 AM, Miklos Szeredi <miklos@...redi.hu> wrote:
> > We've got a report where a fanotify daemon that implements permission checks
> > screws up and doesn't send a reply.  This then causes widespread hangs due to
> > fsnotify_mark_srcu read side lock being held and thus causing synchronize_srcu()
> > called from e.g. inotify_release()-> fsnotify_destroy_group()->
> > fsnotify_mark_destroy_list() to block.
> >
> > Below program demonstrates the issue.  It should output a single line:
> >
> > close(inotify_fd): success
> >
> > Instead it outputs nothing, which means that close(inotify_fd) got blocked by
> > the waiting permission event.
> >
> > Wouldn't making the srcu per-group fix this?  Would that be too expensive?
> >
> 
> IIUC, the purpose of fsnotify_mark_srcu is to protect the inode and vfsmount
> mark lists, which are used to iterate the groups to send events to.
> So you cannot make it per group as far as I can tell.
> 
> OTOH, specifically, for fanotify, once you can passed
> fanotify_should_send_event(),
> there is no need to keep a reference to the mark, so it may be safely destroyed,
> you only need a reference to the group.
> 
> I tested this patch on top of my fanotify super block watch development branch
> and it seems to fix the problem you reported, but I am not savvy enough with
> srcu to say that this is correct.
> Jan, what do you think?

So the way you've written it is definitely buggy. The inode_node and
vfsmount_node pointers may become invalid once you drop SRCU lock and so
you cannot dereference them even after regaining that lock. Also frankly
your solution looks a bit ugly. I'll think whether we can somehow fix the
problem...

								Honza

> 
> From 28da34cdf9bf71fe9bbac13ded11a19da3b7a48e Mon Sep 17 00:00:00 2001
> From: Amir Goldstein <amir73il@...il.com>
> Date: Thu, 3 Nov 2016 11:55:46 +0200
> Subject: [PATCH] fanotify: handle permission events without holding
>  fsnotify_mark_srcu
> 
> Handling fanotify events does not entail dereferencing fsnotify_mask
> beyond the point of fanotify_should_send_event().
> 
> For the case of permission events, which may block indefinetely,
> return -EAGAIN and then fsnotify() will call handle_event() again
> without a reference to the mark.
> 
> Without a reference to the mark, there is no need to call
> handle_event() under fsnotify_mark_srcu read side lock
> which is protecting the inode/vfsmount mark lists.
> We just need to hold a reference to the group
> 
> Signed-off-by: Amir Goldstein <amir73il@...il.com>
> ---
>  fs/notify/fanotify/fanotify.c | 14 +++++++++++---
>  fs/notify/fsnotify.c          | 26 ++++++++++++++++++++++----
>  2 files changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 604e307..0ffa9ed 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -298,9 +298,17 @@ static int fanotify_handle_event(struct
> fsnotify_group *group,
>         BUILD_BUG_ON(FAN_ACCESS_PERM != FS_ACCESS_PERM);
>         BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
> 
> -       if (!fanotify_should_send_event(inode_mark, vfsmnt_mark, mask, data
> -                                       data_type))
> -               return 0;
> +       if (inode_mark || vfsmnt_mark) {
> +               if (!fanotify_should_send_event(inode_mark, vfsmnt_mark, mask,
> +                                               data, data_type))
> +                       return 0;
> +
> +#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
> +               /* Ask to be called again without a reference to mark */
> +               if (mask & FAN_ALL_PERM_EVENTS)
> +                       return -EAGAIN;
> +#endif
> +       }
> 
>         pr_debug("%s: group=%p inode=%p mask=%x\n", __func__, group, inode,
>                  mask);
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 34bccbe..dc0c86e 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -199,7 +199,7 @@ int fsnotify(struct inode *to_tell, __u32 mask,
> void *data, int data_is,
>  {
>         struct hlist_node *inode_node = NULL, *vfsmount_node = NULL;
>         struct fsnotify_mark *inode_mark = NULL, *vfsmount_mark = NULL;
> -       struct fsnotify_group *inode_group, *vfsmount_group;
> +       struct fsnotify_group *inode_group, *vfsmount_group, *group;
>         struct mount *mnt;
>         int idx, ret = 0;
>         /* global tests shouldn't care about events on child only the
> specific event */
> @@ -282,15 +282,33 @@ int fsnotify(struct inode *to_tell, __u32 mask,
> void *data, int data_is,
>                 ret = send_to_group(to_tell, inode_mark, vfsmount_mark, mask,
>                                     data, data_is, cookie, file_name);
> 
> -               if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
> -                       goto out;
> -
>                 if (inode_group)
>                         inode_node = srcu_dereference(inode_node->next,
>                                                       &fsnotify_mark_srcu);
>                 if (vfsmount_group)
>                         vfsmount_node = srcu_dereference(vfsmount_node->next,
>                                                          &fsnotify_mark_srcu);
> +
> +               if (ret != -EAGAIN)
> +                       continue;
> +
> +               group = inode_group ? : vfsmount_group;
> +               fsnotify_get_group(group);
> +               srcu_read_unlock(&fsnotify_mark_srcu, idx);
> +               /*
> +                * Calling handle_event() again without reference to mark,
> +                * so we do not need to call it under fsnotify_mark_srcu
> +                * which is protecting the inode/vfsmount mark lists.
> +                * We just need to hold a reference to the group
> +                */
> +               return group->ops->handle_event(group, to_tell, NULL, NULL,
> +                                               mask, data, data_is,
> +                                               file_name, cookie);
> +               idx = srcu_read_lock(&fsnotify_mark_srcu);
> +               fsnotify_put_group(group);
> +
> +               if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
> +                       goto out;
>         }
>         ret = 0;
>  out:
> -- 
> 2.7.4
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ