Message-ID: <pehvvmy3vzimalic3isygd4d66j6tb6cnosoiu6xkgfjy3p3up@ikj4bhpmx4yt>
Date: Mon, 26 May 2025 18:47:38 +0200
From: Jan Kara <jack@...e.cz>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: Jan Kara <jack@...e.cz>, Amir Goldstein <amir73il@...il.com>,
Matthew Bobrowski <repnop@...gle.com>, linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] fanotify: wake-up all waiters on release
On Mon 26-05-25 23:12:20, Sergey Senozhatsky wrote:
> On (25/05/26 14:52), Jan Kara wrote:
> > > > We don't use exclusive waits with access_waitq so wake_up() and
> > > > wake_up_all() should do the same thing?
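For context: wait_event() queues a non-exclusive waiter, and the
nr_exclusive limit of __wake_up() applies only to exclusive waiters, so a
single wake_up() already wakes everybody on the queue. A minimal sketch of
the pattern, using standard <linux/wait.h> primitives with illustrative
names:

static DECLARE_WAIT_QUEUE_HEAD(waitq);
static bool done;

static void waiter(void)
{
        /* non-exclusive: wait_event() does not set WQ_FLAG_EXCLUSIVE */
        wait_event(waitq, READ_ONCE(done));
}

static void waker(void)
{
        WRITE_ONCE(done, true);
        wake_up(&waitq);        /* wakes every non-exclusive waiter */
}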
> > >
> > > Oh, non-exclusive waiters, I see. I totally missed that, thanks.
> > >
> > > So... the problem is somewhere else then. I'm currently looking
> > > at some crashes (across all LTS kernels) where the group owner
> > > just gets stuck, the hung-task watchdog kicks in, and the system
> > > panics. Basically just a single backtrace in the kernel logs:
> > >
> > > schedule+0x534/0x2540
> > > fsnotify_destroy_group+0xa7/0x150
> > > fanotify_release+0x147/0x160
> > > ____fput+0xe4/0x2a0
> > > task_work_run+0x71/0xb0
> > > do_exit+0x1ea/0x800
> > > do_group_exit+0x81/0x90
> > > get_signal+0x32d/0x4e0
> > >
> > > My assumption was that it's this wait:
> > > wait_event(group->notification_waitq, !atomic_read(&group->user_waits));
> >
> > Well, you're likely correct that we are sleeping in this wait. But
> > likely there's some process that's indeed waiting for a response to a
> > fanotify event from userspace. Do you have a reproducer? Can you dump
> > all blocked tasks when this happens?
>
> Unfortunately, no. This happens on consumer devices, which are
> not available for any sort of debugging for privacy reasons. We
> only get anonymized kernel ramoops/dmesg on crashes.
>
> So my only option is to add something to the kernel, then roll out
> the patched kernel to the fleet and wait for new crash reports. The
> problem is that everything I can think of merely hides the hang from
> the hung-task watchdog instead of fixing it. Let me think more about it.
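One low-risk diagnostic you could ship fleet-wide is to make that wait
noisy instead of silently hanging, so a leaked count shows up in ramoops
before the watchdog panics. A hypothetical sketch, not a fix
(wait_event_timeout() returns 0 on timeout):

while (!wait_event_timeout(group->notification_waitq,
                           !atomic_read(&group->user_waits),
                           30 * HZ))
        WARN(1, "fanotify: %d user_waits outstanding on release\n",
             atomic_read(&group->user_waits));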
>
> Another silly question: what decrements group->user_waits in the case
> of that race condition?
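The decrement is not in fanotify itself: fsnotify_prepare_user_wait()
takes the reference and fsnotify_finish_user_wait() drops it again via
fsnotify_put_mark_wake(), which also issues the wake-up that
fsnotify_destroy_group() sleeps on. Roughly, abridged from
fs/notify/mark.c (names differ a bit across LTS trees, e.g. the iterator
macro is fsnotify_foreach_iter_type() in newer kernels):

static void fsnotify_put_mark_wake(struct fsnotify_mark *mark)
{
        if (mark) {
                struct fsnotify_group *group = mark->group;

                fsnotify_put_mark(mark);
                /* the decrement fsnotify_destroy_group() waits for */
                if (atomic_dec_and_test(&group->user_waits) &&
                    group->shutdown)
                        wake_up(&group->notification_waitq);
        }
}

void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info)
{
        int type;

        iter_info->srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
        fsnotify_foreach_obj_type(type)
                fsnotify_put_mark_wake(iter_info->marks[type]);
}

So any path that returns from fanotify_handle_event() after
fsnotify_prepare_user_wait() succeeded, without reaching
fsnotify_finish_user_wait(), leaks group->user_waits and leaves the group
owner blocked in fsnotify_destroy_group() forever.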
>
> ---
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 9dac7f6e72d2b..38b977fe37a71 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -945,8 +945,10 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) {
> fsid = fanotify_get_fsid(iter_info);
> /* Racing with mark destruction or creation? */
> - if (!fsid.val[0] && !fsid.val[1])
> - return 0;
> + if (!fsid.val[0] && !fsid.val[1]) {
> + ret = 0;
> + goto finish;
> + }
> }
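If I read this correctly, that is exactly the leak described above: for a
permission event, fsnotify_prepare_user_wait() has already bumped
group->user_waits by the time of this fsid check, and the bare "return 0"
skips the cleanup at the end of fanotify_handle_event(), which in trees of
that era looks roughly like:

finish:
        if (fanotify_is_perm_event(mask))
                fsnotify_finish_user_wait(iter_info);

        return ret;

So jumping to finish: as you do restores the balance.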
This code is not present in the current upstream kernel. It seems to have
been inadvertently fixed by commit 30ad1938326b ("fanotify: allow "weak"
fsid when watching a single filesystem"), which you likely don't have in
your kernel.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR