Message-ID: <20260112-textil-bepflanzen-c6225a477747@brauner>
Date: Mon, 12 Jan 2026 10:45:49 +0100
From: Christian Brauner <brauner@...nel.org>
To: NeilBrown <neil@...wn.name>
Cc: Bernd Schubert <bernd@...ernd.com>,
Thorsten Leemhuis <linux@...mhuis.info>, Miklos Szeredi <miklos@...redi.hu>,
Linux kernel regressions list <regressions@...ts.linux.dev>, LKML <linux-kernel@...r.kernel.org>,
Linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [REGRESSION] fuse: xdg-document-portal gets stuck and causes suspend to fail in mainline

On Mon, Jan 12, 2026 at 02:58:20PM +1100, NeilBrown wrote:
> On Mon, 12 Jan 2026, Bernd Schubert wrote:
> >
> > On 1/11/26 12:37, Thorsten Leemhuis wrote:
> > > Lo! I can reliably get xdg-document-portal stuck on latest -mainline
> > > (and -next, too; 6.18.4 works fine) through the Signal flatpak, which
> > > then causes suspend to fail:
> > >
> > > """
> > >> [ 194.439381] PM: suspend entry (s2idle)
> > >> [ 194.454708] Filesystems sync: 0.015 seconds
> > >> [ 194.696767] Freezing user space processes
> > >> [ 214.700978] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
> > >> [ 214.701143] task:xdg-document-po state:D stack:0 pid:2651 tgid:2651 ppid:1939 task_flags:0x400000 flags:0x00080002
> > >> [ 214.701151] Call Trace:
> > >> [ 214.701154] <TASK>
> > >> [ 214.701167] __schedule+0x2b8/0x5e0
> > >> [ 214.701181] schedule+0x27/0x80
> > >> [ 214.701188] request_wait_answer+0xce/0x260 [fuse]
> > >> [ 214.701202] ? __pfx_autoremove_wake_function+0x10/0x10
> > >> [ 214.701212] __fuse_simple_request+0x120/0x340 [fuse]
> > >> [ 214.701219] fuse_lookup_name+0xc3/0x210 [fuse]
> > >> [ 214.701235] fuse_lookup+0x99/0x1c0 [fuse]
> > >> [ 214.701242] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701247] ? fuse_dentry_init+0x23/0x50 [fuse]
> > >> [ 214.701257] lookup_one_qstr_excl+0xa8/0xf0
> >
> > Introduced by c9ba789dad15 ("VFS: introduce start_creating_noperm() and
> > start_removing_noperm()")?
> >
> > Why is the new code doing a lookup on an entry that is about to be
> > invalidated?
> >
> >
> > In order to handle this at least one fuse server process needs to be
> > available, but for this specific case the lookup still doesn't make sense.
> >
> > We could do something like this
> >
> > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> > index 4b6b3d2758ff..7edbace7eddc 100644
> > --- a/fs/fuse/dir.c
> > +++ b/fs/fuse/dir.c
> > @@ -1599,6 +1599,15 @@ int fuse_reverse_inval_entry(struct fuse_conn
> > *fc, u64 parent_nodeid,
> > if (!dir)
> > goto put_parent;
> >
> > + /* Check dcache first - if not cached, nothing to invalidate */
> > + name->hash = full_name_hash(dir, name->name, name->len);
> > + entry = d_lookup(dir, name);
> > + if (!entry) {
> > + err = 0;
> > + dput(dir);
> > + goto put_parent;
> > + }
> > +
> > entry = start_removing_noperm(dir, name);
> > dput(dir);
> > if (IS_ERR(entry))
> >
> >
> > But let's assume the dentry exists - start_removing_noperm() will now
> > trigger a revalidate and get the same issue. From my point of view the
> > above commit should be reverted for fuse.
> >
> >
> > >> [ 214.701264] start_removing_noperm+0x59/0x80
> > >> [ 214.701268] ? d_find_alias+0x82/0xd0
> > >> [ 214.701273] fuse_reverse_inval_entry+0x7d/0x1f0 [fuse]
> > >> [ 214.701280] ? fuse_copy_do+0x5f/0xa0 [fuse]
> > >> [ 214.701287] fuse_notify+0x4a1/0x750 [fuse]
> > >> [ 214.701295] ? iov_iter_get_pages2+0x1d/0x40
> > >> [ 214.701301] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701305] fuse_dev_do_write+0x2e4/0x440 [fuse]
> > >> [ 214.701313] fuse_dev_write+0x6b/0xa0 [fuse]
> > >> [ 214.701320] do_iter_readv_writev+0x161/0x260
> > >> [ 214.701327] vfs_writev+0x168/0x3c0
> > >> [ 214.701334] ? ksys_write+0xcd/0xf0
> > >> [ 214.701338] ? do_writev+0x7f/0x110
> > >> [ 214.701341] do_writev+0x7f/0x110
> > >> [ 214.701344] do_syscall_64+0x7e/0x6b0
> > >> [ 214.701350] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701352] ? __handle_mm_fault+0x445/0x690
> > >> [ 214.701359] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701363] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701365] ? count_memcg_events+0xd6/0x210
> > >> [ 214.701371] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701373] ? handle_mm_fault+0x212/0x340
> > >> [ 214.701377] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701379] ? do_user_addr_fault+0x2b4/0x7b0
> > >> [ 214.701387] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701389] ? irqentry_exit+0x6d/0x540
> > >> [ 214.701393] ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [ 214.701395] ? exc_page_fault+0x7e/0x1a0
> > >> [ 214.701398] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > >> [ 214.701402] RIP: 0033:0x7f3c144f9982
> > >> [ 214.701467] RSP: 002b:00007fff80e2f388 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
> > >> [ 214.701470] RAX: ffffffffffffffda RBX: 00007f3bec000cf0 RCX: 00007f3c144f9982
> > >> [ 214.701472] RDX: 0000000000000003 RSI: 00007fff80e2f460 RDI: 0000000000000007
> > >> [ 214.701474] RBP: 00007fff80e2f3b0 R08: 0000000000000000 R09: 0000000000000000
> > >> [ 214.701475] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > >> [ 214.701477] R13: 00007f3bec000cf0 R14: 00007f3c14bb8280 R15: 00007f3be8001200
> > >> [ 214.701481] </TASK>
> > > """
> > >
> > > Killing the mentioned process using "kill -9" doesn't help. I can
> > > reliably trigger this in -mainline and -next using the Signal flatpak on
> > > Fedora 43 by trying to send a picture (which gets xdg-document-portal
> > > involved). It works the first time, but trying again won't and will
> > > cause Signal to get stuck for a few seconds. Works fine in 6.18.4.
> > >
> > > Is this maybe known already or does anybody have an idea what's wrong?
> > > If not I guess I'll have to bisect this.
> > >
> > > Ciao, Thorsten
> > >
> > > #regzbot introduced: v6.18..
> > > #regzbot title: fuse: xdg-document-portal gets stuck and causes suspend
> > > to fail
> > >
> > >
> >
> > Thanks,
> > Bernd
> >
>
> I posted a fix
>
> https://lore.kernel.org/all/176454037897.634289.3566631742434963788@noble.neil.brown.name/
>
> a while ago. There was some talk in that thread of reverting the
> breaking change instead. It seems nothing happened.

I pinged a bunch of times but nobody ever responded.
So then let's just apply your patch. I picked it up.
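
For readers following the thread, here is a rough user-space sketch of the point Bernd raises above: an invalidation that goes through a full lookup may need a round trip to the server, while a cache-only probe does not. This is a toy model under stated assumptions, not kernel code and not the fix that was applied. Every name in it (toy_cache, toy_cache_find, toy_full_lookup, toy_inval_via_lookup, toy_inval_cache_only) is hypothetical, and the "server request" is only a counter standing in for a FUSE_LOOKUP that the server, blocked in its own write, could not answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */
#define CACHE_SLOTS 8

struct toy_cache {
	const char *names[CACHE_SLOTS];	/* NULL means the slot is empty */
	int server_requests;		/* round trips a lookup would need */
};

/* Cache-only probe (loosely analogous to d_lookup()): never asks the server. */
static int toy_cache_find(const struct toy_cache *c, const char *name)
{
	for (int i = 0; i < CACHE_SLOTS; i++)
		if (c->names[i] && strcmp(c->names[i], name) == 0)
			return i;
	return -1;
}

/* Add a name to the first free slot; returns false if the cache is full. */
static bool toy_cache_add(struct toy_cache *c, const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (!c->names[i]) {
			c->names[i] = name;
			return true;
		}
	}
	return false;
}

/*
 * Full lookup: on a cache miss the server has to be asked. Here that is
 * only counted; in the reported hang the caller *is* the server, so the
 * request could never be answered.
 */
static int toy_full_lookup(struct toy_cache *c, const char *name)
{
	int slot = toy_cache_find(c, name);

	if (slot < 0)
		c->server_requests++;
	return slot;
}

/* Invalidate via the full lookup path: may require a server round trip. */
static void toy_inval_via_lookup(struct toy_cache *c, const char *name)
{
	int slot = toy_full_lookup(c, name);

	if (slot >= 0)
		c->names[slot] = NULL;
}

/* Invalidate using the cache only: a miss means there is nothing to do. */
static void toy_inval_cache_only(struct toy_cache *c, const char *name)
{
	int slot = toy_cache_find(c, name);

	if (slot >= 0)
		c->names[slot] = NULL;
}
```

Calling toy_inval_cache_only() on a name that is not cached leaves server_requests untouched, whereas toy_inval_via_lookup() on the same name increments it. The model deliberately ignores the second problem mentioned in the thread: when the entry is cached, the real lookup path can still trigger a revalidate that needs the server.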