Message-ID: <787edd09-6f86-406c-a466-083366967723@bsbernd.com>
Date: Mon, 12 Jan 2026 11:40:04 +0100
From: Bernd Schubert <bernd@...ernd.com>
To: NeilBrown <neil@...wn.name>
Cc: Thorsten Leemhuis <linux@...mhuis.info>,
 Miklos Szeredi <miklos@...redi.hu>,
 Linux kernel regressions list <regressions@...ts.linux.dev>,
 LKML <linux-kernel@...r.kernel.org>,
 Linux-fsdevel <linux-fsdevel@...r.kernel.org>,
 Christian Brauner <brauner@...nel.org>
Subject: Re: [REGRESSION] fuse: xdg-document-portal gets stuck and causes
 suspend to fail in mainline



On 1/12/26 04:58, NeilBrown wrote:
> On Mon, 12 Jan 2026, Bernd Schubert wrote:
>>
>> On 1/11/26 12:37, Thorsten Leemhuis wrote:
>>> Lo! I can reliably get xdg-document-portal stuck on latest -mainline
>>> (and -next, too; 6.18.4 works fine) through the Signal flatpak, which
>>> then causes suspend to fail:
>>>
>>> """
>>>> [  194.439381] PM: suspend entry (s2idle)
>>>> [  194.454708] Filesystems sync: 0.015 seconds
>>>> [  194.696767] Freezing user space processes
>>>> [  214.700978] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
>>>> [  214.701143] task:xdg-document-po state:D stack:0     pid:2651  tgid:2651  ppid:1939   task_flags:0x400000 flags:0x00080002
>>>> [  214.701151] Call Trace:
>>>> [  214.701154]  <TASK>
>>>> [  214.701167]  __schedule+0x2b8/0x5e0
>>>> [  214.701181]  schedule+0x27/0x80
>>>> [  214.701188]  request_wait_answer+0xce/0x260 [fuse]
>>>> [  214.701202]  ? __pfx_autoremove_wake_function+0x10/0x10
>>>> [  214.701212]  __fuse_simple_request+0x120/0x340 [fuse]
>>>> [  214.701219]  fuse_lookup_name+0xc3/0x210 [fuse]
>>>> [  214.701235]  fuse_lookup+0x99/0x1c0 [fuse]
>>>> [  214.701242]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701247]  ? fuse_dentry_init+0x23/0x50 [fuse]
>>>> [  214.701257]  lookup_one_qstr_excl+0xa8/0xf0
>>
>> Introduced by c9ba789dad15 ("VFS: introduce start_creating_noperm() and
>> start_removing_noperm()")?
>>
>> Why is the new code doing a lookup on an entry that is about to be
>> invalidated?
>>
>>
>> In order to handle this at least one fuse server process needs to be
>> available, but for this specific case the lookup still doesn't make sense.
>>
>> We could do something like this:
>>
>> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
>> index 4b6b3d2758ff..7edbace7eddc 100644
>> --- a/fs/fuse/dir.c
>> +++ b/fs/fuse/dir.c
>> @@ -1599,6 +1599,15 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
>>         if (!dir)
>>                 goto put_parent;
>>
>> +       /* Check dcache first - if not cached, nothing to invalidate */
>> +       name->hash = full_name_hash(dir, name->name, name->len);
>> +       entry = d_lookup(dir, name);
>> +       if (!entry) {
>> +               err = 0;
>> +               dput(dir);
>> +               goto put_parent;
>> +       }
>> +
>>         entry = start_removing_noperm(dir, name);
>>         dput(dir);
>>         if (IS_ERR(entry))
>>
>>
>> But let's assume the dentry exists - start_removing_noperm() will now
>> trigger a revalidate and get the same issue. From my point of view the
>> above commit should be reverted for fuse.
>>
>>
>>>> [  214.701264]  start_removing_noperm+0x59/0x80
>>>> [  214.701268]  ? d_find_alias+0x82/0xd0
>>>> [  214.701273]  fuse_reverse_inval_entry+0x7d/0x1f0 [fuse]
>>>> [  214.701280]  ? fuse_copy_do+0x5f/0xa0 [fuse]
>>>> [  214.701287]  fuse_notify+0x4a1/0x750 [fuse]
>>>> [  214.701295]  ? iov_iter_get_pages2+0x1d/0x40
>>>> [  214.701301]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701305]  fuse_dev_do_write+0x2e4/0x440 [fuse]
>>>> [  214.701313]  fuse_dev_write+0x6b/0xa0 [fuse]
>>>> [  214.701320]  do_iter_readv_writev+0x161/0x260
>>>> [  214.701327]  vfs_writev+0x168/0x3c0
>>>> [  214.701334]  ? ksys_write+0xcd/0xf0
>>>> [  214.701338]  ? do_writev+0x7f/0x110
>>>> [  214.701341]  do_writev+0x7f/0x110
>>>> [  214.701344]  do_syscall_64+0x7e/0x6b0
>>>> [  214.701350]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701352]  ? __handle_mm_fault+0x445/0x690
>>>> [  214.701359]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701363]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701365]  ? count_memcg_events+0xd6/0x210
>>>> [  214.701371]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701373]  ? handle_mm_fault+0x212/0x340
>>>> [  214.701377]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701379]  ? do_user_addr_fault+0x2b4/0x7b0
>>>> [  214.701387]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701389]  ? irqentry_exit+0x6d/0x540
>>>> [  214.701393]  ? srso_alias_return_thunk+0x5/0xfbef5
>>>> [  214.701395]  ? exc_page_fault+0x7e/0x1a0
>>>> [  214.701398]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>> [  214.701402] RIP: 0033:0x7f3c144f9982
>>>> [  214.701467] RSP: 002b:00007fff80e2f388 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
>>>> [  214.701470] RAX: ffffffffffffffda RBX: 00007f3bec000cf0 RCX: 00007f3c144f9982
>>>> [  214.701472] RDX: 0000000000000003 RSI: 00007fff80e2f460 RDI: 0000000000000007
>>>> [  214.701474] RBP: 00007fff80e2f3b0 R08: 0000000000000000 R09: 0000000000000000
>>>> [  214.701475] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>>>> [  214.701477] R13: 00007f3bec000cf0 R14: 00007f3c14bb8280 R15: 00007f3be8001200
>>>> [  214.701481]  </TASK>
>>> """
>>>
>>> Killing the mentioned process using "kill -9" doesn't help. I can
>>> reliably trigger this in -mainline and -next using the Signal flatpak on
>>> Fedora 43 by trying to send a picture (which gets xdg-document-portal
>>> involved). It works the first time, but trying again won't and will
>>> cause Signal to get stuck for a few seconds. Works fine in 6.18.4.
>>>
>>> Is this maybe known already or does anybody have an idea what's wrong?
>>> If not I guess I'll have to bisect this.
>>>
>>> Ciao, Thorsten
>>>
>>> #regzbot introduced: v6.18..
>>> #regzbot title: fuse: xdg-document-portal gets stuck and causes suspend
>>> to fail
>>>
>>>
>>
>> Thanks,
>> Bernd
>>
> 
> I posted a fix
> 
>   https://lore.kernel.org/all/176454037897.634289.3566631742434963788@noble.neil.brown.name/
> 
> a while ago.  There was some talk in that thread of reverting the
> breaking change instead.  It seems nothing happened.
> 
> Christian: should I resend my patch?

This didn't go to linux-fsdevel, and I'm not subscribed to the other
lists. It might be the same for others. The patch looks good to me.

Reviewed-by: Bernd Schubert <bschubert@....com>
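
For readers following the thread, the "check the cache first" idea from the
quoted diff can be modeled in user space. This is only a rough sketch, not
kernel code: the toy_* names, the fixed-size array standing in for the dcache,
and the server_lookups counter are all hypothetical. It shows that an
invalidation which consults only the cache never needs a round trip to the
server. It does not model the remaining concern raised above, namely that a
cached entry may still be revalidated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the dcache: a small array of cached names. */
#define TOY_CACHE_CAP 8

struct toy_cache {
	const char *names[TOY_CACHE_CAP];
	size_t count;
	unsigned server_lookups;	/* round trips to the (imaginary) server */
};

/* Cache-only lookup, similar in spirit to d_lookup(): never asks the server. */
static int toy_cached_index(const struct toy_cache *c, const char *name)
{
	for (size_t i = 0; i < c->count; i++)
		if (strcmp(c->names[i], name) == 0)
			return (int)i;
	return -1;
}

/* Full lookup: on a cache miss, "ask the server" and cache the result. */
static int toy_lookup(struct toy_cache *c, const char *name)
{
	int idx = toy_cached_index(c, name);

	if (idx >= 0)
		return idx;
	c->server_lookups++;		/* this is the step that can block */
	assert(c->count < TOY_CACHE_CAP);
	c->names[c->count] = name;
	return (int)c->count++;
}

/*
 * Invalidate an entry by consulting the cache only.
 * Returns 1 if an entry was dropped, 0 if there was nothing to invalidate.
 */
static int toy_inval_entry(struct toy_cache *c, const char *name)
{
	int idx = toy_cached_index(c, name);

	if (idx < 0)
		return 0;		/* not cached: no server round trip */
	c->names[idx] = c->names[c->count - 1];
	c->count--;
	return 1;
}
```

Calling toy_inval_entry() on a name that was never looked up leaves
server_lookups at zero, which is the property the quoted diff aims for in
the uncached case.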


