Message-ID: <490be4e0b984e146c93586507442de3dad8694bb.camel@mediatek.com>
Date:   Mon, 13 Jun 2022 17:29:06 +0800
From:   Ed Tsai <ed.tsai@...iatek.com>
To:     Miklos Szeredi <miklos@...redi.hu>
CC:     "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        chenguanyou <chenguanyou9338@...il.com>,
        Stanley Chu (朱原陞) <stanley.chu@...iatek.com>,
        Yong-xuan Wang (王詠萱) <Yong-xuan.Wang@...iatek.com>
Subject: Re: [PATCH] [fuse] alloc_page nofs avoid deadlock

On Mon, 2022-06-13 at 16:45 +0800, Miklos Szeredi wrote:
> On Fri, 10 Jun 2022 at 09:48, Ed Tsai <ed.tsai@...iatek.com> wrote:
> 
> > Recently, we hit this deadlock issue again.
> > fuse_flush_time_update() uses sync_inode_metadata(), which only
> > writes the metadata, so the writeback worker can still be blocked
> > because of file data.
> > 
> > I tried using write_inode_now() instead of sync_inode_metadata(),
> > and the writeback thread is no longer blocked. I don't think this
> > is a good solution, but it confirms that there is still a potential
> > deadlock because of file data. WDYT?
> 
> I'm not sure how that happens.  Normally writeback doesn't block.
> Can you provide the stack traces of all related tasks in the deadlock?
> 
> Thanks,
> Miklos
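
To be concrete about the experiment quoted above, the change is
essentially the following (illustrative sketch only;
flush_times_experiment() is a made-up helper, not the actual
fs/fuse/file.c hunk):

#include <linux/fs.h>

/*
 * Illustrative sketch of the sync_inode_metadata() ->
 * write_inode_now() experiment; not the real fuse code.
 */
static int flush_times_experiment(struct inode *inode)
{
	/*
	 * Metadata-only sync, as fuse_flush_time_update() does today:
	 * dirty file data is left for a later writeback pass.
	 *
	 *	return sync_inode_metadata(inode, 1);
	 */

	/*
	 * The experiment: write dirty data pages as well as the
	 * metadata and wait for both, so in my test the flusher no
	 * longer ends up blocked behind this inode's data.
	 */
	return write_inode_now(inode, 1);
}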

The writeback worker:
ppid=22915 pid=22915 S cpu=6 prio=120 wait=3614s kworker/u16:21
vmlinux  request_wait_answer + 64
vmlinux  __fuse_request_send + 328
vmlinux  fuse_request_send + 60
vmlinux  fuse_simple_request + 376
vmlinux  fuse_flush_times + 276
vmlinux  fuse_write_inode + 104 (inode=0xFFFFFFD516CC4780, ff=0)
vmlinux  write_inode + 384
vmlinux  __writeback_single_inode + 960
vmlinux  writeback_sb_inodes + 892
vmlinux  __writeback_inodes_wb + 156
vmlinux  wb_writeback + 512
vmlinux  wb_check_background_flush + 600
vmlinux  wb_do_writeback + 644
vmlinux  wb_workfn + 756
vmlinux  process_one_work + 628
vmlinux  worker_thread + 708
vmlinux  kthread + 376
vmlinux  ret_from_fork + 16

Thread-11:
ppid=3961 pid=26057 D cpu=4 prio=120 wait=3614s Thread-11
vmlinux  __inode_wait_for_writeback + 108
vmlinux  inode_wait_for_writeback + 156
vmlinux  evict + 160
vmlinux  iput_final + 292
vmlinux  iput + 600
vmlinux  dentry_unlink_inode + 212
vmlinux  __dentry_kill + 228
vmlinux  shrink_dentry_list + 408
vmlinux  prune_dcache_sb + 80
vmlinux  super_cache_scan + 272
vmlinux  do_shrink_slab + 944
vmlinux  shrink_slab + 1104
vmlinux  shrink_node + 712
vmlinux  shrink_zones + 188
vmlinux  do_try_to_free_pages + 348
vmlinux  try_to_free_pages + 848
vmlinux  __perform_reclaim + 64
vmlinux  __alloc_pages_direct_reclaim + 64
vmlinux  __alloc_pages_slowpath + 1296
vmlinux  __alloc_pages_nodemask + 2004
vmlinux  __alloc_pages + 16
vmlinux  __alloc_pages_node + 16
vmlinux  alloc_pages_node + 16
vmlinux  __read_swap_cache_async + 172
vmlinux  read_swap_cache_async + 12
vmlinux  swapin_readahead + 328
vmlinux  do_swap_page + 844
vmlinux  handle_pte_fault + 268
vmlinux  __handle_speculative_fault + 548
vmlinux  handle_speculative_fault + 44
vmlinux  do_page_fault + 500
vmlinux  do_translation_fault + 64
vmlinux  do_mem_abort + 72
vmlinux  el0_sync + 1032

ppid=3961 is com.google.android.providers.media.module, which is the
Android FUSE daemon.

So the daemon and the writeback worker are waiting for each other: the
worker is blocked in request_wait_answer() until the daemon replies,
while the daemon thread is stuck in direct reclaim, waiting for
writeback on a fuse inode to finish.
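
For comparison, making the page allocation NOFS (whether via GFP_NOFS
or the scoped API) only shields reclaim that is entered from that
allocation. A generic sketch of the scoped pattern (illustrative only;
alloc_page_nofs_scoped() is made up and this is not the actual diff
from this thread):

#include <linux/gfp.h>
#include <linux/sched/mm.h>

/*
 * Generic scoped-NOFS pattern: any allocation inside the window is
 * implicitly treated as GFP_NOFS, so direct reclaim triggered from it
 * will not recurse into filesystem writeback or inode eviction.
 */
static struct page *alloc_page_nofs_scoped(void)
{
	unsigned int flags = memalloc_nofs_save();
	struct page *page = alloc_page(GFP_KERNEL);

	memalloc_nofs_restore(flags);
	return page;
}

As far as I can tell, the reclaim in the Thread-11 trace comes from a
user page fault in the daemon, so it is not inside any such window,
which is why this deadlock can still happen.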

Best,
Ed Tsai
