Message-ID: <ZJIuG9hyTYIIDbF5@google.com>
Date:   Tue, 20 Jun 2023 22:54:19 +0000
From:   Edward Liaw <edliaw@...gle.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kernel test robot <lkp@...el.com>,
        Suwan Kim <suwan.kim027@...il.com>,
        "Roberts, Martin" <martin.roberts@...el.com>,
        Jason Wang <jasowang@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
        Jens Axboe <axboe@...nel.dk>,
        virtualization@...ts.linux-foundation.org,
        linux-block@...r.kernel.org
Subject: Re: [PATCH v2] Revert "virtio-blk: support completion batching for
 the IRQ path"

On Fri, Jun 09, 2023 at 03:27:24AM -0400, Michael S. Tsirkin wrote:
> This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
The reverted commit was also breaking kernel tests on a virtual Android device
(Cuttlefish).  We were seeing hung tasks like:

[ 2889.910733] INFO: task kworker/u8:2:6312 blocked for more than 720 seconds.
[ 2889.910967]       Tainted: G           OE      6.2.0-mainline-g5c05cafa8df7-ab9969617 #1
[ 2889.911143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2889.911389] task:kworker/u8:2    state:D stack:12160 pid:6312  ppid:2      flags:0x00004000
[ 2889.911567] Workqueue: writeback wb_workfn (flush-254:57)
[ 2889.911771] Call Trace:
[ 2889.911831]  <TASK>
[ 2889.911893]  __schedule+0x55f/0x880
[ 2889.912021]  schedule+0x6a/0xc0
[ 2889.912110]  schedule_timeout+0x58/0x1a0
[ 2889.912200]  wait_for_common+0xf7/0x1b0
[ 2889.912289]  wait_for_completion+0x1c/0x40
[ 2889.912377]  f2fs_issue_checkpoint+0x14c/0x210
[ 2889.912504]  f2fs_sync_fs+0x4c/0xb0
[ 2889.912597]  f2fs_balance_fs_bg+0x2f6/0x340
[ 2889.912736]  ? can_migrate_task+0x39/0x2b0
[ 2889.912872]  f2fs_write_node_pages+0x77/0x240
[ 2889.912989]  do_writepages+0xde/0x240
[ 2889.913094]  __writeback_single_inode+0x3f/0x360
[ 2889.913231]  writeback_sb_inodes+0x320/0x5f0
[ 2889.913349]  ? move_expired_inodes+0x58/0x210
[ 2889.913470]  __writeback_inodes_wb+0x97/0x100
[ 2889.913587]  wb_writeback+0x17e/0x390
[ 2889.913682]  wb_workfn+0x35f/0x500
[ 2889.913774]  process_one_work+0x1d9/0x3b0
[ 2889.913870]  worker_thread+0x251/0x410
[ 2889.913960]  kthread+0xeb/0x110
[ 2889.914052]  ? __cfi_worker_thread+0x10/0x10
[ 2889.914168]  ? __cfi_kthread+0x10/0x10
[ 2889.914257]  ret_from_fork+0x29/0x50
[ 2889.914364]  </TASK>
[ 2889.914565] INFO: task mkdir09:6425 blocked for more than 720 seconds.
[ 2889.916065]       Tainted: G           OE      6.2.0-mainline-g5c05cafa8df7-ab9969617 #1
[ 2889.916255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2889.916450] task:mkdir09         state:D stack:13016 pid:6425  ppid:6423   flags:0x00004000
[ 2889.916656] Call Trace:
[ 2889.916900]  <TASK>
[ 2889.917004]  __schedule+0x55f/0x880
[ 2889.917129]  schedule+0x6a/0xc0
[ 2889.917273]  schedule_timeout+0x58/0x1a0
[ 2889.917425]  wait_for_common+0xf7/0x1b0
[ 2889.917535]  wait_for_completion+0x1c/0x40
[ 2889.917670]  f2fs_issue_checkpoint+0x14c/0x210
[ 2889.917844]  f2fs_sync_fs+0x4c/0xb0
[ 2889.917969]  f2fs_do_sync_file+0x3a8/0x8c0
[ 2889.918090]  ? mt_find+0xa0/0x1a0
[ 2889.918216]  f2fs_sync_file+0x2f/0x60
[ 2889.918310]  vfs_fsync_range+0x74/0x90
[ 2889.918567]  __se_sys_msync+0x176/0x270
[ 2889.918730]  __x64_sys_msync+0x1c/0x40
[ 2889.918873]  do_syscall_64+0x53/0xb0
[ 2889.918996]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 2889.919178] RIP: 0033:0x7540b08bcf47
[ 2889.919297] RSP: 002b:00007fff5fcbeea8 EFLAGS: 00000206 ORIG_RAX: 000000000000001a
[ 2889.919496] RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007540b08bcf47
[ 2889.919828] RDX: 0000000000000004 RSI: 0000000000001000 RDI: 00007540b4ae7000
[ 2889.920227] RBP: 0000000000000000 R08: 721e0000010b0016 R09: 0000000000000003
[ 2889.920435] R10: 0000000000000100 R11: 0000000000000206 R12: 00005d2f95fd2f08
[ 2889.920793] R13: 00005d2f95fbc310 R14: 00007540b08e1bb8 R15: 00005d2f95fbc310
[ 2889.921072]  </TASK>

in random tests in the LTP (Linux Test Project) suite.
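
For anyone triaging similar logs, here is a minimal sketch of how the
hung-task reports above could be pulled out of a saved dmesg capture.  It is
only an illustration and not part of our test setup: the function name
summarize_hung_tasks is hypothetical, and the regular expressions assume the
log format shown in the traces above.

```python
import re

# Header line printed by the kernel's hung-task detector, e.g.
# "[ 2889.910733] INFO: task kworker/u8:2:6312 blocked for more than 720 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Call-trace frame such as "[ 2889.912377]  f2fs_issue_checkpoint+0x14c/0x210".
# Frames marked "? " (unreliable) do not match and are skipped.
FRAME_RE = re.compile(r"\]\s+(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(log_text):
    """Return one (comm, pid, seconds, frames) tuple per hung-task report."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            # Start a new report; later frames are attached to it.
            current = (header["comm"], int(header["pid"]), int(header["secs"]), [])
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current[3].append(frame["sym"])
    return reports
```

Run over the log above, this should group both reports under their task
names, which makes it easier to see that each one is waiting in
f2fs_issue_checkpoint.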

> ---
> 
> Since v1:
> 	fix build error
> 
> Still completely untested as I'm traveling.
> Martin, Suwan, could you please test and report?
> Suwan if you have a better revert in mind pls post and
> I will be happy to drop this.
> 
> Thanks!
> 
This revert appears to have resolved the test failures for me.

Tested-by: Edward Liaw <edliaw@...gle.com>
