linux-kernel - deadlock in wbt


Message-ID: <CAATkVEwKyUHWktL5PZ7Dqry_DQk9XSXzFg0s24XcUT2ftm=ZSA@mail.gmail.com>
Date:   Wed, 15 Aug 2018 15:59:59 -0400
From:   Debabrata Banerjee <dbavatar@...il.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     linux-kernel@...r.kernel.org, linux-block@...r.kernel.org
Subject: deadlock in wbt_wait()

I believe I've found a problem in the wbt code: when switching
elevators, any blk requests that were throttled never wake up after the
change. You can easily reproduce this by running some dd writers and
then switching between noop and cfq repeatedly. You should get a hung
dd task with a stack similar to the one below. I'm attempting a patch
to wake up waiters during the switch, but nothing is working yet. I'm
also confused about why we call wbt_disable_default(q) only in the
cfq/bfq elevators, as opposed to somewhere generic such as
elevator_switch() (looking at 4.14.59).
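
The reproduction steps described above could be scripted roughly as follows. This is only a sketch under assumptions not stated in the mail: the number of writers, the file names, the switch interval, and the helper names (scheduler_path, switch_scheduler, toggle, start_writers) are all made up for illustration. The only kernel interface it relies on is the standard /sys/block/<dev>/queue/scheduler file.

```python
#!/usr/bin/env python3
"""Sketch of a reproducer: dd writers plus repeated elevator switches."""
import subprocess
import sys
import time


def scheduler_path(dev):
    # Standard sysfs location of the I/O scheduler selector for a block device.
    return f"/sys/block/{dev}/queue/scheduler"


def switch_scheduler(path, name):
    # Writing a scheduler name to the sysfs file requests an elevator switch.
    with open(path, "w") as f:
        f.write(name)


def toggle(path, names=("noop", "cfq"), rounds=100, delay=0.5):
    # Alternate between the given schedulers; returns the number of switches.
    # The rounds and delay defaults are arbitrary (assumption).
    for i in range(rounds):
        switch_scheduler(path, names[i % len(names)])
        time.sleep(delay)
    return rounds


def start_writers(target_dir, count=4, size_mb=1024):
    # Plain buffered dd writers; count, size and file names are assumptions.
    return [
        subprocess.Popen(
            ["dd", "if=/dev/zero", f"of={target_dir}/wbt-test.{n}",
             "bs=1M", f"count={size_mb}"],
            stderr=subprocess.DEVNULL,
        )
        for n in range(count)
    ]


def main(dev, target_dir):
    writers = start_writers(target_dir)
    toggle(scheduler_path(dev))
    for w in writers:
        try:
            # A writer that stays throttled never exits, so wait() times out.
            w.wait(timeout=60)
        except subprocess.TimeoutExpired:
            print(f"dd pid {w.pid} still running; check /proc/{w.pid}/stack")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])  # e.g. sda /mnt/test
```

It would need to run as root, with a device name and a directory on a filesystem backed by that device as arguments. The noop and cfq names assume the legacy (non-mq) block layer, as in the 4.14 kernel mentioned above.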

[<ffffffff82095632>] io_schedule+0x12/0x40
[<ffffffff823a7b47>] wbt_wait+0x1a7/0x360
[<ffffffff82374c49>] blk_queue_bio+0xf9/0x3e0
[<ffffffff82373050>] generic_make_request+0x100/0x280
[<ffffffff8237323c>] submit_bio+0x6c/0x140
[<ffffffffa01d8b88>] ext4_io_submit+0x48/0x60 [ext4]
[<ffffffffa01c098f>] ext4_writepages+0x68f/0xe40 [ext4]
[<ffffffff821782aa>] do_writepages+0x1a/0x60
[<ffffffff8216a1c7>] __filemap_fdatawrite_range+0xa7/0xe0
[<ffffffffa01af8e2>] ext4_release_file+0x72/0xc0 [ext4]
[<ffffffff821ee5e5>] __fput+0xa5/0x220
[<ffffffff820880a0>] task_work_run+0x80/0xa0
[<ffffffff820016e0>] exit_to_usermode_loop+0xb0/0xc0
[<ffffffff82001d24>] do_syscall_64+0x104/0x120
[<ffffffff82800081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<ffffffffffffffff>] 0xffffffffffffffff
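
To check whether a given dd task is stuck in the same place, a stack in the format above can be scanned for wbt_wait. The following is a minimal sketch; the helper names (frame_symbols, stuck_in_wbt, read_stack) are hypothetical, and it assumes /proc/<pid>/stack is available (root access and a kernel built with stack trace support).

```python
import re

# Matches frames such as "[<ffffffff823a7b47>] wbt_wait+0x1a7/0x360".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frame_symbols(stack_text):
    # Return the symbol names of all frames, outermost call last.
    return FRAME_RE.findall(stack_text)


def stuck_in_wbt(stack_text):
    # True if the task is currently somewhere below wbt_wait().
    return "wbt_wait" in frame_symbols(stack_text)


def read_stack(pid):
    # Kernel stack of a task; usually readable by root only.
    with open(f"/proc/{pid}/stack") as f:
        return f.read()
```

For example, stuck_in_wbt(read_stack(pid)) could be called for each dd pid that has not exited after the elevator switches stop.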

Actually, if I run this test enough times I sometimes get a panic. I
assume that's due to a disk completion arriving in the wrong place, and
it may not be related to wbt.

[  804.546000] RIP: 0010:run_timer_softirq+0xf2/0x1d0
[  804.551163] RSP: 0018:ffff88105f443f00 EFLAGS: 00010002
[  804.556753] RAX: 00000001003e0002 RBX: ffff88085782de90 RCX: ffff88085782de90
[  804.564269] RDX: ffff88105f443f00 RSI: ffff88105f4596a8 RDI: ffff88105f443f08
[  804.571781] RBP: 0000000000000000 R08: ffff88105f459958 R09: ffff88105f443f08
[  804.579297] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88105f459680
[  804.586819] R13: ffff88105f443f00 R14: 0000000000000000 R15: ffff88105f4596f0
[  804.594314] FS:  0000000000000000(0000) GS:ffff88105f440000(0000) knlGS:0000000000000000
[  804.603102] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  804.609196] CR2: 00000001003e000a CR3: 000000000300a001 CR4: 00000000001606e0
[  804.616684] Call Trace:
[  804.619520]  <IRQ>
[  804.621913]  ? timerqueue_add+0x54/0x80
[  804.626105]  ? enqueue_hrtimer+0x38/0x90
[  804.630379]  __do_softirq+0xf1/0x296
[  804.634323]  irq_exit+0x76/0x80
[  804.637830]  smp_apic_timer_interrupt+0x70/0x130
[  804.642827]  apic_timer_interrupt+0x7d/0x90
[  804.647379]  </IRQ>