Message-ID: <09659231-5e0e-2767-d180-4285fb6e12b3@sandisk.com>
Date: Tue, 9 Aug 2016 16:10:02 -0700
From: Bart Van Assche <bart.vanassche@...disk.com>
To: Oleg Nesterov <oleg@...hat.com>
CC: Peter Zijlstra <peterz@...radead.org>,
"mingo@...nel.org" <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"Johannes Weiner" <hannes@...xchg.org>, Neil Brown <neilb@...e.de>,
Michael Shaver <jmshaver@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
On 08/09/2016 11:48 AM, Bart Van Assche wrote:
> [ 1548.018115] sysrq: SysRq : Show Blocked State
> [ 1548.018210] task PC stack pid father
> [ 1548.018677] systemd-udevd D ffff8803a9f13be8 0 29908 483 0x00000000
> [ 1548.018792] ffff8803a9f13be8 ffffffff82584bd0 00ffffff8252b1b0 ffff88046f0569c0
> [ 1548.018961] ffff88016c98b140 ffff8800757bc9c0 ffff8803a9f14000 ffff88046f0569c0
> [ 1548.019131] 7fffffffffffffff ffffffff8161fcf0 ffff8803a9f13d50 ffff8803a9f13c00
> [ 1548.019316] Call Trace:
> [ 1548.019415] [<ffffffff8161f567>] schedule+0x37/0x90
> [ 1548.019464] [<ffffffff81623bbf>] schedule_timeout+0x27f/0x470
> [ 1548.019758] [<ffffffff8161e93f>] io_schedule_timeout+0x9f/0x110
> [ 1548.019808] [<ffffffff8161fd06>] bit_wait_io+0x16/0x60
> [ 1548.019856] [<ffffffff8161f996>] __wait_on_bit+0x56/0x80
> [ 1548.019906] [<ffffffff81152e1d>] wait_on_page_bit_killable+0xbd/0xc0
> [ 1548.020006] [<ffffffff81152f50>] generic_file_read_iter+0x130/0x770
> [ 1548.020158] [<ffffffff812134a0>] blkdev_read_iter+0x30/0x40
> [ 1548.020209] [<ffffffff811d266b>] __vfs_read+0xbb/0x130
> [ 1548.020258] [<ffffffff811d2a51>] vfs_read+0x91/0x130
> [ 1548.020305] [<ffffffff811d3dd4>] SyS_read+0x44/0xa0
> [ 1548.020354] [<ffffffff81624fa5>] entry_SYSCALL_64_fastpath+0x18/0xa8
(replying to my own e-mail)
The above call stack is probably caused by a missing I/O completion
somewhere in the I/O stack (not in ib_srp) and hence can be ignored in
the context of the discussion about __wait_on_bit_lock(). BTW, I have
made the following local change to abort_exclusive_wait() in the hope
that, if the new WARN_ONCE() statement triggers, it will provide more
information about why the __wait_on_bit_lock() hang happens:
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f0fdd8e..fad852d 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -280,6 +280,8 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 	__set_current_state(TASK_RUNNING);
 	spin_lock_irqsave(&q->lock, flags);
+	WARN_ONCE(!list_empty(&wait->task_list) && waitqueue_active(q),
+		  "mode = %#x\n", mode);
 	if (!list_empty(&wait->task_list))
 		list_del_init(&wait->task_list);
 	else if (waitqueue_active(q))
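
For readers who want to see when the added WARN_ONCE() condition can be true, here is a minimal user-space sketch of the branch shown in the hunk. It is an illustration only, not kernel code: model_abort(), enum abort_action and the simplified list helpers are hypothetical names introduced here, and the sketch assumes that waitqueue_active(q) amounts to "the queue's list is not empty". Locking, task state and the actual wake-up are left out.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

/* Insert entry n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink entry e and leave it pointing at itself, i.e. "empty". */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical labels for the three possible outcomes of the branch. */
enum abort_action { REMOVED_SELF, PASSED_WAKEUP, NOTHING_TO_DO };

/*
 * Hypothetical model of the branch in the hunk above. "queue" plays the
 * role of the wait queue head's list, "waiter" the role of
 * wait->task_list. *warned records whether the debug condition from the
 * patch would have been true on entry.
 */
static enum abort_action model_abort(struct list_head *queue,
				     struct list_head *waiter, bool *warned)
{
	*warned = !list_empty(waiter) && !list_empty(queue);

	if (!list_empty(waiter)) {
		/* Still queued: just take ourselves off the queue. */
		list_del_init(waiter);
		return REMOVED_SELF;
	} else if (!list_empty(queue)) {
		/* Already dequeued by a waker: the kernel code forwards
		 * the wake-up to another waiter at this point. */
		return PASSED_WAKEUP;
	}
	return NOTHING_TO_DO;
}
```

In this model a waiter that is still queued makes the queue non-empty by definition, so the condition is true exactly when the first branch is taken, and false whenever the wake-up would be passed on.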
Bart.