[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20221103013937.603626-1-khazhy@google.com>
Date: Wed, 2 Nov 2022 18:39:37 -0700
From: Khazhismel Kumykov <khazhy@...omium.org>
To: Paolo Valente <paolo.valente@...aro.org>,
Jens Axboe <axboe@...nel.dk>
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
Khazhismel Kumykov <khazhy@...gle.com>
Subject: [RFC PATCH] bfq: fix waker_bfqq inconsistency crash
This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
but woken_list_node still being hashed. This would happen when
bfq_init_rq() expects a brand new allocated queue to be returned from
bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
without resetting woken_list_node. Since we can always return oom_bfqq
when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
We must either reset woken_list_node, or avoid setting woken_list at all
for oom_bfqq - opt to do the former.
Crashes would have a stacktrace like:
[160595.656560] bfq_add_bfqq_busy+0x110/0x1ec
[160595.661142] bfq_add_request+0x6bc/0x980
[160595.666602] bfq_insert_request+0x8ec/0x1240
[160595.671762] bfq_insert_requests+0x58/0x9c
[160595.676420] blk_mq_sched_insert_request+0x11c/0x198
[160595.682107] blk_mq_submit_bio+0x270/0x62c
[160595.686759] __submit_bio_noacct_mq+0xec/0x178
[160595.691926] submit_bio+0x120/0x184
[160595.695990] ext4_mpage_readpages+0x77c/0x7c8
[160595.701026] ext4_readpage+0x60/0xb0
[160595.705158] filemap_read_page+0x54/0x114
[160595.711961] filemap_fault+0x228/0x5f4
[160595.716272] do_read_fault+0xe0/0x1f0
[160595.720487] do_fault+0x40/0x1c8
Tested by injecting random failures into bfq_get_queue, crashes go away
completely.
Fixes: 8ef3fc3a043c ("block, bfq: make shared queues inherit wakers")
Signed-off-by: Khazhismel Kumykov <khazhy@...gle.com>
---
RFC mainly because it's not clear to me the best policy here - but the
patch is tested and fixes a real crash we started seeing in 5.15
This is following up my ramble over at
https://lore.kernel.org/lkml/CACGdZYLMnfcqwbAXDx+x9vUOMn2cz55oc+8WySBS3J2Xd_q7Lg@mail.gmail.com/
block/bfq-iosched.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7ea427817f7f..5d2861119d20 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6793,7 +6793,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
* reset. So insert new_bfqq into the
* woken_list of the waker. See
* bfq_check_waker for details.
+ *
+ * Also, if we got oom_bfqq, we must check if
+ * it's already in a woken_list
*/
+ if (unlikely(!hlist_unhashed(&bfqq->woken_list_node)))
+ hlist_del_init(&bfqq->woken_list_node);
if (bfqq->waker_bfqq)
hlist_add_head(&bfqq->woken_list_node,
&bfqq->waker_bfqq->woken_list);
--
2.38.1.273.g43a17bfeac-goog
Powered by blists - more mailing lists