Message-ID: <4af605b4-d4c9-0060-9a26-f9846d44a328@leemhuis.info>
Date: Tue, 8 Mar 2022 06:56:33 +0100
From: Thorsten Leemhuis <regressions@...mhuis.info>
To: Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>,
    Justin Sanders <justin@...aid.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    linux-block@...r.kernel.org,
    Chaitanya Kulkarni <chaitanya.kulkarni@....com>
Subject: Bug 215647 - aoe: removing aoe devices with flush (implicit in rmmod
    aoe) leads to page fault
Hi! As part of my regression tracking work I noticed this bug report
that was filed about a week ago:
https://bugzilla.kernel.org/show_bug.cgi?id=215647
To quote the first para:
> there is a bug in the aoe driver module between v4.20-rc1 and
> v5.14-rc1 introduced in 3582dd2 (aoe: convert aoeblk to blk-mq) and
> fixed in 6560ec9 (aoe: use blk_mq_alloc_disk and blk_cleanup_disk).
> Every forcible removal of an aoe device (eg. "rmmod aoe" with aoe
> devices available or "aoe-flush ex.x") leads to a page fault. This
> bug was successfully reproduced with kernel 5.10.92 from the debian
> repository, there were no changes to the affected code between
> v4.20-rc1 and v5.14-rc1. Version 4.19.208 (from debian buster) and
> 5.17-rc4 (from debian experimental) are confirmed not to be
> affected.
I checked the logs to see why mainline might not be affected anymore and
noticed a recent commit in the same area:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/block/aoe/aoedev.c?id=6560ec961a080944f8d5e1fef17b771bfaf189cb
> From 6560ec961a080944f8d5e1fef17b771bfaf189cb Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@....de>
> Date: Wed, 2 Jun 2021 09:53:31 +0300
> Subject: aoe: use blk_mq_alloc_disk and blk_cleanup_disk
>
> Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
> request_queue allocation.
>
> Signed-off-by: Christoph Hellwig <hch@....de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@....com>
> Link: https://lore.kernel.org/r/20210602065345.355274-17-hch@lst.de
> Signed-off-by: Jens Axboe <axboe@...nel.dk>
> ---
> drivers/block/aoe/aoedev.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> (limited to 'drivers/block/aoe/aoedev.c')
>
> diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
> index e2ea2356da061..c5753c6bfe804 100644
> --- a/drivers/block/aoe/aoedev.c
> +++ b/drivers/block/aoe/aoedev.c
> @@ -277,9 +277,8 @@ freedev(struct aoedev *d)
>  	if (d->gd) {
>  		aoedisk_rm_debugfs(d);
>  		del_gendisk(d->gd);
> -		put_disk(d->gd);
> +		blk_cleanup_disk(d->gd);
>  		blk_mq_free_tag_set(&d->tag_set);
> -		blk_cleanup_queue(d->blkq);
>  	}
>  	t = d->targets;
>  	e = t + d->ntargets;
Does that need backporting? Or is the patch the reporter provided in
bugzilla the easier and safer way to fix that regression in older releases?
Ciao, Thorsten