Message-ID: <4af605b4-d4c9-0060-9a26-f9846d44a328@leemhuis.info>
Date:   Tue, 8 Mar 2022 06:56:33 +0100
From:   Thorsten Leemhuis <regressions@...mhuis.info>
To:     Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>,
        Justin Sanders <justin@...aid.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-block@...r.kernel.org,
        Chaitanya Kulkarni <chaitanya.kulkarni@....com>
Subject: Bug 215647 - aoe: removing aoe devices with flush (implicit in rmmod
 aoe) leads to page fault

Hi! As part of my regression tracking work I noticed this bug report
that was filed about a week ago:

https://bugzilla.kernel.org/show_bug.cgi?id=215647

To quote the first para:

> there is a bug in the aoe driver module between v4.20-rc1 and
> v5.14-rc1, introduced in 3582dd2 (aoe: convert aoeblk to blk-mq) and
> fixed in 6560ec9 (aoe: use blk_mq_alloc_disk and blk_cleanup_disk).
> Every forcible removal of an aoe device (e.g. "rmmod aoe" with aoe
> devices available or "aoe-flush ex.x") leads to a page fault. This
> bug was successfully reproduced with kernel 5.10.92 from the Debian
> repository; there were no changes to the affected code between
> v4.20-rc1 and v5.14-rc1. Version 4.19.208 (from Debian buster) and
> 5.17-rc4 (from Debian experimental) are confirmed not to be
> affected.
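
To make the version range from the report concrete, here is a minimal
sketch of a check for whether a given release string falls between the
two commits. The function name aoe_flush_bug_expected() is hypothetical
and not part of any kernel tooling. It only looks at major.minor, so it
assumes the affected code was not changed by backports in stable or
distribution kernels.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if a release string such as "5.10.92" or "5.17-rc4" falls in
 * the range described in the report (3582dd2 went into v4.20-rc1, 6560ec9
 * into v5.14-rc1), 0 if it does not, and -1 if the string cannot be parsed.
 *
 * Assumption: only major.minor is inspected, so stable or distribution
 * backports that touch the aoe driver are not taken into account.
 */
int aoe_flush_bug_expected(const char *release)
{
	int major, minor;

	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return -1;

	/* Before v4.20: aoeblk had not been converted to blk-mq yet. */
	if (major < 4 || (major == 4 && minor < 20))
		return 0;

	/* v5.14 and later: these releases contain 6560ec9. */
	if (major > 5 || (major == 5 && minor >= 14))
		return 0;

	return 1;
}
```

With the three kernels named in the report, this returns 1 for
"5.10.92" and 0 for both "4.19.208" and "5.17-rc4".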

I checked the logs to see why mainline might not be affected anymore and
noticed a recent commit in the same area:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/block/aoe/aoedev.c?id=6560ec961a080944f8d5e1fef17b771bfaf189cb

> From 6560ec961a080944f8d5e1fef17b771bfaf189cb Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@....de>
> Date: Wed, 2 Jun 2021 09:53:31 +0300
> Subject: aoe: use blk_mq_alloc_disk and blk_cleanup_disk
> 
> Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
> request_queue allocation.
> 
> Signed-off-by: Christoph Hellwig <hch@....de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@....com>
> Link: https://lore.kernel.org/r/20210602065345.355274-17-hch@lst.de
> Signed-off-by: Jens Axboe <axboe@...nel.dk>
> ---
>  drivers/block/aoe/aoedev.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> (limited to 'drivers/block/aoe/aoedev.c')
> 
> diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
> index e2ea2356da061..c5753c6bfe804 100644
> --- a/drivers/block/aoe/aoedev.c
> +++ b/drivers/block/aoe/aoedev.c
> @@ -277,9 +277,8 @@ freedev(struct aoedev *d)
>  	if (d->gd) {
>  		aoedisk_rm_debugfs(d);
>  		del_gendisk(d->gd);
> -		put_disk(d->gd);
> +		blk_cleanup_disk(d->gd);
>  		blk_mq_free_tag_set(&d->tag_set);
> -		blk_cleanup_queue(d->blkq);
>  	}
>  	t = d->targets;
>  	e = t + d->ntargets;

Does that commit need backporting to the affected stable series? Or is
the patch the reporter provided in bugzilla the easier and safer way to
fix the regression in older releases?

Ciao, Thorsten
