Date:   Tue, 13 Sep 2022 09:55:57 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Christoph Hellwig <hch@....de>
Cc:     Dusty Mabe <dusty@...tymabe.com>, Jens Axboe <axboe@...nel.dk>,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-raid@...r.kernel.org, ming.lei@...hat.com
Subject: Re: regression caused by block: freeze the queue earlier in
 del_gendisk

On Mon, Sep 12, 2022 at 09:16:18AM +0200, Christoph Hellwig wrote:
> On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
> > On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> > > On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> > > > It is a bit hard to associate the above commit with reported issue.
> > > 
> > > So the messages clearly are about something trying to open a device
> > > that went away at the block layer, but somehow does not get removed
> > > in time by udev (which seems to be a userspace bug in CoreOS).  But
> > > even with that we really should not hang.
> > 
> > Xiao Ni provides one script[1] which can reproduce the issue more or less.
> 
> I've run the reproducer 10000 times on current mainline, and while
> it prints one of the autoloading messages per run, I've not actually
> seen any kind of hang.

I can't reproduce the hang either.

What I meant is that a new raid disk can still be added by mdadm after
stopping the imsm container and raid disk, with the autoloading messages
printed. I understand this behavior isn't correct, but I am not familiar
enough with raid.

It might be related to the delayed deletion of the gendisk from the wq &
md kobj release handler.

During reboot, if mdadm keeps doing this stupid thing without stopping,
that could cause the hang.

I think the real question for the root cause is why mdadm keeps trying to
open/add a new raid bdev like crazy during reboot.


Thanks, 
Ming
