Date: Fri, 5 May 2023 10:06:28 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Keith Busch <kbusch@...nel.org>
Cc: Theodore Ts'o <tytso@....edu>, linux-ext4@...r.kernel.org,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-block@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
Dave Chinner <dchinner@...hat.com>,
Eric Sandeen <sandeen@...hat.com>,
Christoph Hellwig <hch@....de>, Zhang Yi <yi.zhang@...hat.com>,
ming.lei@...hat.com
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages
On Thu, May 04, 2023 at 09:59:52AM -0600, Keith Busch wrote:
> On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
> > Hello Guys,
> >
> > I got a report in which buffered write IO hangs in balance_dirty_pages
> > after one nvme block device is unplugged physically, and then umount
> > can't succeed.
> >
> > It turns out to be a long-standing issue, and it can be triggered at
> > least from v5.14 up to the latest v6.3.
> >
> > And the issue can be reproduced reliably in a KVM guest:
> >
> > 1) run the following script inside guest:
> >
> > mkfs.ext4 -F /dev/nvme0n1
> > mount /dev/nvme0n1 /mnt
> > dd if=/dev/zero of=/mnt/z.img&
> > sleep 10
> > echo 1 > /sys/block/nvme0n1/device/device/remove
> >
> > 2) dd hang is observed and /dev/nvme0n1 is gone actually
>
> Sorry to jump in so late.
>
> For an ungraceful nvme removal, like a surprise hot unplug, the driver
> sets the capacity to 0, and that effectively ends all dirty page writers
> that could stall forward progress on the removal. That 0 capacity
> should also cause 'dd' to exit.
Actually the nvme device is already gone, and the hang happens in
balance_dirty_pages() called from generic_perform_write().
The issue can be triggered on any kind of disk that can be hot-unplugged,
and it can be reproduced easily on both ublk and nvme.
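To make the hang point concrete, here is a rough sketch of the buffered
write path as I understand it (simplified and paraphrased, not literal
kernel code; only the function names are real):

    generic_perform_write()
      -> copy user data into the page cache and dirty the pages
      -> balance_dirty_pages_ratelimited(mapping)
           -> balance_dirty_pages(wb, pages_dirtied)
                /*
                 * Loops while the bdi is over its dirty thresholds,
                 * sleeping and waiting for writeback to clean pages.
                 * With the device gone, the dirty pages are never
                 * cleaned, so the loop never exits and dd never
                 * returns from write(2).
                 */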
>
> But this is not an ungraceful removal, so we're not getting that forced
> behavior. Could we use the same capacity trick here after flushing any
> outstanding dirty pages?
set_capacity(0) has been called in del_gendisk() after fsync_bdev() &
__invalidate_device(), but I understand the FS code just tries its best to
flush dirty pages. And when the bdev is gone, those un-flushed dirty pages
still need to be cleaned up, otherwise they can't be used any more.
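For reference, the ordering I am referring to looks roughly like this
(paraphrased sketch, not copied from any particular kernel version):

    void del_gendisk(struct gendisk *disk)
    {
            ...
            /* best-effort flush of dirty pages on this bdev */
            fsync_bdev(disk->part0);
            /* invalidate cached pages/inodes of the bdev */
            __invalidate_device(disk->part0, true);
            ...
            /* new writers now see a zero-sized disk */
            set_capacity(disk, 0);
            ...
    }

Any dirty pages that the best-effort flush does not get out are exactly the
ones that still need cleanup once the bdev is gone.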
Thanks,
Ming