[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZFB9WaRmy28zYWE4@ovpn-8-16.pek2.redhat.com>
Date: Tue, 2 May 2023 11:02:49 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Theodore Ts'o <tytso@....edu>
Cc: Baokun Li <libaokun1@...wei.com>,
Matthew Wilcox <willy@...radead.org>,
linux-ext4@...r.kernel.org,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-block@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
Dave Chinner <dchinner@...hat.com>,
Eric Sandeen <sandeen@...hat.com>,
Christoph Hellwig <hch@....de>, Zhang Yi <yi.zhang@...hat.com>,
yangerkun <yangerkun@...wei.com>, ming.lei@...hat.com
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages
On Sat, Apr 29, 2023 at 12:56:03AM -0400, Theodore Ts'o wrote:
> On Sat, Apr 29, 2023 at 11:16:14AM +0800, Ming Lei wrote:
> >
> > bdi_unregister() is called in del_gendisk(), since bdi_register() has
> > to be called in add_disk() where major/minor is figured out.
> >
> > > problem is that the block device shouldn't just *vanish*, with the
> >
> > That looks not realistic, removable disk can be gone any time, and device
> > driver error handler often deletes disk as the last straw, and it shouldn't
> > be hard to observe such error.
>
> It's not realistic to think that the file system can write back any
> dirty pages, sure. At this point, the user has already yanked out the
> thumb drive, and the physical device is gone. However, various fields
> like bdi->dev shouldn't get deinitialized until after the
> s_ops->shutdown() function has returned.
Yeah, it makes sense.
>
> We need to give the file system a chance to shutdown any pending
> writebacks; otherwise, we could be racing with writeback happening in
> some other kernel thread, and while the I/O is certainly not going to
> suceed, it would be nice if attempts to write to the block device
> return an error, intead potentially causing the kernel to crash.
>
> The shutdown function might need to sleep while it waits for
> workqueues or kernel threads to exit, or while it iterates over all
> inodes and clears all of the dirty bits and/or drop all of the pages
> associated with the file system on the disconnected block device. So
> while this happens, I/O should just fail, and not result in a kernel
> BUG or oops.
>
> Once the s_ops->shutdown() has returned, then del_gendisk can shutdown
> and/or deallocate anything it wants, and if the file system tries to
> use the bdi after s_ops->shutdown() has returned, well, it deserves
> anything it gets.
>
> (Well, it would be nice if things didn't bug/oops in fs/buffer.c if
> there is no s_ops->shutdown() function, since there are a lot of
> legacy file systems that use the buffer cache and until we can add
> some kind of generic shutdown function to fs/libfs.c and make sure
One generic shutdown API is pretty nice, such as sb_force_shutdown() posted by Dave.
> that all of the legacy file systems that are likely to be used on a
> USB thumb drive are fixed, it would be nice if they were protected.
> At the very least, we should make that things are no worse than they
> currently are.)
Yeah, things won't be worse than now for legacy FS, given the generic
FS API could cover more generic FS cleanup, and block layer always calls
it before removing one disk.
Thanks,
Ming
Powered by blists - more mailing lists