linux-kernel - Re: [PATCH 10/11] writeback: splice dirty inode entries to default bdi on bdi

Date:	Thu, 17 Sep 2009 11:33:54 +0200
From:	Jan Kara <jack@...e.cz>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, chris.mason@...cle.com,
	hch@...radead.org, tytso@....edu, akpm@...ux-foundation.org,
	trond.myklebust@....uio.no
Subject: Re: [PATCH 10/11] writeback: splice dirty inode entries to default
	bdi on bdi_destroy()

On Wed 16-09-09 20:31:29, Jens Axboe wrote:
> On Wed, Sep 16 2009, Jan Kara wrote:
> > On Wed 16-09-09 15:21:08, Jens Axboe wrote:
> > > On Wed, Sep 16 2009, Jan Kara wrote:
> > > > On Tue 15-09-09 20:16:56, Jens Axboe wrote:
> > > > > We cannot safely ensure that the inodes are all gone at this point
> > > > > in time, and we must not destroy this bdi with inodes having off it.
> > > >                                                         ^^^ hanging
> > > > 
> > > > > So just splice our entries to the default bdi since that one will
> > > > > always persist.
> > > >   BTW: Why can't we make sure all inodes on the BDI are clean when we
> > > > destroy it? Common sense would suggest that we better should be able to do
> > > > it :).
> > > >   Maybe it's because most users of private BDI do not call bdi_unregister
> > > > but rather directly bdi_destroy? Is this correct behavior?
> > > Not sure yet, it's on the TODO. This basically works around the problem
> > > for now at least. With dm at least, I'm seeing inodes still hanging off
> > > the bdi after we have done a sync_blockdev(bdev, 1);.
> >   Do you really mean sync_blockdev() or fsync_bdev()? Because the first one
> > just syncs the blockdev's mapping, not the filesystem...
> 
> Do we want a fsync_bdev() in __blkdev_put()? It's only doing
  No, we cannot call fsync_bdev() there because nothing really guarantees
that there exists any filesystem on the device and that it is set up enough
to handle IO - __blkdev_put() is called e.g. after the filesystem has been
cleaned up in ->put_super(). You can have a look at what the code in
generic_shutdown_super() looks like. The function is called when the user
has no chance of dirtying any more data. In particular the sync_filesystem()
call there should write everything to disk. If it does not, it's a bug.
->put_super() can dirty some data again, but only buffers of the underlying
blockdev (e.g. when writing bitmaps, superblock etc.). If the ->put_super()
method of some filesystem leaves some inodes dirty, it's a bug - we'd see
the "VFS: Busy inodes after unmount" message.
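
  To make the ordering concrete, here is a rough user-space sketch of the
argument above. It is a toy model, not kernel code: every toy_* name is
made up for illustration, and the counters only stand in for the real
dirty inode lists and buffer state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code; every toy_* name is hypothetical. */
struct toy_sb {
    int  dirty_inodes;   /* dirty inodes owned by the filesystem */
    int  dirty_buffers;  /* dirty buffers in the blockdev's own mapping */
    bool fs_alive;       /* false once ->put_super() has torn the fs down */
};

/* Models sync_filesystem(): needs a live filesystem, writes everything out. */
static void toy_sync_filesystem(struct toy_sb *sb)
{
    assert(sb->fs_alive);     /* a filesystem-level sync is only valid here */
    sb->dirty_inodes = 0;
    sb->dirty_buffers = 0;
}

/* Models ->put_super(): may redirty blockdev buffers, but never inodes. */
static void toy_put_super(struct toy_sb *sb)
{
    sb->dirty_buffers += 2;   /* e.g. bitmaps and the superblock */
    sb->fs_alive = false;
}

/* Models sync_blockdev() on last close: flushes the blockdev mapping only. */
static void toy_sync_blockdev(struct toy_sb *sb)
{
    sb->dirty_buffers = 0;
}

/* Unmount followed by last close, in the order described above. */
struct toy_sb toy_unmount(int dirty_inodes, int dirty_buffers)
{
    struct toy_sb sb = { dirty_inodes, dirty_buffers, true };

    toy_sync_filesystem(&sb); /* the generic_shutdown_super() step */
    toy_put_super(&sb);
    /* A nonzero sb.dirty_inodes here would be the "Busy inodes" bug case. */
    toy_sync_blockdev(&sb);   /* the __blkdev_put() step */
    return sb;
}
```

  In this model toy_unmount(5, 3) ends with no dirty inodes and no dirty
buffers, and calling toy_sync_filesystem() after toy_put_super() would trip
the assertion, which is the reason a filesystem-level sync does not belong
in the last-close path.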

> sync_blockdev() on last close, and dm wants to tear down the device at
> that point. So either dm needs to really flush the device when going
> readonly, or we need to strengthen the 'flush on last close'.
  Yes, but at the time __blkdev_put() is called, there should be no dirty
inodes as I've argued above. So I still don't quite get how there could be
any :)

									Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR