Message-ID: <20120914125312.GA20973@localhost>
Date:	Fri, 14 Sep 2012 20:53:12 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Cc:	viro@...iv.linux.org.uk, jack@...e.cz, hch@....de,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Fix queueing work if !bdi_cap_writeback_dirty()

On Fri, Sep 14, 2012 at 09:12:02PM +0900, OGAWA Hirofumi wrote:
> Fengguang Wu <fengguang.wu@...el.com> writes:
> 
> >> >> @@ -120,6 +120,9 @@ __bdi_start_writeback(struct backing_dev
> >> >>  {
> >> >>  	struct wb_writeback_work *work;
> >> >>  
> >> >> +	if (!bdi_cap_writeback_dirty(bdi))
> >> >> +		return;
> >> >
> >> > Will someone in the current kernel actually call
> >> > __bdi_start_writeback() on a BDI_CAP_NO_WRITEBACK bdi?
> >> >
> >> > If the answer is no, VM_BUG_ON(!bdi_cap_writeback_dirty(bdi)) looks better.
> >> 
> >> I guess nobody call it in current kernel though. Hmm.., but we also have
> >> check in __mark_inode_dirty(), nobody should be using it, right?
> >> 
> >> If we defined it as the bug, I can't see what BDI_CAP_NO_WRITEBACK wants
> >> to do actually.  We are not going to allow to disable the writeback task?
> >
> >> I was going to use this to disable writeback task on my developing FS...
> >
> > That sounds like an interesting use case. Can you elaborate a bit more?
> >
> > Note that even if you disable __bdi_start_writeback() here, the kernel
> > may also start writeback in the page reclaim path, the fsync() path,
> > and perhaps more.
> 
> page reclaim and fsync path have FS handler. So, FS can control those.
> 
> The modern FS have to control to flush carefully. Many FSes are already
> ignoring if wbc->sync_mode != WB_SYNC_ALL (e.g. ext3_write_inode,
> nilfs_writepages), and have own FS task to flush.

Yeah, that test is mainly to improve IO efficiency for
non-data-integrity writes.
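
For illustration only, here is a minimal user-space sketch of that kind
of test. The enum and struct are simplified stand-ins for the kernel's
definitions, and demo_inode / demo_write_inode() are hypothetical names
that do not come from any real filesystem:

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel definitions; not the real ones. */
enum writeback_sync_modes { WB_SYNC_NONE, WB_SYNC_ALL };

struct writeback_control {
	enum writeback_sync_modes sync_mode;
};

/* Hypothetical per-inode state for an FS that runs its own flush task. */
struct demo_inode {
	bool dirty;
	int writes_issued;
};

/*
 * Hypothetical ->write_inode()-style handler: ignore non-data-integrity
 * writeback and leave the inode dirty for the FS's own task to flush.
 */
static int demo_write_inode(struct demo_inode *inode,
			    const struct writeback_control *wbc)
{
	if (wbc->sync_mode != WB_SYNC_ALL)
		return 0;	/* request from the flusher thread: skip it */

	if (inode->dirty) {
		inode->writes_issued++;	/* model issuing the write */
		inode->dirty = false;
	}
	return 0;
}
```

With sync_mode set to WB_SYNC_NONE the inode stays dirty and no write is
issued; only a WB_SYNC_ALL request actually writes it out.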

> The writeback task is always called with sync_mode != WB_SYNC_ALL except
> sync_inodes_sb(). But FS has sb->s_op->sync_fs() handler for
> sync_inodes_sb() path. So, writeback task just bothers FS to control to
> flush.
> 
> Also it wants to control the reclaimable of inode cache too, because FS
> have to control to flush, and wants to use inode in own FS task, and it
> knows when inode is cleaned and can be reclaimed.
> 
> I thought there are 2 options - 1) pin inode with iget(), and iput() on
> own FS task, 2) disable writeback task and care about inode reclaim by
> dirty flags.
> 
> (1) was complex (e.g. inode can be the orphan inode), and seems to be
> ineffective workaround to survive with writeback task.

In principle, the VFS should of course give the FS enough flexibility.
But it's the details that matter. As for the BDI_CAP_NO_WRITEBACK
approach, I'm afraid you won't get the expected "FS control" through
it, because the flusher thread may already have a long queue of work
items that will take a long time to finish. It even has its own
internal background/periodic work that is not controllable this way;
see wb_check_background_flush().
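
A toy user-space model of the queueing concern, as a sketch only:
demo_bdi, demo_start_writeback() and demo_flusher_run() are hypothetical
names, and the real flusher is considerably more involved. The point it
tries to show is that a check made at queueing time does nothing about
work that is already queued:

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QUEUE_MAX 8

/* Toy model of a bdi: a capability flag plus a count of pending work. */
struct demo_bdi {
	bool no_writeback;	/* stands in for BDI_CAP_NO_WRITEBACK */
	size_t queued;
	size_t completed;
};

/* Models the proposed check: refuse to queue when writeback is disabled. */
static bool demo_start_writeback(struct demo_bdi *bdi)
{
	if (bdi->no_writeback || bdi->queued >= DEMO_QUEUE_MAX)
		return false;
	bdi->queued++;
	return true;
}

/* The toy flusher drains whatever is queued and never looks at the flag. */
static size_t demo_flusher_run(struct demo_bdi *bdi)
{
	size_t done = bdi->queued;

	bdi->completed += done;
	bdi->queued = 0;
	return done;
}
```

If three items are queued and the flag is then set, new requests are
refused, but the flusher still runs the three items queued earlier.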

And BDI_CAP_NO_WRITEBACK is expected to be a static/constant flag that
always evaluates to the same true/false value for a given bdi.  There
will be correctness problems if you change the BDI_CAP_NO_WRITEBACK
flag dynamically.

Thanks,
Fengguang