lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110419211008.GD9556@quack.suse.cz>
Date:	Tue, 19 Apr 2011 23:10:08 +0200
From:	Jan Kara <jack@...e.cz>
To:	Wu Fengguang <fengguang.wu@...el.com>
Cc:	Jan Kara <jack@...e.cz>, Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mel@...ux.vnet.ibm.com>,
	Dave Chinner <david@...morbit.com>,
	Trond Myklebust <Trond.Myklebust@...app.com>,
	Itaru Kitayama <kitayama@...bb4u.ne.jp>,
	Minchan Kim <minchan.kim@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: [PATCH 5/6] writeback: try more writeback as long as something
 was written

On Tue 19-04-11 19:16:01, Wu Fengguang wrote:
> On Tue, Apr 19, 2011 at 06:20:16PM +0800, Jan Kara wrote:
> > On Tue 19-04-11 11:00:08, Wu Fengguang wrote:
> > > writeback_inodes_wb()/__writeback_inodes_sb() are not aggressive in that
> > > they only populate possibly a subset of elegible inodes into b_io at
> > > entrance time. When the queued set of inodes are all synced, they just
> > > return, possibly with all queued inode pages written but still
> > > wbc.nr_to_write > 0.
> > > 
> > > For kupdate and background writeback, there may be more eligible inodes
> > > sitting in b_dirty when the current set of b_io inodes are completed. So
> > > it is necessary to try another round of writeback as long as we made some
> > > progress in this round. When there are no more eligible inodes, no more
> > > inodes will be enqueued in queue_io(), hence nothing could/will be
> > > synced and we may safely bail.
> >   Let me understand your concern here: You are afraid that if we do
> > for_background or for_kupdate writeback and we write less than
> > MAX_WRITEBACK_PAGES, we stop doing writeback although there could be more
> > inodes to write at the time we are stopping writeback - the two realistic
> 
> Yes.
> 
> > cases I can think of are:
> > a) when inodes just freshly expired during writeback
> > b) when bdi has less than MAX_WRITEBACK_PAGES of dirty data but we are over
> >   background threshold due to data on some other bdi. And then while we are
> >   doing writeback someone does dirtying at our bdi.
> > Or do you see some other case as well?
> > 
> > The a) case does not seem like a big issue to me after your changes to
> 
> Yeah (a) is not an issue with kupdate writeback.
> 
> > move_expired_inodes(). The b) case maybe but do you think it will make any
> > difference? 
> 
> (b) seems also weird. What in my mind is this for_background case.
> Imagine 100 inodes
> 
>         i0, i1, i2, ..., i90, i91, i99
> 
> At queue_io() time, i90-i99 happen to be expired and moved to s_io for
> IO. When finished successfully, if their total size is less than
> MAX_WRITEBACK_PAGES, nr_to_write will be > 0. Then wb_writeback() will
> quit the background work (w/o this patch) while it's still over
> background threshold.
> 
> This will be a fairly normal/frequent case I guess.
  Ah OK, I see. I missed this case your patch set has added. Also your
changes of
        if (!wbc->for_kupdate || list_empty(&wb->b_io))
to
	if (list_empty(&wb->b_io))
are going to cause more cases when we'd hit nr_to_write > 0 (e.g. when one
pass of b_io does not write all the inodes so some are left in b_io list
and then next call to writeback finds these inodes there but there's less
than MAX_WRITEBACK_PAGES in them). Frankly, it makes me like the above
change even less. I'd rather see writeback_inodes_wb /
__writeback_inodes_sb always work on a fresh set of inodes which is
initialized whenever we enter these functions. It just seems less
surprising to me...

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ