linux-kernel - Re: [PATCH 06/18] writeback: sync expired inodes first in background writeback

Message-ID: <20110526231045.GN5123@quack.suse.cz>
Date:	Fri, 27 May 2011 01:10:45 +0200
From:	Jan Kara <jack@...e.cz>
To:	Wu Fengguang <fengguang.wu@...el.com>
Cc:	Jan Kara <jack@...e.cz>, Andrew Morton <akpm@...ux-foundation.org>,
	Dave Chinner <david@...morbit.com>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mel@....ul.ie>,
	Christoph Hellwig <hch@...radead.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 06/18] writeback: sync expired inodes first in
 background writeback

On Wed 25-05-11 22:38:57, Wu Fengguang wrote:
> > and I was wondering: Assume there is one continuously redirtied file and
> > untar starts in parallel. With the new logic, background writeback will
> > never consider inodes that are not expired in this situation (we never
> > switch to "all dirty inodes" phase - or even if we switched, we would just
> > queue all inodes and then return back to queueing only expired inodes). So
> > the net effect is that for 30 seconds we will be only continuously writing
> > pages of the continuously dirtied file instead of (possibly older) pages of
> > other files that are written. Is this really desirable? Wasn't the old
> > behavior simpler and not worse than the new one?
> 
> Good question! Yes, sadly, in this case the new behavior could be worse
> than the old one.
> 
> In fact this patch does not improve the small files (< 4MB) case at all,
> except for the side effect that fewer unexpired inodes will be left in
> s_io when the background work quits, so the later kupdate work will
> write fewer unexpired inodes.
> 
> And for the mixed small/large files case, it actually results in worse
> behavior in the case you mention.
> 
> However, the root cause here is the file being _actively_ written to,
> which is a kind of livelock. We could add a simple livelock prevention
> scheme that works for the common case of file appending:
> 
> - save i_size when the range_cyclic writeback starts from 0, for
>   limiting the writeback scope
  Hmm, but for this we'd have to store an additional 'unsigned long' (page
index) for each inode. Not sure if it's really worth it.

> - when range_cyclic writeback hits the saved i_size, quit the current
>   inode instead of immediately restarting from 0. This will not only
>   avoid a possible extra seek, but also redirty_tail() the inode and
>   hence get out of possible livelock.
  But I like the idea of doing redirty_tail() when we write out some inode
for too long. Maybe we could just do redirty_tail() instead of requeue_io()
whenever write_cache_pages() had to wrap the index? We could communicate
this by setting a flag in wbc in write_cache_pages()...
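
  A rough userspace model of that idea, only to make the control flow
concrete. Nothing here is kernel code: struct wbc_model, struct inode_model,
the 'wrapped' flag and both functions are hypothetical stand-ins, and the
scan is simplified to "every page below the file size is dirty".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few writeback_control fields used here. */
struct wbc_model {
	long nr_to_write;	/* page budget for this pass */
	bool wrapped;		/* assumed new flag: scan wrapped back to index 0 */
};

/* Hypothetical stand-in for the per-inode state of a range_cyclic scan. */
struct inode_model {
	unsigned long writeback_index;	/* where the next scan resumes */
	unsigned long nr_pages;		/* current file size in pages */
};

enum requeue_action { REQUEUE_IO, REDIRTY_TAIL };

/*
 * Model of a range_cyclic pass: write from writeback_index to the end of
 * the file, then wrap once to index 0 and stop at the original start.
 * Sets wbc->wrapped if the wrap happened. Returns the pages written.
 */
static long write_cache_pages_model(struct inode_model *inode,
				    struct wbc_model *wbc)
{
	unsigned long start = inode->writeback_index;
	unsigned long index = start;
	unsigned long end = inode->nr_pages;
	long written = 0;

	assert(inode->nr_pages > 0);
	wbc->wrapped = false;

	for (;;) {
		while (index < end && wbc->nr_to_write > 0) {
			written++;
			wbc->nr_to_write--;
			index++;
		}
		if (wbc->nr_to_write <= 0)
			break;		/* budget exhausted */
		if (wbc->wrapped || start == 0)
			break;		/* whole range covered */
		wbc->wrapped = true;	/* report the wrap to the caller */
		index = 0;
		end = start;
	}
	inode->writeback_index = index;
	return written;
}

/* Caller side: a wrapped scan sends the inode to the tail of the dirty list. */
static enum requeue_action requeue_decision(const struct wbc_model *wbc)
{
	return wbc->wrapped ? REDIRTY_TAIL : REQUEUE_IO;
}
```

  The point of the sketch is that the only new state lives in the
per-call control structure, so nothing is added to the inode itself.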

> The livelock prevention scheme may not only eliminate the undesirable
> behavior you observed for this patch, but also prevent the "some old
> pages may not get the chance to get written to disk in an actively
> dirtied file" data security issue discussed in an old email. What do
> you think?
  So my scheme would not solve this but it does not require per-inode
overhead...
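
  For comparison, a similarly rough userspace model of the saved-i_size
scheme quoted above, to show where the per-inode cost comes from. The
'saved_limit' field, the struct and the function are hypothetical names
for illustration, not existing kernel members, and every page below the
limit is assumed dirty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-inode state; saved_limit is the extra unsigned long. */
struct bounded_inode {
	unsigned long writeback_index;	/* where the next scan resumes */
	unsigned long nr_pages;		/* current file size in pages */
	unsigned long saved_limit;	/* size snapshot taken when a scan starts at 0 */
};

/*
 * Write up to *budget pages, never going past the size recorded when the
 * scan started from index 0. Returns true when the saved limit was hit,
 * i.e. when the caller should quit this inode and redirty_tail() it
 * rather than restart from 0 right away.
 */
static bool write_bounded(struct bounded_inode *inode, long *budget,
			  long *written)
{
	assert(*budget >= 0);
	*written = 0;

	if (inode->writeback_index == 0)
		inode->saved_limit = inode->nr_pages;	/* take the snapshot */

	while (*budget > 0 && inode->writeback_index < inode->saved_limit) {
		(*written)++;
		(*budget)--;
		inode->writeback_index++;
	}

	if (inode->writeback_index >= inode->saved_limit) {
		/* Next pass starts over and takes a fresh snapshot. */
		inode->writeback_index = 0;
		return true;
	}
	return false;
}
```

  In this model pages appended after the snapshot are not chased within
the same cycle, which is what bounds the time spent on one inode.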

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR