Date:	Wed, 19 Aug 2015 07:56:11 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Jan Kara <jack@...e.cz>, Jens Axboe <axboe@...nel.dk>,
	Jan Kara <jack@...e.com>, Eryu Guan <eguan@...hat.com>,
	xfs@....sgi.com, axboe@...com, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes

On Tue, Aug 18, 2015 at 12:54:39PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 18, 2015 at 10:47:18AM -0700, Tejun Heo wrote:
> > Hmm... the only possibility I can think of is tot_write_bandwidth
> > being zero when it shouldn't be.  I've been staring at the code for a
> > while now but nothing rings a bell.  Time for another debug patch, I
> > guess.
> 
> So, I can now reproduce the bug (it takes a lot of trials but lowering
> the number of tested files helps quite a bit) and instrumented all the
> early exit paths w/o the fix patch.  bdi_has_dirty_io() and
> wb_has_dirty_io() are never out of sync with the actual dirty / io
> lists even when the test 048 fails, so the bug at least is not caused
> by writeback skipping due to buggy bdi/wb_has_dirty_io() result.
> Whenever it skips, all the lists are actually empty (verified while
> holding list_lock).
> 
> One suspicion I have is that this could be a subtle timing issue which
> is being exposed by the new short-cut path.  Anything which adds delay
> seems to make the issue go away.  Dave, does anything ring a bell?

No, it doesn't. The data writeback mechanisms XFS uses are all
generic. It marks inodes I_DIRTY_PAGES and lets the generic code
take care of everything else. Yes, we do delayed allocation during
writeback, and we log the inode size updates during IO completion,
so if inode sizes are not getting updated, then Occam's Razor
suggests that writeback is not happening.

I'd suggest looking at some of the XFS tracepoints during the test:

tracepoint			trigger
xfs_file_buffered_write		once per write syscall
xfs_file_fsync			once per fsync per inode
xfs_writepage			every ->writepage call
xfs_setfilesize			every IO completion that updates inode size

It's probably best to include all the writeback tracepoints as well,
for context. That will tell you which inodes, and what parts of them,
are getting written back and when....
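As an illustration (not part of the original message), the tracepoints
above can be captured through the standard tracefs interface. This is a
sketch only: it assumes root access, a kernel built with the xfs and
writeback trace events, and debugfs mounted at /sys/kernel/debug; the
reproducer command and output path are placeholders.

```shell
# Sketch: enable all xfs:* tracepoints (which include the four listed
# above) plus the generic writeback:* tracepoints, run the failing
# workload, then save the trace buffer. Requires root.
cd /sys/kernel/debug/tracing

echo 1 > events/xfs/enable          # all xfs:* events, incl. the ones above
echo 1 > events/writeback/enable    # all writeback:* events, for context
echo > trace                        # clear any stale entries first

# ... run the reproducer (e.g. the failing xfstests case) here ...

cat trace > /tmp/xfs-writeback.trace   # save the captured events (placeholder path)

echo 0 > events/xfs/enable          # disable again when done
echo 0 > events/writeback/enable
```

The same capture can also be done with `trace-cmd record -e xfs -e
writeback <command>` followed by `trace-cmd report`, which avoids
poking tracefs by hand.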

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com