linux-kernel - Re: [PATCH] fs: Fix busyloop in wb


Message-ID: <20090921130145.GA6266@localhost>
Date:	Mon, 21 Sep 2009 21:01:45 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Jan Kara <jack@...e.cz>, LKML <linux-kernel@...r.kernel.org>,
	Theodore Tso <tytso@....edu>
Subject: Re: [PATCH] fs: Fix busyloop in wb_writeback()

On Thu, Sep 17, 2009 at 02:41:06AM +0800, Jens Axboe wrote:
> On Wed, Sep 16 2009, Jan Kara wrote:
> > If all inodes are under writeback (e.g. in case when there's only one inode
> > with dirty pages), wb_writeback() with WB_SYNC_NONE work basically degrades
> > to busylooping until I_SYNC flags of the inode is cleared. Fix the problem by
> > waiting on I_SYNC flags of an inode on b_more_io list in case we failed to
> > write anything.
> 
> Interesting, so this will happen if the dirtier and flush thread end up
> "fighting" each other over the same inode. I'll throw this into the
> testing mix.
> 
> How did you notice?

Jens, I found another busy loop. I'm not sure about the solution, but here
are the quick facts.

The tested git head is 1ef7d9aa32a8ee054c4d4fdcd2ea537c04d61b2f, which
seems to be the last writeback patch in the linux-next tree. I cannot
run the plain head of linux-next because it refuses to boot.

Jan Kara's I_SYNC waiting patch and the attached debugging patch are
applied on top of that commit.

Test commands are:

        # mount /mnt/test # ext4 fs
        # echo 1 > /proc/sys/fs/dirty_debug

        # cp /dev/zero /mnt/test/zero0

After that the box locks up; the system is busy doing this:

[   54.740295] requeue_io() +457: inode=79232
[   54.740300] mm/page-writeback.c +539 balance_dirty_pages(): comm=cp pid=3327 n=0
[   54.740303] global dirty=60345 writeback=10145 nfs=0 flags=_M towrite=1536 skipped=0
[   54.740317] requeue_io() +457: inode=79232
[   54.740322] mm/page-writeback.c +539 balance_dirty_pages(): comm=cp pid=3327 n=0
[   54.740325] global dirty=60345 writeback=10145 nfs=0 flags=_M towrite=1536 skipped=0
[   54.740339] requeue_io() +457: inode=79232
[   54.740344] mm/page-writeback.c +539 balance_dirty_pages(): comm=cp pid=3327 n=0
[   54.740347] global dirty=60345 writeback=10145 nfs=0 flags=_M towrite=1536 skipped=0
......

Basically the traces show that balance_dirty_pages() is busy looping.
It cannot write anything because the inode is always requeued by this code:

        if (inode->i_state & I_SYNC) {
                if (!wait) {
                        requeue_io(inode);
                        return 0;
                }

This seems to happen when the partition is full.
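
To make the loop easier to reason about, here is a rough user-space sketch of
the pattern, not kernel code. Everything prefixed with toy_ or TOY_ is a
hypothetical stand-in invented for illustration, including the max_tries bound,
which exists only so the model terminates. It assumes I_SYNC stays set for as
long as the non-waiting caller keeps retrying, which is what the traces suggest
but which I have not confirmed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the I_SYNC bit; NOT the kernel's definition. */
#define TOY_I_SYNC 0x1u

/* Toy stand-in for struct inode, reduced to what the model needs. */
struct toy_inode {
	unsigned int i_state;   /* TOY_I_SYNC set while someone else writes it */
	unsigned int requeued;  /* number of toy_requeue_io() calls so far */
};

/* Stand-in for requeue_io(): only counts how often it is called. */
static void toy_requeue_io(struct toy_inode *inode)
{
	inode->requeued++;
}

/*
 * Models the quoted check. Returns the number of pages "written".
 * A non-waiting caller that finds TOY_I_SYNC set requeues and writes nothing.
 */
static long toy_writeback_single_inode(struct toy_inode *inode, bool wait)
{
	if (inode->i_state & TOY_I_SYNC) {
		if (!wait) {
			toy_requeue_io(inode);
			return 0;
		}
		/* Assumption: a waiting caller sleeps until the bit clears. */
		inode->i_state &= ~TOY_I_SYNC;
	}
	return 1; /* pretend one page was written */
}

/*
 * Models a caller that retries until something is written.
 * Returns the number of passes that wrote nothing (capped at max_tries).
 */
static unsigned int toy_balance_loop(struct toy_inode *inode, bool wait,
				     unsigned int max_tries)
{
	unsigned int empty_passes = 0;

	assert(max_tries > 0);
	while (empty_passes < max_tries) {
		if (toy_writeback_single_inode(inode, wait) > 0)
			break;
		empty_passes++;
	}
	return empty_passes;
}
```

With wait == false and TOY_I_SYNC set, every pass is an empty one and the
caller spins up to the cap, matching the repeating requeue_io() lines in the
log. With wait == true the first pass makes progress. Whether waiting is the
right fix for the real code is a separate question.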

Thanks,
Fengguang

View attachment "wb-debug.patch" of type "text/x-diff" (5135 bytes)