Message-ID: <Pine.LNX.4.64.0901231509010.5179@hs20-bc2-1.build.redhat.com>
Date: Fri, 23 Jan 2009 15:14:30 -0500 (EST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Christoph Hellwig <hch@...radead.org>
cc: xfs@....sgi.com, linux-kernel@...r.kernel.org
Subject: Re: spurious -ENOSPC on XFS
On Thu, 22 Jan 2009, Christoph Hellwig wrote:
> On Thu, Jan 22, 2009 at 03:59:13PM -0500, Christoph Hellwig wrote:
> > On Wed, Jan 21, 2009 at 10:24:22AM +1100, Dave Chinner wrote:
> > > Right, so you need to use internal xfs sync functions that don't
> > > have these problems. That is:
> > >
> > > error = xfs_sync_inodes(ip->i_mount, SYNC_DELWRI|SYNC_WAIT);
> > >
> > > will do a blocking flush of all the inodes without deadlocks occurring.
> > > Then you can remove the 500ms wait.
> >
> > I've given this a try with Eric's testcase from #724 in the oss bugzilla,
> > but it's not enough yet. I think that's because SYNC_WAIT is rather
> > meaningless for data writeout, and we need SYNC_IOWAIT instead. The
> > patch below gets the testcase working for me:
>
> Actually I still see failures happening sometimes. I guess that's because
> our flush is still asynchronous due to the schedule_work..
If I placed
	xfs_sync_inodes(ip->i_mount, SYNC_DELWRI);
	xfs_sync_inodes(ip->i_mount, SYNC_DELWRI | SYNC_IOWAIT);
directly in xfs_flush_device, I got a lock dependency warning (though not a
real deadlock).
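For reference, the change I tested looks roughly like this (a sketch only;
I am assuming xfs_flush_device() keeps its current xfs_inode argument and
I am ignoring error handling):

void
xfs_flush_device(
	xfs_inode_t	*ip)
{
	/*
	 * Sketch only: do the flush synchronously in the caller's
	 * context instead of deferring it to a work item.
	 */

	/* First pass: start delalloc writeback on all inodes ... */
	xfs_sync_inodes(ip->i_mount, SYNC_DELWRI);

	/* ... second pass: wait for that I/O to complete. */
	xfs_sync_inodes(ip->i_mount, SYNC_DELWRI | SYNC_IOWAIT);
}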
So synchronous flushing definitely needs some thought and some lock
rearchitecting. If you do the flush asynchronously, the deadlock possibility
just turns into a spurious -ENOSPC (because the flushing thread has to wait
until the main thread has slept for 500ms, dropped the lock and returned
-ENOSPC).
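Spelled out, the interleaving I mean is something like this (an illustrative
timeline only, assuming the flush is kicked off via schedule_work as above
and that the writer holds the inode's iolock, as the lockdep output below
shows):

/*
 * writer (mc/336)                        xfssyncd worker
 * ---------------                        ---------------
 * xfs_write()
 *   takes ip->i_iolock
 *   xfs_iomap_write_delay(): no space
 *     xfs_flush_device()
 *       queue the flush work    ------>  xfs_sync_inodes(mp, SYNC_DELWRI)
 *       sleep 500ms                        needs ip->i_iolock, blocks
 *     retry the allocation:
 *     still no free space
 *   drop ip->i_iolock,
 *   return -ENOSPC              ------>  flush can finally proceed, but
 *                                        the write has already failed
 */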
Mikulas
=============================================
[ INFO: possible recursive locking detected ]
2.6.29-rc1-devel #2
---------------------------------------------
mc/336 is trying to acquire lock:
(&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186b88>]
xfs_ilock+0x68/0xa0 [xfs]
but task is already holding lock:
(&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186bb0>]
xfs_ilock+0x90/0xa0 [xfs]
other info that might help us debug this:
2 locks held by mc/336:
#0: (&sb->s_type->i_mutex_key#8){--..}, at: [<00000000101b1560>]
xfs_write+0x340/0x6a0 [xfs]
#1: (&(&ip->i_iolock)->mr_lock){----}, at: [<0000000010186bb0>]
xfs_ilock+0x90/0xa0 [xfs]
stack backtrace:
Call Trace:
[000000000047e924] validate_chain+0x1184/0x1460
[000000000047ee88] __lock_acquire+0x288/0xae0
[000000000047f734] lock_acquire+0x54/0x80
[0000000000470b20] down_read_nested+0x40/0x60
[0000000010186b88] xfs_ilock+0x68/0xa0 [xfs]
[00000000101b4f78] xfs_sync_inodes_ag+0x218/0x320 [xfs]
[00000000101b5100] xfs_sync_inodes+0x80/0x100 [xfs]
[00000000101b51a0] xfs_flush_device+0x20/0x60 [xfs]
[000000001018e654] xfs_flush_space+0x94/0x100 [xfs]
[000000001018e95c] xfs_iomap_write_delay+0x13c/0x240 [xfs]
[000000001018f100] xfs_iomap+0x280/0x300 [xfs]
[00000000101a84d4] __xfs_get_blocks+0x74/0x260 [xfs]
[00000000101a8724] xfs_get_blocks+0x24/0x40 [xfs]
[00000000004e6800] __block_prepare_write+0x220/0x4e0
[00000000004e6b68] block_write_begin+0x48/0x100
[00000000101a785c] xfs_vm_write_begin+0x3c/0x60 [xfs]