Date:	Thu, 5 Feb 2009 18:43:53 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	Christoph Hellwig <hch@...radead.org>, xfs@....sgi.com,
	linux-kernel@...r.kernel.org
Subject: Re: spurious -ENOSPC on XFS

On Wed, Feb 04, 2009 at 11:31:25PM -0500, Mikulas Patocka wrote:
> > > ... and if you turn it into trylock, what are you going to do with the 
> > > inode that is just being written to? You should definitely flush it, but 
> > > trylock will skip it because it's already locked.
> > 
> > We've already flushed it directly. You disabled that code fearing
> > deadlocks. I've made it synchronous (i.e. not handed off to
> > xfssyncd) because the flush path requires us to hold the lock we are
> > already holding....
> 
> This is not "fearing deadlocks". This was getting a real deadlock:

<sigh>

Thank you for *finally* telling me exactly what the deadlock is that
you've been handwaving about for the last week. It's not a VFS
deadlock, nor is it an inode lock deadlock - it's a page lock deadlock.

Perhaps next time you will post the stack trace instead of vaguely
describing a deadlock, so you don't waste several hours of another
developer's time looking for deadlocks in all the wrong places?

> This one was obtained on a machine with 4k filesystem blocks, 8k pages and 
> dd bs=1 on a nearly full filesystem.

That's helpful, too. I can write a test case to exercise that.

So, now I understand why you were suggesting going all the way back up
to the top of the IO path and flushing from there - so we don't hold
a page lock.

Perhaps we should just cull the direct inode flush completely.
If that inode has any significant delayed allocation space on it,
then the only reason it gets to an ENOSPC is that it has converted
all the speculative preallocation that it already has reserved
and is trying to allocate new space. Hence flushing it will not
return any extra space.

Hmmmmm - given that we hold the iolock exclusively, the trylock I
added into xfs_sync_inodes_ag() will fail on the inode we currently
hold page locks on (tries to get iolock shared) so that should avoid
deadlock on the page we currently hold locked.  Can you remove the
direct inode flush and just run with the modified device flush to see
if that triggers the deadlock you've been seeing?

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
