On Thu, May 01, 2008 at 09:15:21AM -0400, Christoph Hellwig wrote:
> On Thu, May 01, 2008 at 10:26:11PM +1000, David Chinner wrote:
> > Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c
> > ===================================================================
> > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c	2008-04-28 16:35:23.000000000 +1000
> > +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c	2008-05-01 20:04:55.151880341 +1000
> > @@ -2986,7 +2986,7 @@ xfs_iflush_cluster(
> >  	ASSERT(pag->pag_ici_init);
> >  
> >  	ilist_size = XFS_INODE_CLUSTER_SIZE(mp) * sizeof(xfs_inode_t *);
> > -	ilist = kmem_alloc(ilist_size, KM_MAYFAIL);
> > +	ilist = kmem_alloc(ilist_size, KM_NOFS);
> >  	if (!ilist)
> >  		return 0;
>
> This should be KM_MAYFAIL | KM_NOFS, because KM_NOFS doesn't imply that
> the allocation may fail.

Yes, right you are - I only looked at the effect of __GFP_FS, not
what kmem_alloc does. i.e. kmem_flags_convert() doesn't do anything
with KM_MAYFAIL, forgetting that it's kmem_alloc() that uses it...

New patch below.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---

Don't allow memory reclaim to wait on the filesystem in inode writeback

If we allow memory reclaim to wait on the pages under writeback
in inode cluster writeback we could deadlock because we are
currently holding the ILOCK on the initial writeback inode
which is needed in data I/O completion to change the file size
or do unwritten extent conversion before the pages are taken
out of writeback state.

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_inode.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c	2008-04-28 16:35:23.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c	2008-05-02 08:03:30.071824780 +1000
@@ -2986,7 +2986,7 @@ xfs_iflush_cluster(
 	ASSERT(pag->pag_ici_init);
 
 	ilist_size = XFS_INODE_CLUSTER_SIZE(mp) * sizeof(xfs_inode_t *);
-	ilist = kmem_alloc(ilist_size, KM_MAYFAIL);
+	ilist = kmem_alloc(ilist_size, KM_MAYFAIL|KM_NOFS);
 	if (!ilist)
 		return 0;
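
For anyone following along, here is a rough userspace model of why both
flags are needed. It is only a sketch of the behaviour described in this
thread, not the real kmem code: the flag values, MODEL_GFP_FS,
MODEL_GFP_IO, fail_budget and all the model_* functions are made up for
illustration. The two points it tries to show are that the flag
conversion step only looks at KM_NOFS, and that the allocation loop is
the place that looks at KM_MAYFAIL, retrying until it succeeds when the
flag is absent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values for this model; not the kernel's definitions. */
#define KM_NOFS      0x4u
#define KM_MAYFAIL   0x8u
#define MODEL_GFP_IO 0x1u	/* stand-in for "reclaim may start I/O" */
#define MODEL_GFP_FS 0x2u	/* stand-in for __GFP_FS */

/* Test hook: how many of the next low-level allocations should fail. */
static int fail_budget;

/*
 * Models the flag conversion step: KM_NOFS clears the FS bit so reclaim
 * does not call back into the filesystem. KM_MAYFAIL is not examined here.
 */
static unsigned int model_flags_convert(unsigned int km_flags)
{
	unsigned int gfp = MODEL_GFP_IO | MODEL_GFP_FS;

	if (km_flags & KM_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* Stand-in for the underlying allocator, with injectable failures. */
static void *model_low_level_alloc(size_t size, unsigned int gfp)
{
	(void)gfp;
	if (fail_budget > 0) {
		fail_budget--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Models the allocation wrapper: without KM_MAYFAIL it keeps retrying
 * until the allocation succeeds; with KM_MAYFAIL it returns NULL on the
 * first failure so the caller can back out.
 */
static void *model_kmem_alloc(size_t size, unsigned int km_flags,
			      int *attempts)
{
	unsigned int gfp = model_flags_convert(km_flags);
	void *ptr;

	assert(attempts != NULL);
	*attempts = 0;
	for (;;) {
		(*attempts)++;
		ptr = model_low_level_alloc(size, gfp);
		if (ptr || (km_flags & KM_MAYFAIL))
			return ptr;
		/* The real code would wait here before retrying. */
	}
}
```

In this model, KM_NOFS on its own never returns NULL, so the
"if (!ilist) return 0;" path in the first version of the patch could not
be taken; KM_MAYFAIL|KM_NOFS gives both the early failure and the
protection against reclaim re-entering the filesystem.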