linux-kernel - Re: [PATCH] fs / ext3: Always unlock updates in ext3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110822231348.GS3162@dastard>
Date:	Tue, 23 Aug 2011 09:13:48 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Pavel Machek <pavel@....cz>
Cc:	Jan Kara <jack@...e.cz>, "Rafael J. Wysocki" <rjw@...k.pl>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] fs / ext3: Always unlock updates in ext3_freeze()

On Mon, Aug 22, 2011 at 03:00:45PM +0200, Pavel Machek wrote:
> Hi!
> 
> > >   What's exactly the problem? Memory preallocation enters direct reclaim
> > > and that deadlocks in the filesystem?
> > 
> > Well, that's one possible manifestation. The problem is that the
> > current hibernate code still assumes that sys_sync() results in an
> > idle filesystem that will not change after the call if nothing is
> > dirty.
> > 
> > The result is that when the large memory allocation occurs for the
> > hibernate image (after the sys_sync() call) then the shrink_slab()
> > tends to be called. The XFS shrinkers are capable of dirtying inodes
> > and the backing buffers of inodes that are in the reclaimable state.
> > But those buffers cannot be flushed to disk because hibernate has
> > already frozen the xfsbufd threads, so the shrinker doing inode
> > reclaim hangs up on locks waiting for the buffers to be written.
> > This either leads to deadlock or hibernate image allocation failure.
> > 
> > Far worse, IMO, is the case where is -doesn't- deadlock, because the
> > filesystem state can still changing after the allocation has
> > finished due to async metadata IO completions. That has the
> > potential to cause filesystem corruption as after resume the on-disk
> > state may not match what is written from memory to the hibernate
> > image.
> > 
> > The problem really isn't XFS specific, nor is it new - the fact is
> > that any filesystem that has registered a shrinker or can do async
> > work in the background post-sync is vulnerable to this problem. It's
> 
> Should we avoid calling shrinkers while hibernating?

If you like getting random OOM problems when hibernating, then go
for it.  Besides, shrinkers are used for more than just filesystems,
so you might find you screw entire classes of users by doing this
(eg everyone using intel graphics and 3D).

> Or put BUG_ON()s into filesystem shrinkers so that this can not
> happen?

Definitely not. If your concern is filesystem shrinkers and you want
a large hammer to hit the problem with then do your hibernate
image allocation wih GFP_NOFS and the filesystem shrinkers will
abort without doing anything.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/