linux-ext4 - Re: [PATCH 2/3] jbd2 : Fix journal start by passing a parameter to specify if the caller can deal with ENOMEM

Message-ID: <20110526150846.GL9520@thunk.org>
Date:	Thu, 26 May 2011 11:08:46 -0400
From:	Ted Ts'o <tytso@....edu>
To:	Jan Kara <jack@...e.cz>
Cc:	Andreas Dilger <adilger@...ger.ca>,
	Manish Katiyar <mkatiyar@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/3] jbd2 : Fix journal start by passing a parameter to
 specify if the caller can deal with ENOMEM

On Thu, May 26, 2011 at 04:49:56PM +0200, Jan Kara wrote:
>   No need to do this. If you make JBD2 use a separate slab for transaction
> structures (trivial and makes some sense anyway), you can use
> fault-injection framework to do exactly what you describe above (see
> Documentation/fault-injection/fault-injection.txt and look for failslab).

Thanks for pointing me at the fault-injection framework; it's not
something I've used before.  I'll have to take a look at it.

>   But if we just fail all transaction allocations with say 10% probability,
> it should work as well, shouldn't it? We'd just retry those allocations
> whose failure we cannot handle and eventually succeed. Or do I miss
> something?

The reason why I only wanted to fail the transactions relating to the
writeback path is because other failures will get reflected back to
userspace, and would thus change the behavior of the stress test.  (If
we used fsstress, it would cause fsstress to immediately stop and
fail, for example.)
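
To make the distinction concrete, here is a minimal userspace sketch
of the idea, not the jbd2 code itself.  All names in it
(alloc_maybe_fail, start_transaction, the can_fail flag) are
hypothetical stand-ins: one helper fails allocations with a given
percentage, and the caller either retries (the writeback-style case,
where the failure cannot be handled) or passes ENOMEM back up (the
case that would be reflected to userspace).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a failslab-style injector: fails an
 * allocation with roughly fail_percent probability. */
static void *alloc_maybe_fail(size_t size, unsigned fail_percent)
{
    /* 100 would make the retry loop below spin forever. */
    assert(fail_percent < 100);
    if ((unsigned)(rand() % 100) < fail_percent)
        return NULL;
    return malloc(size);
}

/* Hypothetical analogue of starting a transaction.  If can_fail is
 * true the caller is prepared to see -ENOMEM; otherwise we keep
 * retrying until the allocation succeeds. */
static int start_transaction(void **out, bool can_fail, unsigned fail_percent)
{
    for (;;) {
        void *t = alloc_maybe_fail(64, fail_percent);
        if (t) {
            *out = t;
            return 0;
        }
        if (can_fail)
            return -ENOMEM;
        /* Caller cannot handle failure: loop and try again. */
    }
}
```

With can_fail set to false every call eventually succeeds even at a
10% failure rate; with can_fail set to true some calls return -ENOMEM,
which is the case that would change the behavior of a stress test.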

That is the one thing that worries me a little about this patch series
in general.  If we suddenly start failing open() or rename() or
chmod() syscalls with ENOMEM in low memory situations, what of
programs that aren't doing adequate error checking?  Sure, other file
systems will do this, but the bulk of the users use ext3/ext4, and
remember how much kvetching and complaining there was when xfs was the
first file system to require user space applications to actually use
fsync() if they wanted their files to be safe after a power failure.

I worry that there are a lot of incompetently written editors out
there that aren't doing error checking, or worse yet, package managers
or other security-critical programs that aren't doing error checking,
and which won't notice when a syscall fails in a low-memory
situation, leading to either (a) user data loss (which the application
programmers will lay at the feet of the file system developers, don't
doubt it), or (b) security holes.
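
For reference, the kind of checking such a program would need looks
roughly like the sketch below.  It is an illustration only: save_file
and its tmp_path argument are hypothetical names, and the
write-to-temporary-then-rename pattern is one common approach, not
something this patch series prescribes.  Every syscall result is
checked, and the error (ENOMEM included) is returned to the caller
instead of being dropped.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write data to tmp_path, fsync it, then rename
 * it over path.  Returns 0 on success or a negative errno value. */
static int save_file(const char *path, const char *tmp_path,
                     const char *data, size_t len)
{
    assert(path && tmp_path && data);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -errno;          /* may be ENOMEM, ENOSPC, EACCES, ... */

    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            int err = errno;
            close(fd);
            unlink(tmp_path);
            return -err;
        }
        off += (size_t)n;
    }

    if (fsync(fd) < 0) {        /* data must be on disk before rename */
        int err = errno;
        close(fd);
        unlink(tmp_path);
        return -err;
    }
    if (close(fd) < 0) {
        int err = errno;
        unlink(tmp_path);
        return -err;
    }
    if (rename(tmp_path, path) < 0) {
        int err = errno;
        unlink(tmp_path);       /* the old file at path is left intact */
        return -err;
    }
    return 0;
}
```

A caller that ignores the return value here is exactly the case that
would silently lose data if open() or rename() started returning
ENOMEM.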

I'm not sure there's a way to address this concern, and I'm not
NACK'ing this patch series on that basis --- but I do worry that it
might not improve the situation by a whole lot, and may in fact cause
some problems, at the end of the day.

	      	    	      	   	 - Ted