Message-ID: <20130627173652.GA22107@thunk.org>
Date: Thu, 27 Jun 2013 13:36:52 -0400
From: Theodore Ts'o <tytso@....edu>
To: Nagachandra P <nagachandra@...il.com>
Cc: Vikram MP <mp.vikram@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: Memory allocation can cause ext4 filesystem to be remounted r/o
On Thu, Jun 27, 2013 at 06:28:21PM +0530, Nagachandra P wrote:
> Hi Theodore,
>
> Could you point me to the code where ext4_std_error() is not triggered
> because of LMK? As I see it, if a memory allocation returns an error,
> in some cases ext4_std_error() would invariably be called. Please
> consider the following call stack

Yes, that's one example where a memory allocation failure can lead to
ext4_std_error() getting called, and I've already acknowledged that's
one that we need to fix (although as I said, fixing it may be tricky,
short of calling congestion_wait() and then retrying the allocation,
and hoping that in the meantime the OOM killer has freed up some
memory).

If you could give me a list of other memory allocations where
ext4_std_error() could get called, please let me know. Note that in
the jbd2 layer, though, we handle a memory allocation failure by
retrying the allocation, to avoid the file system getting marked
read-only. Examples of this include jbd2_journal_write_metadata_buffer(),
and jbd2_journal_add_journal_head() when it calls
journal_alloc_journal_head(). (Although the way we're doing the retry
in the latter case is a bit ugly and we're not sleeping with a call to
congestion_wait(), so it's something we should clean up.)
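
To make the retry idea concrete, here is a rough userspace sketch. It
is not the jbd2 code: alloc_with_retry(), flaky_alloc(), fail_count and
backoff() are hypothetical names made up for illustration, nanosleep()
merely stands in for a congestion_wait()-style sleep, and the bounded
retry count is an assumption of the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical allocator type: returns NULL on failure. */
typedef void *(*alloc_fn)(size_t size);

/* Hypothetical test hook: the next fail_count calls to flaky_alloc() fail. */
int fail_count;

void *flaky_alloc(size_t size)
{
	if (fail_count > 0) {
		fail_count--;
		return NULL;	/* simulate memory pressure */
	}
	return malloc(size);
}

/* Sleep for 1 ms; a userspace stand-in for sleeping between retries. */
void backoff(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000L };

	nanosleep(&ts, NULL);
}

/*
 * Try the allocation up to max_tries times, sleeping between failed
 * attempts. Returns the allocation, or NULL if every attempt failed.
 * If tries_out is non-NULL it receives the number of attempts made.
 */
void *alloc_with_retry(alloc_fn alloc, size_t size, int max_tries,
		       int *tries_out)
{
	void *p = NULL;
	int tries = 0;

	assert(alloc != NULL && max_tries > 0);
	while (tries < max_tries) {
		tries++;
		p = alloc(size);
		if (p != NULL)
			break;
		if (tries < max_tries)
			backoff();	/* give memory a chance to be freed */
	}
	if (tries_out != NULL)
		*tries_out = tries;
	return p;
}
```

The point of the sleep is that a transient failure is only reported to
the caller, and so only has a chance of being treated as a file system
error, if memory stays unavailable across every attempt.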

To give you an example of the intended use of ext4_std_error(), if the
journal commit code runs into a disk I/O error while writing to the
journal, the jbd2 code has to mark the journal as aborted. This could
happen because the disk has gone off-line, or the HDD has run out of
spare disk sectors in its bad block replacement pool, so it has to
return a write error to the OS. Once the journal has been marked as
aborted, the next time the ext4 code tries to access the journal, by
starting a new journal handle, or marking a metadata block dirty, the
jbd2 function will return an error, and this will cause
ext4_std_error() to be called so the file system can be marked as
requiring a file system check.
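
As a toy model of that sequence (again a userspace sketch, not the real
ext4/jbd2 code: every toy_* name is hypothetical, and the specific error
codes -EIO and -EROFS are assumptions chosen for illustration):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the journal and the file system state. */
struct toy_journal {
	bool aborted;
};

struct toy_fs {
	struct toy_journal journal;
	bool needs_fsck;
};

/* A commit that hits an I/O error marks the journal as aborted. */
int toy_journal_commit(struct toy_journal *j, bool io_error)
{
	assert(j != NULL);
	if (io_error) {
		j->aborted = true;
		return -EIO;
	}
	return 0;
}

/* Starting a handle on an aborted journal fails (assumed -EROFS). */
int toy_journal_start(struct toy_journal *j)
{
	assert(j != NULL);
	return j->aborted ? -EROFS : 0;
}

/* Toy analogue of the error hook: flag the fs as needing a check. */
void toy_std_error(struct toy_fs *fs, int err)
{
	if (err != 0)
		fs->needs_fsck = true;
}

/* File system operation that needs a journal handle. */
int toy_fs_begin_update(struct toy_fs *fs)
{
	int err = toy_journal_start(&fs->journal);

	if (err != 0)
		toy_std_error(fs, err);
	return err;
}
```

Note that in this model the commit failure by itself does not flag the
file system; the flag is set the next time the file system code touches
the journal and gets an error back.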

Regards,
- Ted