linux-ext4 - Re: Memory allocation can cause ext4 filesystem to be remounted r/o


Message-ID: <CAFy9=U5O9qQP5QU_Nw4fEbfy2oVxUub=ddYgJ_ZKXmjdChO4iA@mail.gmail.com>
Date:	Wed, 26 Jun 2013 20:50:50 +0530
From:	Nagachandra P <nagachandra@...il.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	Vikram MP <mp.vikram@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: Memory allocation can cause ext4 filesystem to be remounted r/o

Thanks Theodore,

We have also seen cases where the current allocation itself causes the
lowmem shrinker to be called, which in turn chooses the same process
for killing because of the current process's oom_adj_value.
(oom_adj_value is a weight associated with each process, based on
which the Android low memory killer selects a process to kill in order
to free memory.) If we chose to retry in such a case, we could end up
in an endless loop of retrying the allocation. It would be better to
handle this without retrying.
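
To make the endless-loop concern concrete, here is a minimal userspace
sketch (not ext4 code) of what any retry would need at the very least:
a hard bound, so that the caller sees the failure instead of spinning.
The names alloc_fn, alloc_bounded and always_fail are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook, so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Try an allocation at most max_tries times. On success, return the
 * pointer; if every attempt fails, return NULL so the caller can
 * handle the error instead of looping forever. *tries_used reports
 * how many attempts were made.
 */
static void *alloc_bounded(alloc_fn alloc, size_t size, int max_tries,
                           int *tries_used)
{
    assert(max_tries > 0);

    for (int i = 1; i <= max_tries; i++) {
        void *p = alloc(size);
        if (p != NULL) {
            *tries_used = i;
            return p;
        }
    }
    *tries_used = max_tries;
    return NULL;
}

/* Stand-in for an allocator that can never succeed under pressure. */
static void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}
```

With always_fail and max_tries = 3, alloc_bounded gives up after three
attempts and returns NULL; with malloc it normally succeeds on the
first attempt.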

We could use your suggestion above, which would address this specific
path. But there are quite a number of allocations in ext4 that could
call ext4_std_error on failure, and we may need to look at each one of
them to see how to handle it. Do you think this is something that
could be done?

We have in the past tried some ugly hacks to work around the problem
(adjusting oom_adj_values, guarding processes from being killed), but
they don't seem to provide a foolproof mechanism in a high memory
pressure environment. Any advice on what we could try to fix the issue
in general would be appreciated.

Thanks again.

Best regards
Nagachandra

On Wed, Jun 26, 2013 at 8:24 PM, Theodore Ts'o <tytso@....edu> wrote:
> On Wed, Jun 26, 2013 at 10:02:05AM -0400, Theodore Ts'o wrote:
>>
>> In this particular case, we could reflect the error all the way up to
>> the ftruncate(2) system call.  Fixing this is going to be a bit
>> involved, unfortunately; we'll need to update a fairly large number of
>> function signatures, including ext4_truncate(), ext4_ext_truncate(),
>> ext4_free_blocks(), and a number of others.
>
> One thing that comes to mind.  If we change things so that ftruncate
> reflects an ENOMEM error all the way up to userspace, one side effect
> of this is that the file may be partially truncated when ENOMEM is
> returned.  Applications may not be prepared for this.
>
> There would be a similar issue if we do the truncate in the unlink
> call and return ENOMEM in case of a failure: the file might not be
> unlinked, and in fact we might have a partially truncated file in the
> directory, which would probably cause all sorts of confusion.  So
> we're probably better off putting the inode on a list of inodes in
> memory, and on the orphan list on disk, and then retrying the
> truncation when memory is available.  Messy, but that probably gives
> the best result for applications living constantly in high memory
> pressure environments.
>
>                                                         - Ted
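
The deferred-retry idea in the quoted message can be sketched as a
small userspace model. This is an illustration under assumptions, not
the ext4 implementation: the struct and function names below are
hypothetical, the queue is a fixed-size array, and the on-disk orphan
list is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical record of a truncation that could not be completed. */
struct pending_truncate {
    unsigned long ino;
    long new_size;
};

/* Hypothetical in-memory list of inodes awaiting truncation. */
struct truncate_queue {
    struct pending_truncate items[MAX_PENDING];
    size_t count;
};

/* Hypothetical hook: returns 0 on success or a negative errno. */
typedef int (*truncate_fn)(unsigned long ino, long new_size);

/*
 * Attempt the truncation. If it fails with -ENOMEM, park the request
 * on the queue and report success to the caller; any other result is
 * returned unchanged. A full queue is reported as -ENOMEM.
 */
static int truncate_or_defer(struct truncate_queue *q, truncate_fn do_truncate,
                             unsigned long ino, long new_size)
{
    assert(q != NULL && do_truncate != NULL);

    int err = do_truncate(ino, new_size);
    if (err != -ENOMEM)
        return err;
    if (q->count == MAX_PENDING)
        return -ENOMEM;
    q->items[q->count++] = (struct pending_truncate){ ino, new_size };
    return 0;
}

/*
 * Retry every parked truncation, keeping any entry whose retry still
 * fails. Returns the number of truncations completed in this pass.
 */
static size_t retry_pending(struct truncate_queue *q, truncate_fn do_truncate)
{
    size_t kept = 0, done = 0;

    for (size_t i = 0; i < q->count; i++) {
        if (do_truncate(q->items[i].ino, q->items[i].new_size) == 0)
            done++;
        else
            q->items[kept++] = q->items[i];
    }
    q->count = kept;
    return done;
}

/* Stub that fails with -ENOMEM until memory_available is set. */
static bool memory_available = false;

static int stub_truncate(unsigned long ino, long new_size)
{
    (void)ino;
    (void)new_size;
    return memory_available ? 0 : -ENOMEM;
}
```

In this model the caller of truncate_or_defer never sees ENOMEM unless
the queue itself is full, and a later call to retry_pending, made once
memory is available again, drains the parked entries.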