Message-ID: <20170117161618.GT19699@dhcp22.suse.cz>
Date:   Tue, 17 Jan 2017 17:16:19 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Theodore Ts'o <tytso@....edu>
Cc:     Jan Kara <jack@...e.cz>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Chinner <david@...morbit.com>, djwong@...nel.org,
        Chris Mason <clm@...com>, David Sterba <dsterba@...e.cz>,
        ceph-devel@...r.kernel.org, cluster-devel@...hat.com,
        linux-nfs@...r.kernel.org, logfs@...fs.org,
        linux-xfs@...r.kernel.org, linux-ext4@...r.kernel.org,
        linux-btrfs@...r.kernel.org, linux-mtd@...ts.infradead.org,
        reiserfs-devel@...r.kernel.org,
        linux-ntfs-dev@...ts.sourceforge.net,
        linux-f2fs-devel@...ts.sourceforge.net,
        linux-afs@...ts.infradead.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

On Tue 17-01-17 10:59:16, Theodore Ts'o wrote:
> On Tue, Jan 17, 2017 at 04:18:17PM +0100, Michal Hocko wrote:
> > 
> > OK, so I've been staring into the code and AFAIU current->journal_info
> > can contain my stored information. I could either hijack part of the
> > word as the ref counting is only consuming low 12b. But that looks too
> > ugly to live. Or I can allocate some placeholder.
> 
> Yeah, I was looking at something similar.  Can you guarantee that the
> context will only take one or two bits?  (Looks like it only needs one
> bit ATM, even though at the moment you're storing the whole GFP mask,
> correct?)

No, I am just storing PF_MEMALLOC_NO{FS,IO}, but I assume further changes
might want to pull more flags into the scope context.
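
For illustration, the scope idea can be sketched in userspace as below.
This is only a sketch under assumptions: the flag values are made up,
task_flags stands in for current->flags, and may_enter_fs() is a
hypothetical helper for what an allocator would check. The save/restore
names follow the API proposed in this series, but the bodies are not the
kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's actual flag bits. */
#define PF_MEMALLOC_NOFS 0x1u
#define PF_MEMALLOC_NOIO 0x2u

/* Userspace stand-in for current->flags of the running task. */
static unsigned int task_flags;

/* Enter a NOFS scope and return the previous state of the bit,
 * so that nested scopes can restore it correctly. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Leave the scope: put the bit back to its state on entry. */
static void memalloc_nofs_restore(unsigned int old)
{
	assert((old & ~PF_MEMALLOC_NOFS) == 0);
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Hypothetical check: may reclaim recurse into the filesystem? */
static bool may_enter_fs(void)
{
	return !(task_flags & (PF_MEMALLOC_NOFS | PF_MEMALLOC_NOIO));
}
```

A caller would wrap the critical section in a save/restore pair. Because
restore only puts back the state seen on entry, an inner scope cannot
accidentally end an outer one.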

> > But before going to play with that I am really wondering whether we need
> > all this with no journal at all. AFAIU what Jack told me it is the
> > journal lock(s) which is the biggest problem from the reclaim recursion
> > point of view. What would cause a deadlock in no journal mode?
> 
> We still have the original problem for why we need GFP_NOFS even in
> ext2.  If we are in a writeback path, and we need to allocate memory,
> we don't want to recurse back into the file system's writeback path.

But we do not enter the writeback path from direct reclaim. Or do
you mean something other than pageout()'s mapping->a_ops->writepage?
There is only try_to_release_page where we get back to the filesystem,
but I do not see any NOFS protection in ext4_releasepage.

> Certainly not for the same inode, and while we could make it work if
> the mm was writing back another inode, or another superblock, there
> are also stack depth considerations that would make this be a bad
> idea.  So we do need to be able to assert GFP_NOFS even in no journal
> mode, and for any file system including ext2, for that matter.
> 
> Because of the fact that we're going to have to play games with
> current->journal_info, maybe this is something that I should take
> responsibility for, and to go through the ext4 tree after the main
> patch series goes through?

What do you think about handling nojournal mode on top of
"[PATCH 5/8] jbd2: mark the transaction context with the scope
GFP_NOFS context" in a separate patch?

But anyway, I agree that we should go with the API sooner rather than
later.

>   Maybe you could use xfs and ext2 as sample
> (simple) implementations?
> 
> My only ask is that the memalloc nofs context be a well defined N
> bits, where N < 16, and I'll find some place to put them (probably
> journal_info).

I am pretty sure that we won't need more than a bit or two in the
foreseeable future (I can think of GFP_NOWAIT being one candidate).
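
To make the "hijack part of the word" option concrete, here is a rough
userspace sketch, assuming the refcount really fits in the low 12 bits
and the context needs only two bits. The ji_* names and the bit layout
are hypothetical, invented for this example; this is not what ext4 does
today.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 12 bits hold the refcount, the next 2 bits hold
 * the saved scope context. Both widths are illustrative. */
#define JI_REF_BITS 12u
#define JI_REF_MASK (((uintptr_t)1 << JI_REF_BITS) - 1)
#define JI_CTX_BITS 2u
#define JI_CTX_MASK (((uintptr_t)1 << JI_CTX_BITS) - 1)

/* Pack a refcount and a context into one journal_info-like word. */
static uintptr_t ji_pack(uintptr_t ref, uintptr_t ctx)
{
	assert(ref <= JI_REF_MASK);
	assert(ctx <= JI_CTX_MASK);
	return (ctx << JI_REF_BITS) | ref;
}

/* Extract the refcount from the low bits. */
static uintptr_t ji_ref(uintptr_t word)
{
	return word & JI_REF_MASK;
}

/* Extract the saved context from the bits above the refcount. */
static uintptr_t ji_ctx(uintptr_t word)
{
	return (word >> JI_REF_BITS) & JI_CTX_MASK;
}

/* Take one more reference without disturbing the context bits. */
static uintptr_t ji_ref_inc(uintptr_t word)
{
	assert(ji_ref(word) < JI_REF_MASK); /* would carry into ctx bits */
	return word + 1;
}
```

The overflow assertion in ji_ref_inc() shows why this is ugly: every
refcount update has to make sure it never carries into the context bits.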
-- 
Michal Hocko
SUSE Labs
