lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YWgYqNOUE/Sx7WeZ@dhcp22.suse.cz>
Date:   Thu, 14 Oct 2021 13:46:48 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     David Sterba <dsterba@...e.cz>
Cc:     Dave Chinner <david@...morbit.com>, NeilBrown <neilb@...e.de>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>, Jonathan Corbet <corbet@....net>,
        linux-xfs@...r.kernel.org, linux-ext4@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-nfs@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linux-doc@...r.kernel.org
Subject: Re: [PATCH 2/6] MM: improve documentation for __GFP_NOFAIL

On Thu 14-10-21 13:32:01, David Sterba wrote:
> On Wed, Oct 13, 2021 at 10:26:58AM +0200, Michal Hocko wrote:
> > > crap like this (found in btrfs):
> > > 
> > >                 /*                                                               
> > >                  * We're holding a transaction handle, so use a NOFS memory      
> > >                  * allocation context to avoid deadlock if reclaim happens.      
> > >                  */                                                              
> > >                 nofs_flag = memalloc_nofs_save();                                
> > >                 value = kmalloc(size, GFP_KERNEL);                               
> > >                 memalloc_nofs_restore(nofs_flag);                                
> > 
> > Yes this looks wrong indeed! If I were to review such a code I would ask
> > why the scope cannot match the transaction handle context. IIRC jbd does
> > that.
> 
> Adding the transaction start/end as the NOFS scope is a long term plan
> and going on for years, because it's not a change we would need in
> btrfs, but rather a favor to MM to switch away from "GFP_NOFS everywhere
> because it's easy".
> 
> The first step was to convert the easy cases. Almost all safe cases
> switching GFP_NOFS to GFP_KERNEL have happened. Another step is to
> convert GFP_NOFS to memalloc_nofs_save/GFP_KERNEL/memalloc_nofs_restore
> in contexts where we know we'd rely on the transaction NOFS scope in the
> future. Once this is implemented, the memalloc_nofs_* calls are deleted
> and it works as expected.  Now you may argue that the switch could be
> changing GFP_NOFS to GFP_KERNEL at that time but that is not that easy
> to review or reason about in the whole transaction context in all
> allocations.
> 
> This leads to code that was found in __btrfs_set_acl and called crap
> or wrong, because perhaps the background and the bigger plan is not
> immediately obvious. I hope the explanation above it puts it to the
> right perspective.

Yes it helps. Thanks for the clarification because this is far from
obvious and changelogs I've checked do not mention this high level plan.
I would have gone with a /* TODO: remove me once transactions use scopes... */
but this is obviously your call.

> 
> The other class of scoped NOFS protection is around vmalloc-based
> allocations but that's for a different reason, would be solved by the
> same transaction start/end conversion as well.
> 
> I'm working on that from time to time but this usually gets pushed down
> in the todo list. It's changing a lot of code, from what I've researched
> so far cannot be done at once and would probably introduce bugs hard to
> hit because of the external conditions (allocator, system load, ...).
> 
> I have a plan to do that incrementally, adding assertions and converting
> functions in small batches to be able to catch bugs early, but I'm not
> exactly thrilled to start such endeavour in addition to normal
> development bug hunting.
> 
> To get things moving again, I've refreshed the patch adding stubs and
> will try to find the best timing for merg to avoid patch conflicts, but
> no promises.

Thanks!

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ