linux-kernel - Re: [PATCH RFC] mm: vmalloc: do not allow kzalloc to fail

Message-ID: <20181224081056.GD9063@dhcp22.suse.cz>
Date:   Mon, 24 Dec 2018 09:10:56 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Nicholas Mc Guire <der.herr@...r.at>
Cc:     David Rientjes <rientjes@...gle.com>,
        Nicholas Mc Guire <hofrat@...dl.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Chintan Pandya <cpandya@...eaurora.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Arun KS <arunks@...eaurora.org>, Joe Perches <joe@...ches.com>,
        "Luis R. Rodriguez" <mcgrof@...nel.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] mm: vmalloc: do not allow kzalloc to fail

On Sat 22-12-18 09:04:21, Nicholas Mc Guire wrote:
> On Fri, Dec 21, 2018 at 01:58:39PM -0800, David Rientjes wrote:
> > On Thu, 20 Dec 2018, Nicholas Mc Guire wrote:
> > 
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 871e41c..1c118d7 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -1258,7 +1258,7 @@ void __init vmalloc_init(void)
> > >  
> > >  	/* Import existing vmlist entries. */
> > >  	for (tmp = vmlist; tmp; tmp = tmp->next) {
> > > -		va = kzalloc(sizeof(struct vmap_area), GFP_NOWAIT);
> > > +		va = kzalloc(sizeof(*va), GFP_NOWAIT | __GFP_NOFAIL);
> > >  		va->flags = VM_VM_AREA;
> > >  		va->va_start = (unsigned long)tmp->addr;
> > >  		va->va_end = va->va_start + tmp->size;
> > 
> > Hi Nicholas,
> > 
> > You're right that this looks wrong because there's no guarantee that va is 
> > actually non-NULL.  __GFP_NOFAIL won't help in init, unfortunately, since 
> > we're not giving the page allocator a chance to reclaim so this would 
> > likely just end up looping forever instead of crashing with a NULL pointer 
> > dereference, which would actually be the better result.
> >
> tried tracing the __GFP_NOFAIL path and had concluded that it would
> end in out_of_memory() -> panic("System is deadlocked on memory\n");
> which also should point cleanly to the cause - but I'm actually not
> that sure if that trace was correct in all cases.

No, we do not trigger the memory reclaim path nor the oom killer when
using GFP_NOWAIT. In fact the current implementation even ignores
__GFP_NOFAIL AFAICS (so I was wrong about the endless loop but I suspect
that we used to loop for __GFP_NOFAIL at some point in the past). The
patch simply doesn't have any effect. But the primary objection is that
the behavior might change in the future and you certainly do not want to
get stuck in the boot process without knowing what is going on. Crashing
will tell you that quite obviously, although I have a hard time imagining
how that could happen in a reasonably configured system.
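
To make the flag interaction concrete, here is a toy userspace model. It
is only a sketch of the behavior described above, not the page
allocator: toy_alloc(), TOY_DIRECT_RECLAIM and TOY_NOFAIL are
hypothetical names invented for the illustration, and "reclaim always
succeeds" is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits for this sketch only; not the kernel's GFP values. */
#define TOY_DIRECT_RECLAIM 0x1u /* caller may sleep and reclaim memory */
#define TOY_NOFAIL         0x2u /* caller asks never to see NULL */

/*
 * Toy allocator. *free_objs models how many objects can be handed out
 * without reclaiming anything.
 */
static void *toy_alloc(size_t size, unsigned int flags, int *free_objs)
{
	assert(free_objs != NULL);

	if (*free_objs > 0) {
		(*free_objs)--;
		return calloc(1, size);
	}

	/*
	 * Nothing is immediately available. A caller that may not reclaim
	 * has nothing to wait for, so TOY_NOFAIL is ignored and the
	 * allocation fails.
	 */
	if (!(flags & TOY_DIRECT_RECLAIM))
		return NULL;

	/* Assumption of this model: reclaim always frees enough memory. */
	return calloc(1, size);
}
```

In this model a request with TOY_NOFAIL but without TOY_DIRECT_RECLAIM
behaves exactly like one without TOY_NOFAIL, which is the sense in which
adding the flag to a no-wait allocation changes nothing.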

> > You could do
> > 
> > 	BUG_ON(!va);
> > 
> > to make it obvious why we crashed, however.  It makes it obvious that the 
> > crash is intentional rather than some error in the kernel code.
> 
> makes sense - that at least makes it immediately clear from the code
> that there is no way out from here.

How does it differ from blowing up right there when dereferencing va to
set va->flags? The cause would be clear from the oops.
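
For comparison, the two variants can be sketched in userspace as below.
This is not kernel code: struct toy_area, TOY_VM_AREA, TOY_BUG_ON() and
both fill functions are hypothetical stand-ins made up for the
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag value for this sketch; not the kernel's VM_VM_AREA. */
#define TOY_VM_AREA 0x1UL

/* Simplified stand-in for struct vmap_area: only the fields in the diff. */
struct toy_area {
	unsigned long flags;
	unsigned long va_start;
	unsigned long va_end;
};

/* Userspace stand-in for BUG_ON(): report the location, then abort. */
#define TOY_BUG_ON(cond)                                          \
	do {                                                      \
		if (cond) {                                       \
			fprintf(stderr, "BUG at %s:%d\n",         \
				__FILE__, __LINE__);              \
			abort();                                  \
		}                                                 \
	} while (0)

/* With the explicit check: a NULL va dies at the TOY_BUG_ON() line. */
static void fill_checked(struct toy_area *va, unsigned long addr,
			 unsigned long size)
{
	TOY_BUG_ON(!va);
	va->flags = TOY_VM_AREA;
	va->va_start = addr;
	va->va_end = va->va_start + size;
}

/* Without the check: a NULL va faults on the very next line instead. */
static void fill_unchecked(struct toy_area *va, unsigned long addr,
			   unsigned long size)
{
	va->flags = TOY_VM_AREA; /* first store through va */
	va->va_start = addr;
	va->va_end = va->va_start + size;
}
```

Both variants stop in the same function, one line apart, when va is
NULL; what differs is whether the report names an explicit check or a
faulting store.
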
-- 
Michal Hocko
SUSE Labs