linux-kernel - Re: [PATCH 0/6 v3] kvmalloc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170127100544.GF4143@dhcp22.suse.cz>
Date:   Fri, 27 Jan 2017 11:05:44 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        marcelo.leitner@...il.com
Subject: Re: [PATCH 0/6 v3] kvmalloc

On Thu 26-01-17 21:34:04, Daniel Borkmann wrote:
> On 01/26/2017 02:40 PM, Michal Hocko wrote:
[...]
> > But realistically, how big is this problem really? Is it really worth
> > it? You said this is an admin only interface and admin can kill the
> > machine by OOM and other means already.
> > 
> > Moreover and I should probably mention it explicitly, your d407bd25a204b
> > reduced the likelyhood of oom for other reason. kmalloc used GPF_USER
> > previously and with order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER this
> > could indeed hit the OOM e.g. due to memory fragmentation. It would be
> > much harder to hit the OOM killer from vmalloc which doesn't issue
> > higher order allocation requests. Or have you ever seen the OOM killer
> > pointing to the vmalloc fallback path?
> 
> The case I was concerned about was from vmalloc() path, not kmalloc().
> That was where the stack trace indicating OOM pointed to. As an example,
> there could be really large allocation requests for maps where the map
> has pre-allocated memory for its elements. Thus, if we get to the point
> where we need to kill others due to shortage of mem for satisfying this,
> I'd much much rather prefer to just not let vmalloc() work really hard
> and fail early on instead. 

I see, but as already mentioned, chances are that by the time you get
close to the OOM somebody else will hit the OOM before the vmalloc path
manages to free the allocated memory.

> In my (crafted) test case, I was connected
> via ssh and it each time reliably killed my connection, which is really
> suboptimal.
> 
> F.e., I could also imagine a buggy or miscalculated map definition for
> a prog that is provisioned to multiple places, which then accidentally
> triggers this. Or if large on purpose, but we crossed the line, it
> could be handled more gracefully, f.e. I could imagine an option to
> falling back to a non-pre-allocated map flavor from the application
> loading the program. Trade-off for sure, but still allowing it to
> operate up to a certain extend. Granted, if vmalloc() succeeded without
> trying hard and we then OOM elsewhere, too bad, but we don't have much
> control over that one anyway, only about our own request. Reason I
> asked above was whether having __GFP_NORETRY in would be fatal
> somewhere down the path, but seems not as you say.
> 
> So to answer your second email with the bpf and netfilter hunks, why
> not replacing them with kvmalloc() and __GFP_NORETRY flag and add that
> big fat FIXME comment above there, saying explicitly that __GFP_NORETRY
> is not harmful though has only /partial/ effect right now and that full
> support needs to be implemented in future. That would still be better
> that not having it, imo, and the FIXME would make expectations clear
> to anyone reading that code.

Well, we can do that, I just would like to prevent from this (ab)use
if there is no _real_ and _sensible_ usecase for it. Having a real bug
report or a fallback mechanism you are mentioning above would justify
the (ab)use IMHO. But that abuse would be documented properly and have a
real reason to exist. That sounds like a better approach to me.

But if you absolutely _insist_ I can change that.
-- 
Michal Hocko
SUSE Labs