linux-kernel - Re: [PATCH 0/6 v3] kvmalloc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <588BA9AA.8010805@iogearbox.net>
Date:   Fri, 27 Jan 2017 21:12:26 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Michal Hocko <mhocko@...nel.org>
CC:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        marcelo.leitner@...il.com
Subject: Re: [PATCH 0/6 v3] kvmalloc

On 01/27/2017 11:05 AM, Michal Hocko wrote:
> On Thu 26-01-17 21:34:04, Daniel Borkmann wrote:
>> On 01/26/2017 02:40 PM, Michal Hocko wrote:
> [...]
>>> But realistically, how big is this problem really? Is it really worth
>>> it? You said this is an admin only interface and admin can kill the
>>> machine by OOM and other means already.
>>>
>>> Moreover and I should probably mention it explicitly, your d407bd25a204b
>>> reduced the likelyhood of oom for other reason. kmalloc used GPF_USER
>>> previously and with order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER this
>>> could indeed hit the OOM e.g. due to memory fragmentation. It would be
>>> much harder to hit the OOM killer from vmalloc which doesn't issue
>>> higher order allocation requests. Or have you ever seen the OOM killer
>>> pointing to the vmalloc fallback path?
>>
>> The case I was concerned about was from vmalloc() path, not kmalloc().
>> That was where the stack trace indicating OOM pointed to. As an example,
>> there could be really large allocation requests for maps where the map
>> has pre-allocated memory for its elements. Thus, if we get to the point
>> where we need to kill others due to shortage of mem for satisfying this,
>> I'd much much rather prefer to just not let vmalloc() work really hard
>> and fail early on instead.
>
> I see, but as already mentioned, chances are that by the time you get
> close to the OOM somebody else will hit the OOM before the vmalloc path
> manages to free the allocated memory.
>
>> In my (crafted) test case, I was connected
>> via ssh and it each time reliably killed my connection, which is really
>> suboptimal.
>>
>> F.e., I could also imagine a buggy or miscalculated map definition for
>> a prog that is provisioned to multiple places, which then accidentally
>> triggers this. Or if large on purpose, but we crossed the line, it
>> could be handled more gracefully, f.e. I could imagine an option to
>> falling back to a non-pre-allocated map flavor from the application
>> loading the program. Trade-off for sure, but still allowing it to
>> operate up to a certain extend. Granted, if vmalloc() succeeded without
>> trying hard and we then OOM elsewhere, too bad, but we don't have much
>> control over that one anyway, only about our own request. Reason I
>> asked above was whether having __GFP_NORETRY in would be fatal
>> somewhere down the path, but seems not as you say.
>>
>> So to answer your second email with the bpf and netfilter hunks, why
>> not replacing them with kvmalloc() and __GFP_NORETRY flag and add that
>> big fat FIXME comment above there, saying explicitly that __GFP_NORETRY
>> is not harmful though has only /partial/ effect right now and that full
>> support needs to be implemented in future. That would still be better
>> that not having it, imo, and the FIXME would make expectations clear
>> to anyone reading that code.
>
> Well, we can do that, I just would like to prevent from this (ab)use
> if there is no _real_ and _sensible_ usecase for it. Having a real bug

Understandable.

> report or a fallback mechanism you are mentioning above would justify
> the (ab)use IMHO. But that abuse would be documented properly and have a
> real reason to exist. That sounds like a better approach to me.
>
> But if you absolutely _insist_ I can change that.

Yeah, please do (with a big FIXME comment as mentioned), this originally
came from a real bug report. Anyway, feel free to add my Acked-by then.

Thanks again,
Daniel