Message-ID: <e5c2183a-f62a-6ca9-eec6-a7fab7ce4c91@redhat.com>
Date: Tue, 28 Mar 2023 05:44:40 +0200
From: David Hildenbrand <david@...hat.com>
To: Luis Chamberlain <mcgrof@...nel.org>,
Kees Cook <keescook@...omium.org>
Cc: linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org,
pmladek@...e.com, petr.pavlu@...e.com, prarit@...hat.com,
christophe.leroy@...roup.eu, song@...nel.org,
torvalds@...ux-foundation.org, dave@...olabs.net,
fan.ni@...sung.com, vincent.fu@...sung.com,
a.manzanares@...sung.com, colin.i.king@...il.com
Subject: Re: [RFC 00/12] module: avoid userspace pressure on unwanted
allocations
On 24.03.23 18:54, Luis Chamberlain wrote:
> On Fri, Mar 24, 2023 at 10:27:14AM +0100, David Hildenbrand wrote:
>> On 21.03.23 20:32, David Hildenbrand wrote:
>>> On 20.03.23 22:27, Luis Chamberlain wrote:
>>>> On Mon, Mar 20, 2023 at 02:23:36PM -0700, Luis Chamberlain wrote:
>>>>> On Mon, Mar 20, 2023 at 10:15:23PM +0100, David Hildenbrand wrote:
>>>>>> Not able to reproduce with 20230319-module-alloc-opts so far (2 tries).
>>>>>
>>>>> Oh wow, so to clarify, it boots OK?
>>>>>
>>>>
>>>> Now that we know that tree works, I'm curious also now if you can
>>>> confirm just re-ordering the patches still works (it should)
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230319-module-alloc-opts-adjust
>>>>
>>>
>>> So far no vmap errors booting the debug/kasan kernel (2 tries).
>
> <-- snip -->
>
>>> I think we primarily only care about systemd-udev-settle.service.
>>>
>>> That is fastest without the rcu patch (~6s), compared to with the rcu
>>> patch (~6.5s) and with stock (~7.5s -- 8s).
>>>
>>> Looks like dracut-initqueue also might be a bit faster with your changes, but
>>> maybe it's mostly noise (would have to do more runs).
>>>
>>> So maybe drop that rcu patch? But of course, there could be other scenarios where it's
>>> helpful ...
>
> Yes, I can confirm the RCU patch does not help at all, now also when
> using stress-ng.
>
>> Are there any other things you would like me to measure/test? I'll have to
>> hand back that test machine soonish.
>
> Yes please test the below. Perhaps it's not the final form we want, but
> it *does* fix OOM'ing when thrashing with stress-ng now with the module
> option, and even with 100 threads it brings down max memory consumption
> by 259 MiB. The reason is that we also vmalloc during each
> kernel_read_file() for each module, well before we even do
> layout_and_allocate(), and so obviously if we fix the module path but
> not this path, it will eventually catch up with us. I'm not at all
> happy with the current approach given that ideally we'd bump the
> counter when the user is done with the file, but we don't yet have any
> tracking of that for users; they just vfree the memory themselves. And
> so this is just trying to catch heavy immediate abuse on the caller
> side to fend off abuse of vmalloc uses in a lightweight manner.
Understood. (I'm planning on reviewing once I have some spare cycles.)
>
> There's gotta be a better way to do this, but it's just an idea I have
> so far. If we *want* to keep tabs until the user is done, we have to
> modify most users of these APIs and introduce our own free. I don't
> think we're in a rush to fix this, so maybe that's the better approach.
>
> And so I've managed to reproduce the issues you found now with my new stress-ng
> module stressor as well.
Nice!
>
> https://github.com/ColinIanKing/stress-ng.git
>
> Even though you have 400 CPUs, with stress-ng we can likely reproduce
> it with (use a module that is not loaded on your system):
>
> ./stress-ng --module 100 --module-name xfs
I'll give that a whirl on that machine with the updated patch ...
>
> Without the patch below, using 400 threads still OOMs easily due to
> the kread issue. The maximum number of threads allowed is 8192.
>
... do you have an updated patch/branch that includes the feedback from
Linus so I can give it a whirl tomorrow?
--
Thanks,
David / dhildenb