Message-ID: <e5c2183a-f62a-6ca9-eec6-a7fab7ce4c91@redhat.com>
Date: Tue, 28 Mar 2023 05:44:40 +0200
From: David Hildenbrand <david@...hat.com>
To: Luis Chamberlain <mcgrof@...nel.org>,
Kees Cook <keescook@...omium.org>
Cc: linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org,
pmladek@...e.com, petr.pavlu@...e.com, prarit@...hat.com,
christophe.leroy@...roup.eu, song@...nel.org,
torvalds@...ux-foundation.org, dave@...olabs.net,
fan.ni@...sung.com, vincent.fu@...sung.com,
a.manzanares@...sung.com, colin.i.king@...il.com
Subject: Re: [RFC 00/12] module: avoid userspace pressure on unwanted
allocations
On 24.03.23 18:54, Luis Chamberlain wrote:
> On Fri, Mar 24, 2023 at 10:27:14AM +0100, David Hildenbrand wrote:
>> On 21.03.23 20:32, David Hildenbrand wrote:
>>> On 20.03.23 22:27, Luis Chamberlain wrote:
>>>> On Mon, Mar 20, 2023 at 02:23:36PM -0700, Luis Chamberlain wrote:
>>>>> On Mon, Mar 20, 2023 at 10:15:23PM +0100, David Hildenbrand wrote:
>>>>>> Not able to reproduce with 20230319-module-alloc-opts so far (2 tries).
>>>>>
>>>>> Oh wow, so to clarify, it boots OK?
>>>>>
>>>>
>>>> Now that we know that tree works, I'm curious also now if you can
>>>> confirm just re-ordering the patches still works (it should)
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230319-module-alloc-opts-adjust
>>>>
>>>
>>> So far no vmap errors booting the debug/kasan kernel (2 tries).
>
> <-- snip -->
>
>>> I think we primarily only care about systemd-udev-settle.service.
>>>
>>> That is fastest without the rcu patch (~6s), compared to with the rcu
>>> patch (~6.5s) and with stock (~7.5s -- 8s).
>>>
>>> Looks like dracut-initqueue also might be a bit faster with your changes, but
>>> maybe it's mostly noise (would have to do more runs).
>>>
>>> So maybe drop that rcu patch? But of course, there could be other scenarios where it's
>>> helpful ...
>
> Yes, I can confirm the RCU patch does not help at all, now also when
> using stress-ng.
>
>> Are there any other things you would like me to measure/test? I'll have to
>> hand back that test machine soonish.
>
> Yes please test the below. Perhaps it's not the final form we want, but
> it *does* fix OOM'ing when thrashing with stress-ng now with the module
> option, and even with 100 threads it brings down max memory consumption
> by 259 MiB. The reason is that we also vmalloc during each
> kernel_read_file() for each module, well before we even do
> layout_and_allocate(), and so obviously if we fix the module path but
> not this path, it will eventually catch up with us. I'm not at all
> happy with the current approach given that ideally we'd bump the
> counter when the user is done with the file, but we don't yet have any
> tracking of that for users; they just vfree the memory themselves. And
> so this is just trying to catch heavy immediate abuse on the caller
> side to fend off abuse of vmalloc uses in a lightweight manner.
Understood. (I'm planning on reviewing once I have some spare cycles.)
>
> There's gotta be a better way to do this, but it's just an idea I have
> so far. If we *want* to keep tabs until the user is done, we have to
> modify most users of these APIs and introduce our own free. I don't
> think we're in a rush to fix this, so maybe that's the better approach.
>
> And so I've managed to reproduce the issues you found now with my new stress-ng
> module stressor as well.
Nice!
>
> https://github.com/ColinIanKing/stress-ng.git
>
> Even though you have 400 CPUs, with stress-ng we can likely reproduce
> it with (use a module that is not loaded on your system):
>
> ./stress-ng --module 100 --module-name xfs
I'll give that a whirl on that machine with the updated patch ...
>
> Without the patch below, using 400 threads still OOMs easily due to
> the kread issue. The maximum number of threads allowed is 8192.
>
... do you have an updated patch/branch that includes the feedback from
Linus so I can give it a whirl tomorrow?
--
Thanks,
David / dhildenb