Message-ID: <20220601190624.yd3xj2pyjyxtt634@revolver>
Date: Wed, 1 Jun 2022 19:06:31 +0000
From: Liam Howlett <liam.howlett@...cle.com>
To: Guenter Roeck <linux@...ck-us.net>,
Heiko Carstens <hca@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>
Subject: Re: [PATCH] mapletree-vs-khugepaged
* Liam R. Howlett <Liam.Howlett@...cle.com> [220531 14:56]:
> * Liam R. Howlett <Liam.Howlett@...cle.com> [220530 13:38]:
> > * Guenter Roeck <linux@...ck-us.net> [220519 17:42]:
> > > On 5/19/22 07:35, Liam Howlett wrote:
> > > > * Guenter Roeck <linux@...ck-us.net> [220517 10:32]:
> > > >
...
> > I have qemu 7.0 which seems to change the default memory size from 32MB
> > to 128MB. This can be seen on your log here:
> >
> > Memory: 27928K/32768K available (2827K kernel code, 160K rwdata, 432K rodata, 1016K init, 66K bss, 4840K reserved, 0K cma-reserved)
> >
> > With 128MB the kernel boots. With 64MB it also boots. 32MB fails with
> > an OOM. Looking into it more, I see that the OOM is caused by a
> > contiguous page allocation of 1MB (order 7 at 8K pages).
...
> > Does anyone have any idea why nommu would be getting this fragmented?
>
> Answer: Why, yes. Matthew does. Using alloc_pages_exact() means we
> allocate the huge chunk of memory then free the leftovers immediately.
> Those freed leftover pages are handed out on the next request - which
> happens to be the maple tree.
>
> It seems nommu is so close to OOMing already that this makes a
> difference. Attached is a patch which _almost_ solves the issue by
> making it less likely to use those pages, but whether it still OOMs
> is a matter of timing. It reduces the failure rate by a large margin,
> to maybe 1/10 boots failing instead of 4/5. This patch is probably
> worth taking on its own as it reduces memory fragmentation on
> short-lived allocations that use alloc_pages_exact().
>
> I changed the nommu code a bit to reduce memory usage as well. During a
> split event, I no longer delete then re-add the VMA and I only
> preallocate a single time for the two writes associated with a split. I
> also moved my pre-allocation ahead of the call path that does
> alloc_pages_exact(). This all but ensures we won't fragment the larger
> chunks of memory as we get enough nodes out of a single page to run at
> least through boot. However, the failure rate remained at 1/10 with
> this change.
>
> I had accepted the scenario that this all just worked before, but my
> setup is different than that of Guenter. I am using buildroot-2022.02.1
> and qemu 7.0 for my testing. My configuration OOMs 12/13 times without
> maple tree, so I think we actually lowered the memory pressure on boot
> with these changes. Obviously there is an element of timing that causes
> variation in the testing, so exact numbers are not possible.
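
As a rough illustration of the alloc_pages_exact() behaviour quoted above, here is a small user-space model in Python (not kernel code). It assumes the usual scheme where the request is rounded up to a power-of-two block of pages and the unused tail pages are freed straight away. The 8K page size comes from this thread; the function names and the 65-page request are made up for the example.

```python
PAGE_SIZE = 8 * 1024  # 8K pages, as mentioned earlier in this thread


def get_order(size: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest order such that (page_size << order) >= size."""
    if size <= 0:
        raise ValueError("size must be positive")
    pages = -(-size // page_size)    # round up to whole pages
    return (pages - 1).bit_length()  # round up to a power of two


def exact_alloc_model(size: int, page_size: int = PAGE_SIZE):
    """Model one exact allocation.

    Returns (order, pages_kept, pages_freed): the order of the
    contiguous block that must be found, the pages the caller keeps,
    and the leftover tail pages handed back immediately.
    """
    order = get_order(size, page_size)
    pages_kept = -(-size // page_size)
    pages_freed = (1 << order) - pages_kept
    return order, pages_kept, pages_freed
```

With these definitions, get_order(1024 * 1024) is 7, matching the "order 7 at 8K pages" figure above. A hypothetical 65-page request, exact_alloc_model(65 * PAGE_SIZE), returns (7, 65, 63): an order-7 block is needed, and 63 pages go straight back to the allocator. Those are the pages a later small allocation can land in, which is the fragmentation the quoted text describes.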

Andrew,

Please add the previous patch to the mm branch; it does not depend on
the maple tree.

Please also include the attached patch as a fix for the maple tree nommu
OOM issue, on top of "nommu: remove uses of VMA linked list". With this
patch, the OOM triggers much less often for me than on a straight-up
buildroot-2022.02.1 build with qemu 7.0. I believe this will fix
Guenter's issues with the maple tree.

Thanks,
Liam
View attachment "0001-mm-nommu-Move-preallocations-and-limit-other-allocat.patch" of type "text/x-diff" (8620 bytes)