Message-ID: <20250407170305.GI1557073@nvidia.com>
Date: Mon, 7 Apr 2025 14:03:05 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Pratyush Yadav <ptyadav@...zon.de>,
Changyuan Lyu <changyuanl@...gle.com>, linux-kernel@...r.kernel.org,
graf@...zon.com, akpm@...ux-foundation.org, luto@...nel.org,
anthony.yznaga@...cle.com, arnd@...db.de, ashish.kalra@....com,
benh@...nel.crashing.org, bp@...en8.de, catalin.marinas@....com,
dave.hansen@...ux.intel.com, dwmw2@...radead.org,
ebiederm@...ssion.com, mingo@...hat.com, jgowans@...zon.com,
corbet@....net, krzk@...nel.org, mark.rutland@....com,
pbonzini@...hat.com, pasha.tatashin@...een.com, hpa@...or.com,
peterz@...radead.org, robh+dt@...nel.org, robh@...nel.org,
saravanak@...gle.com, skinsburskii@...ux.microsoft.com,
rostedt@...dmis.org, tglx@...utronix.de, thomas.lendacky@....com,
usama.arif@...edance.com, will@...nel.org,
devicetree@...r.kernel.org, kexec@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory
preservation
On Mon, Apr 07, 2025 at 07:31:21PM +0300, Mike Rapoport wrote:
> > alloc_pages is a 0 order "folio". vmalloc is an array of 0 order
> > folios (?)
>
> According to current Matthew's plan [1] vmalloc is misc memory :)
Someday! :)
> Ok, let's stick with memdesc then. Put aside the name it looks like we do
> agree that KHO needs to provide a way to preserve memory allocated from
> buddy along with some of the metadata describing that memory, like order
> for multi-order allocations.
+1
> The issue I see with bitmaps is that there's nothing except the order that
> we can save. And if sometime later we'd have to recreate memdesc for that
> memory, that would mean allocating a correct data structure, i.e. struct
> folio, struct slab, struct vmalloc maybe.
Yes. The caller would have to take care of this using a caller
specific serialization of any memdesc data. Like slab would have to
presumably record the object size and the object allocation bitmap.
> I'm not sure we are going to preserve slabs at least at the foreseeable
> future, but vmalloc seems like something that we'd have to address.
And I suspect vmalloc doesn't need to preserve any memdesc information?
It can all be recreated.
> > Also the bitmap scanning to optimize the memblock reserve isn't
> > implemented for xarray.. I don't think this is representative..
>
> I believe that even with optimization of bitmap scanning maple tree would
> perform much better when the memory is not fragmented.
Hard to guess. Bitmap scanning is not free, especially if there are
lots of zeros, but allocating memory for maple tree nodes and locking
them is not free either, so who knows where things cross over..
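To make the scanning cost concrete, here is a minimal sketch of coalescing set bits into runs so that each contiguous preserved range gets one reserve call. It assumes one bit per order-0 page; reserve_fn and reserve_runs() are hypothetical names for illustration, not the KHO code or the memblock API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a memblock_reserve()-style call. */
typedef void (*reserve_fn)(uint64_t start_pfn, uint64_t nr_pages, void *ctx);

/*
 * Walk a bitmap (bit i set means page i is preserved) and report each
 * maximal run of set bits exactly once. Word-aligned all-zero words are
 * skipped with a single compare. Returns the number of runs found.
 */
static size_t reserve_runs(const uint64_t *bitmap, size_t nbits,
                           reserve_fn reserve, void *ctx)
{
    size_t runs = 0;
    size_t i = 0;

    assert(bitmap != NULL || nbits == 0);

    while (i < nbits) {
        /* Fast path: skip a whole zero word. */
        if ((i % 64) == 0 && i + 64 <= nbits && bitmap[i / 64] == 0) {
            i += 64;
            continue;
        }
        /* Clear bit: keep looking for the start of a run. */
        if (!((bitmap[i / 64] >> (i % 64)) & 1)) {
            i++;
            continue;
        }
        /* Start of a run: extend it, possibly across word boundaries. */
        size_t start = i;
        while (i < nbits && ((bitmap[i / 64] >> (i % 64)) & 1))
            i++;
        if (reserve)
            reserve(start, i - start, ctx);
        runs++;
    }
    return runs;
}
```

With unfragmented memory this makes few calls but still has to touch every word; with alternating bits it makes one call per set bit, which is the case where neither structure saves memblock_reserve() calls.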
> And when it is fragmented both will need to call memblock_reserve()
> similar number of times and there won't be real difference. Of
> course maple tree will consume much more memory in the worst case.
Yes.
Bitmaps are bounded like the comment says, 512K for 16G of memory with
arbitrary order 0 fragmentation.
Assuming absolute worst case fragmentation, a maple tree (at 24 bytes
per range, with an alternating allocated/freed pattern) would require
around 50M. That is then almost doubled, since we have both the maple
tree and the serialized copy.
100M vs 512K - I will pick the 512K :)
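The numbers above can be reproduced with a short calculation. This is a sketch that assumes 4K pages (the page size is not stated above) and uses the 24 bytes per range estimate; bitmap_bytes() and range_tree_bytes() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmap cost: one bit per order-0 page. */
static uint64_t bitmap_bytes(uint64_t mem_bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return mem_bytes / page_size / 8;
}

/*
 * Range-tree cost in the worst case: pages alternate between preserved
 * and free, so every second page is its own one-page range.
 */
static uint64_t range_tree_bytes(uint64_t mem_bytes, uint64_t page_size,
                                 uint64_t bytes_per_range)
{
    assert(page_size > 0);
    return mem_bytes / page_size / 2 * bytes_per_range;
}
```

For 16G and 4K pages, bitmap_bytes() gives 524288 bytes (512K), and range_tree_bytes() with 24 bytes per range gives 50331648 bytes (48M), or roughly 96M once the serialized copy is counted as well.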
Jason