[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHN2nPL2+qs9MZnOYq-Oh_D1ehOSfSLsViLa8pZ7bt4AE4zZXg@mail.gmail.com>
Date: Tue, 28 Oct 2025 19:08:27 -0700
From: Jason Miu <jasonmiu@...gle.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Alexander Graf <graf@...zon.com>, Andrew Morton <akpm@...ux-foundation.org>,
Baoquan He <bhe@...hat.com>, Changyuan Lyu <changyuanl@...gle.com>,
David Matlack <dmatlack@...gle.com>, David Rientjes <rientjes@...gle.com>,
Jason Gunthorpe <jgg@...dia.com>, Pasha Tatashin <pasha.tatashin@...een.com>,
Pratyush Yadav <pratyush@...nel.org>, kexec@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 1/3] kho: Adopt KHO radix tree data structures
Hi Mike,
Thank you very much for providing your comments.
On Mon, Oct 27, 2025 at 4:53 AM Mike Rapoport <rppt@...nel.org> wrote:
>
> On Mon, Oct 20, 2025 at 03:03:04AM -0700, Jason Miu wrote:
> > Introduce a radix tree data structure for tracking preserved
> > memory pages in KHO, which will replace the current xarray-based
> > implementation.
> >
> > The primary motivation for this change is to eliminate the need for
> > serialization. By marking preserved pages directly in the new KHO
> > radix tree and passing them to the next kernel, the entire
> > serialization process can be removed. This ultimately allows for the
> > removal of the KHO finalize and abort states, simplifying the overall
> > design.
> >
> > The preserved page physical address and its order are encoded in to a
> > value. The KHO radix tree has multiple level of nodes where each node
> > is a table contining a descriptor to the next level of nodes. The
> > encoded value get split and stored its parts along the tree
> > traversal. The tree traversal ends with the `kho_bitmap_table`, where
> > each bit represents a single preserved page.
> >
> > Instead of serializing the memory map, the first kernel store the KHO
> > radix tree root in the FDT. This KHO radix tree root is passed to the
> > second kernel after kexec, hence elimitated the KHO finalize and abort
> > states.
> >
> > The second kernel walks the passed-in KHO radix tree from its root. It
> > restores the memory pages and their orders by decoding the value
> > stored in the KHO radix tree.
> >
> > This architectural shift to using a shared radix tree structure
> > simplifies the KHO design and eliminates the overhead of serializing
> > and deserializing the preserved memory map.
> >
> > Signed-off-by: Jason Miu <jasonmiu@...gle.com>
> > ---
> > MAINTAINERS | 2 +
> > include/linux/kexec_handover.h | 23 +-
> > .../linux/live_update/abi/kexec_handover.h | 10 +
> > kernel/kexec_handover.c | 861 +++++++++---------
> > 4 files changed, 458 insertions(+), 438 deletions(-)
> > create mode 100644 include/linux/live_update/abi/kexec_handover.h
> >
> > diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
> > index 348844cffb13..bc2f9e060a79 100644
> > --- a/include/linux/kexec_handover.h
> > +++ b/include/linux/kexec_handover.h
> > @@ -45,6 +28,7 @@ int kho_preserve_folio(struct folio *folio);
> > int kho_preserve_phys(phys_addr_t phys, size_t size);
> > struct folio *kho_restore_folio(phys_addr_t phys);
> > int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt);
> > +int kho_remove_subtree(const char *name);
>
> This change seems unrelated.
I want to add the "kho_remove_subtree()" as a counterpart of the
"kho_add_subtree()". You are right that "kho_remove_subtree()" is not
very related to this radix tree data struct change. Do you prefer I
split this kho_remove_subtree() change as another patch in this
series, or do you prefer another independent patch series? (I like the
latter option more personally)
>
> > int kho_retrieve_subtree(const char *name, phys_addr_t *phys);
> >
> > int register_kho_notifier(struct notifier_block *nb);
> > @@ -86,6 +70,11 @@ static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys)
> > return -EOPNOTSUPP;
> > }
> >
> > +static inline int kho_remove_subtree(const char *name)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
>
> Ditto.
>
> > static inline int register_kho_notifier(struct notifier_block *nb)
> > {
> > return -EOPNOTSUPP;
> > diff --git a/include/linux/live_update/abi/kexec_handover.h b/include/linux/live_update/abi/kexec_handover.h
> > new file mode 100644
> > index 000000000000..3154e5c33851
> > --- /dev/null
> > +++ b/include/linux/live_update/abi/kexec_handover.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef _ABI__KEXEC_HANDOVER_H
> > +#define _ABI__KEXEC_HANDOVER_H
> > +
> > +#define KHO_FDT_COMPATIBLE "kho-v2"
> > +#define PROP_PRESERVED_PAGE_RADIX_TREE "preserved-page-radix-tree"
>
> The property describes the preserved memory, it's not important if it's a
> radix tree, a linked list or whatever else, so I don't think
> page-radix-tree is a good name.
>
> IMHO we can keep the preserved-memory-map name because version bump ensures
> we won't attempt to parse v1 data structure.
>
Sure, I can just keep the preserved-memory-map name.
> > +#define PROP_SUB_FDT "fdt"
> > +
> > +#endif
> > diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
> > index ecd1ac210dbd..2fc5975690a7 100644
> > --- a/kernel/kexec_handover.c
> > +++ b/kernel/kexec_handover.c
> > @@ -46,143 +45,305 @@ static int __init kho_parse_enable(char *p)
> > }
> > early_param("kho", kho_parse_enable);
> >
> > +typedef int (*kho_radix_tree_walk_callback_t)(unsigned long encoded);
> > +
> > /*
> > - * Keep track of memory that is to be preserved across KHO.
>
> Please leave this sentence.
>
> > + * The KHO radix tree tracks preserved memory pages. It is a hierarchical
> > + * structure that starts with a single root `kho_radix_tree`. This single
> > + * tree stores pages of all orders.
>
> I'd combine this comment and the comment above declaration of enum
> kho_radix_consts into something like this:
>
Yes, I will move the kho_radix_consts up so its comment will be
combined and stay close to the comment of kho_radix_tree.
> /*
> * KHO tracks preserved memory using a radix tree data structure. Each node of
> * the tree is PAGE_SIZE and it is PAGE_SIZE aligned.
> *
> * The leaf nodes of the tree are bitmaps where each set bit represents a
> * single preserved page. The intermediate and root nodes of the tree are
> * tables of physical addresses that point to a lower level node:
> *
> * kho_radix_tree_root
> * +-------------------+
> * | Level 5 | (struct kho_radix_tree)
> * +-------------------+
> * |
> * v
> * +-------------------+
> * | Level 4 | (struct kho_radix_tree)
> * +-------------------+
> * |
> * | ... (intermediate levels)
> * |
> * v
> * +-------------------+
> * | Level 0 | (struct kho_bitmap_table)
> * +-------------------+
> *
> * The physical address of the preserved page and its order are encoded into a
> * single unsigned long value where higher bits represent the order and the
> * lower bits represent the page offset:
> *
> * <-- Higher Bits ------------------------------------ Lower Bits -->
> * +--------------------------+-----------------------------------------+
> * | Order Bits | Page Offset |
> * +--------------------------+-----------------------------------------+
> * |0| ... 0 0 0 1 | pa >> (PAGE_SHIFT + 0) |
> * |1| ... 0 0 0 0 1 | pa >> (PAGE_SHIFT + 1) |
> * |2| ... 0 0 0 0 0 1 | pa >> (PAGE_SHIFT + 2) |
> * +------------------------------+-------------------------------------+
> * ^
> * |
> * This single '1' bit's position uniquely identifies the 'order'.
> *
> * Page Offset:
> * The 'page offset' is the physical address normalized for its order. It
> * effectively represents the page offset for the given order.
> *
> * Order Bits:
> * The 'order bits' encode the page order by setting a single bit at a
> * specific position. The position of this bit itself represents the order.
> *
> * For instance, on a 64-bit system with 4KB pages (PAGE_SHIFT = 12), the
> * maximum range for a page offset (for order 0) is 52 bits (64 - 12). This
> * offset occupies bits [0-51]. For order 0, the order bit is set at
> * position 52.
> *
> * As the order increases, the number of bits required for the 'page offset'
> * decreases. For example, order 1 requires one less bit for its page
> * offset. This allows its order bit to be set at position 51,
> * i.e. shifting right, without conflicting with the page offset bits.
> *
> * Bitfields in this encoded value are used as indices into the tables for
> * upper levels and as bitmap index for the lowest level.
> *
> * The following diagram illustrates how the encoded value is split into
> * indices for the tree levels, with PAGE_SIZE of 4KB:
> *
> * 63:60 59:51 50:42 41:33 32:24 23:15 14:0
> * +---------+--------+--------+--------+--------+--------+-----------------+
> * | 0 | Lv 5 | Lv 4 | Lv 3 | Lv 2 | Lv 1 | Lv 0 (bitmap) |
> * +---------+--------+--------+--------+--------+--------+-----------------+
> *
> * This design stores pages of all sizes (orders) in a single 6-level table.
> * It efficiently shares lower table levels, especially due to common zero top
> * address bits, allowing a single, efficient algorithm to manage all pages.
> *
> * < a paragraph about memory consumption >
> */
>
> > + *
> > + * This is achieved by encoding the page's physical address and its order into
> > + * a single `unsigned long` value. This encoded value is then used to traverse
> > + * the tree.
>
> ...
>
> > + * Each `kho_radix_tree` (Levels 1-5) and `kho_bitmap_table` (Level 0) is
> > + * PAGE_SIZE. Each entry in a `kho_radix_tree` is a descriptor (a physical
> > + * address) pointing to the next level node. For Level 1 `kho_radix_tree`
> > + * nodes, these descriptors point to a `kho_bitmap_table`. The final
> > + * `kho_bitmap_table` is a bitmap where each set bit represents a single
> > + * preserved page.
> > */
>
> Please add an empty line between description and the declaration.
>
> > +struct kho_radix_tree {
>
> I think the name should reflect that this data structure is for tracking
> memory in KHO. Otherwise it reads like a generic data structure that can be
> used with KHO, like kho_array Pratyush proposed a while ago.
>
> Maybe struct kho_mem_radix_tree? Or even drop the 'radix', I don't think
> it's really important here.
>
I use the 'radix' here as we need a step to convert the page address
to a radix tree key. I agree that `kho_mem_radix_tree` sounds better.
> > + phys_addr_t table[PAGE_SIZE / sizeof(phys_addr_t)];
> > +};
> >
> > -#define PRESERVE_BITS (512 * 8)
> > -
> > -struct kho_mem_phys_bits {
> > - DECLARE_BITMAP(preserve, PRESERVE_BITS);
> > +struct kho_bitmap_table {
> > + unsigned long bitmaps[PAGE_SIZE / sizeof(unsigned long)];
>
> What's wrong with kho_mem_phys_bits and DECLARE_BITMAP?
> Simply adjust PRESERVE_BITS to match PAGE_SIZE.
>
I used `kho_bitmap_table` as I wanted to highlight it as a patch of
bitmap that has to be in the size of PAGE_SIZE. I found the original
`kho_mem_phys_bits` harder to understand as a bit in this bitmap is
referring to one memory page. I can update it as
#define PRESERVE_PAGE_BITS (PAGE_SIZE * 8)
struct kho_bitmap {
DECLARE_BITMAP(preserve_page, PRESERVE_PAGE_BITS);
};
what do you think?
> > };
> >
> > -struct kho_mem_phys {
> > +/*
> > + * The KHO radix tree tracks preserved pages by encoding a page's physical
>
> ...
>
> > + * address bits, allowing a single, efficient algorithm to manage all pages.
> > + */
> > +enum kho_radix_consts {
>
> I think this should be declared before the structs, so that there will be
> one long description before the enum, structs and code.
>
yes, will move this enum up.
> > + /* The bit position of a 0-order page, only supports 64bits system */
> > + ORDER_0_LG2 = 64 - PAGE_SHIFT,
> > + /* Bit number used for each level of tables */
> > + TABLE_SIZE_LG2 = const_ilog2(PAGE_SIZE / sizeof(phys_addr_t)),
> > + /* Bit number used for lowest level of bitmaps */
> > + BITMAP_SIZE_LG2 = PAGE_SHIFT + const_ilog2(BITS_PER_BYTE),
> > /*
> > - * Points to kho_mem_phys_bits, a sparse bitmap array. Each bit is sized
> > - * to order.
> > + * The total tree depth is the number of intermediate levels
> > + * and 1 bitmap level.
> > */
> > - struct xarray phys_bits;
> > + TREE_MAX_DEPTH = DIV_ROUND_UP(ORDER_0_LG2 - BITMAP_SIZE_LG2,
> > + TABLE_SIZE_LG2) + 1,
> > };
> >
> > -struct kho_mem_track {
> > - /* Points to kho_mem_phys, each order gets its own bitmap tree */
> > - struct xarray orders;
> > -};
> > +/*
> > + * `kho_radix_tree_root` points to a page thats serves as the root of the
> > + * KHO radix tree. This page is allocated during KHO module initialization.
> > + * Its physical address is written to the FDT and passed to the next kernel
> > + * during kexec.
> > + */
> > +static struct kho_radix_tree *kho_radix_tree_root;
> > +static DECLARE_RWSEM(kho_radix_tree_root_sem);
>
> I believe it's clearer to have struct kho_mem_track containing the pointer
> to root and the semaphore, e.g.
>
> struct kho_mem_track {
> struct kho_mem_tree *root;
> struct rw_semaphore sem;
> };
>
Sure.
> > -struct khoser_mem_chunk;
> > +static inline phys_addr_t kho_radix_tree_desc(struct kho_radix_tree *va)
>
> I think kho_mem_tree is a better prefix than kho_radix.
As above, I tend to use `kho_mem_radix_tree`.
>
> > +{
> > + return (phys_addr_t)virt_to_phys(va);
> > +}
> >
> > -struct kho_serialization {
> > - struct page *fdt;
> > - struct list_head fdt_list;
> > - struct dentry *sub_fdt_dir;
> > - struct kho_mem_track track;
> > - /* First chunk of serialized preserved memory map */
> > - struct khoser_mem_chunk *preserved_mem_map;
> > -};
> > +static inline struct kho_radix_tree *kho_radix_tree(phys_addr_t desc)
>
> This is not really a 'desc', this is a plain physical address, i.e. 'pa'.
> It's probably will be clearer to just use phys_to_virt() and virt_to_phys()
> directly rather than introduce these wrappers.
We had some previous discussions that using helper functions for
converting between the va and pa. But I agree with what you said that
"this is not ready a desc". How about "kho_radix_tree(phys_addr_t
node_pa)", to align with the variable name of the tree iterator below?
>
> > +{
> > + return (struct kho_radix_tree *)(phys_to_virt(desc));
> > +}
> >
> > -static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz)
> > +static struct kho_radix_tree *kho_alloc_radix_tree(void)
> > {
> > - void *elm, *res;
> > + return (struct kho_radix_tree *)get_zeroed_page(GFP_KERNEL);
> > +}
> >
> > - elm = xa_load(xa, index);
> > - if (elm)
> > - return elm;
> > +static unsigned long kho_radix_encode(phys_addr_t pa, unsigned int order)
>
> This name is really cryptic. Encode to what? Can't say I have great ideas,
> but this name should reflect that physical address and order are encoded
> into combined index into the tree.
>
> Maybe _encode_indices()?
As the "indices" can also refer to an entry of the table in a KHO mem
radix tree node. We are actually encoding a radix tree key, so may be
kho_radix_encode_key() ?
>
> > +{
> > + /* Order bits part */
> > + unsigned long h = 1UL << (ORDER_0_LG2 - order);
> > + /* Page offset part */
> > + unsigned long l = pa >> (PAGE_SHIFT + order);
> >
> > - elm = kzalloc(sz, GFP_KERNEL);
> > - if (!elm)
> > - return ERR_PTR(-ENOMEM);
> > + return h | l;
> > +}
> >
> > - res = xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL);
> > - if (xa_is_err(res))
> > - res = ERR_PTR(xa_err(res));
> > +static phys_addr_t kho_radix_decode(unsigned long encoded, unsigned int *order)
>
> The same here. Maybe
>
> _decode_indices(unsigned long indices, unsigned int *order)
>
> > +{
> > + unsigned int order_bit = fls64(encoded);
> > + phys_addr_t pa;
> >
> > - if (res) {
> > - kfree(elm);
> > - return res;
> > - }
> > + /* order_bit is numbered starting at 1 from fls64 */
> > + *order = ORDER_0_LG2 - order_bit + 1;
> > + /* The order is discarded by the shift */
> > + pa = encoded << (PAGE_SHIFT + *order);
> >
> > - return elm;
> > + return pa;
> > }
> >
> > -static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn,
> > - unsigned long end_pfn)
> > +static unsigned long kho_radix_get_index(unsigned long encoded,
>
> I don't like 'encoded' here and below as well.
>
If we agree with the name above, I think we can use
"kho_radix_get_index(unsigned long radix_key" here.
> > + unsigned int level)
> > {
> > - struct kho_mem_phys_bits *bits;
> > - struct kho_mem_phys *physxa;
> > -
> > - while (pfn < end_pfn) {
> > - const unsigned int order =
> > - min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
> > - const unsigned long pfn_high = pfn >> order;
> > + int s;
> >
> > - physxa = xa_load(&track->orders, order);
> > - if (!physxa)
> > - continue;
> > + if (level == 0)
> > + return encoded % (1 << BITMAP_SIZE_LG2);
> >
> > - bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS);
> > - if (!bits)
> > - continue;
> > + s = ((level - 1) * TABLE_SIZE_LG2) + BITMAP_SIZE_LG2;
> > + return (encoded >> s) % (1 << TABLE_SIZE_LG2);
> > +}
> >
> > - clear_bit(pfn_high % PRESERVE_BITS, bits->preserve);
> > +static int kho_radix_set_bitmap(struct kho_bitmap_table *bit_tlb,
>
> ^ bitmap
Yes, let's just call it `struct kho_bitmap`.
>
> > + unsigned long offset)
> > +{
> > + if (!bit_tlb ||
> > + offset >= PAGE_SIZE * BITS_PER_BYTE)
>
> No need for two lines here
>
> > + return -EINVAL;
> >
> > - pfn += 1 << order;
> > - }
> > + __set_bit(offset, bit_tlb->bitmaps);
> > + return 0;
> > }
> >
> > -static int __kho_preserve_order(struct kho_mem_track *track, unsigned long pfn,
> > - unsigned int order)
> > +static int kho_radix_preserve_page(phys_addr_t pa, unsigned int order)
> > {
> > - struct kho_mem_phys_bits *bits;
> > - struct kho_mem_phys *physxa, *new_physxa;
> > - const unsigned long pfn_high = pfn >> order;
> > + unsigned long encoded = kho_radix_encode(pa, order);
> > + struct kho_radix_tree *current_tree, *new_tree;
>
> current_ is excessive IMHO and 'node' seems to be a better name for a tree
> iterator:
>
> struct kho_radix_tree *node;
>
> is fine, the new_tree can be declared in the scope that uses it.
>
> > + struct kho_bitmap_table *bitmap_table;
>
> _table is excessive on both the type name and variable name
>
> > + unsigned int tree_level = TREE_MAX_DEPTH - 1;
>
> Just 'level' is enough.
>
> > + unsigned int err = 0;
>
> err is signed.
>
> > + unsigned int i, idx;
> >
> > - might_sleep();
> > + down_write(&kho_radix_tree_root_sem);
> >
> > - physxa = xa_load(&track->orders, order);
> > - if (!physxa) {
> > - int err;
> > + if (!kho_radix_tree_root) {
>
> Isn't this an assert?
> Also I don't see concurrency for accessing the root, the assertion can be
> outside the lock.
>
Yes, I will add a WARN_ON here.
> > + err = -EINVAL;
> > + goto out;
> > + }
> >
> > - new_physxa = kzalloc(sizeof(*physxa), GFP_KERNEL);
> > - if (!new_physxa)
> > - return -ENOMEM;
> > + current_tree = kho_radix_tree_root;
> >
> > - xa_init(&new_physxa->phys_bits);
> > - physxa = xa_cmpxchg(&track->orders, order, NULL, new_physxa,
> > - GFP_KERNEL);
> > + /* Go from high levels to low levels */
> > + for (i = tree_level; i >= 0; i--) {
> > + idx = kho_radix_get_index(encoded, i);
> >
> > - err = xa_err(physxa);
> > - if (err || physxa) {
> > - xa_destroy(&new_physxa->phys_bits);
> > - kfree(new_physxa);
> > + if (i == 0) {
> > + bitmap_table = (struct kho_bitmap_table *)current_tree;
> > + err = kho_radix_set_bitmap(bitmap_table, idx);
>
> I'd consider making the loop until 'i > 0' and putting this outside the
> loop.
>
> > + goto out;
> > + }
> > +
>
> if (node->table[idx]) {
> node = kho_radix_tree(node->table[idx]);
> continue;
> }
>
> and reduce the indentation for allocating a new table.
>
> > + if (!current_tree->table[idx]) {
> > + new_tree = kho_alloc_radix_tree();
> > + if (!new_tree) {
> > + err = -ENOMEM;
> > + goto out;
> > + }
> >
> > - if (err)
> > - return err;
> > + current_tree->table[idx] =
> > + kho_radix_tree_desc(new_tree);
> > + current_tree = new_tree;
> > } else {
> > - physxa = new_physxa;
> > + current_tree = kho_radix_tree(current_tree->table[idx]);
> > }
> > }
> >
> > - bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS,
> > - sizeof(*bits));
> > - if (IS_ERR(bits))
> > - return PTR_ERR(bits);
> > +out:
> > + up_write(&kho_radix_tree_root_sem);
> > + return err;
> > +}
> > +
> > +static int kho_radix_walk_bitmaps(struct kho_bitmap_table *bit_tlb,
> > + unsigned long encoded,
> > + kho_radix_tree_walk_callback_t cb)
> > +{
> > + unsigned long *bitmap = (unsigned long *)bit_tlb;
> > + unsigned int i;
> > + int err = 0;
> > +
> > + for_each_set_bit(i, bitmap, PAGE_SIZE * BITS_PER_BYTE) {
> > + err = cb(encoded | i);
> > + if (err)
> > + return err;
> > + }
> >
> > - set_bit(pfn_high % PRESERVE_BITS, bits->preserve);
> > + return 0;
> > +}
> > +
> > +static int kho_radix_walk_trees(struct kho_radix_tree *root, unsigned int level,
>
> static int kho_radix_walk_tree(struct kho_radix_tree *parent, ...
>
> > + unsigned long start,
> > + kho_radix_tree_walk_callback_t cb)
> > +{
> > + struct kho_radix_tree *next_tree;
>
> struct kho_radix_tree *node;
>
> > + struct kho_bitmap_table *bitmap_table;
>
> ^ bitmap;
>
> > + unsigned long encoded, i;
> > + unsigned int level_shift;
>
> This can move to the scope where it's used and 'shift' would be enough.
>
> > + int err = 0;
>
> ...
>
> > +static int __kho_preserve_order(unsigned long pfn, unsigned int order)
> > +{
> > + unsigned long pa = PFN_PHYS(pfn);
> > +
> > + might_sleep();
> > +
> > + return kho_radix_preserve_page(pa, order);
>
> I don't think that __kho_preserve_order() wrapper is useful, just rename
> kho_radix_preserve_page to __kho_preserve_order()
Will merge both.
>
> > +}
> > +
> > /* almost as free_reserved_page(), just don't free the page */
> > static void kho_restore_page(struct page *page, unsigned int order)
> > {
> > @@ -224,152 +385,49 @@ struct folio *kho_restore_folio(phys_addr_t phys)
> > }
> > EXPORT_SYMBOL_GPL(kho_restore_folio);
>
> ...
>
> > -static int kho_mem_serialize(struct kho_serialization *ser)
> > +static int __init kho_radix_walk_trees_memblock_callback(unsigned long encoded)
>
> I wonder if we expect other callbacks for tree traversal?
> For now there's only one case when we traverse the tree, so it seems
> simpler to just merge the code here into kho_radix_walk_bitmaps() and
> rename kho_radix_walk_trees() and kho_radix_walk_bitmaps() to reflect that
> they are reserving the preserved memory.
>
Currently only `kho_radix_walk_trees_memblock_callback()` is used in
tree walking, but having a callback in the tree walking helps for
debugging the KHO mem tree. I can imagine that we can have different
build options for different callbacks. If there is no big objection, I
tend to keep the callback.
> > {
> > + unsigned int order;
> > + unsigned long pa;
> > + struct page *page;
> > + int sz;
> >
> > + pa = kho_radix_decode(encoded, &order);
> > + sz = 1 << (order + PAGE_SHIFT);
> > + page = phys_to_page(pa);
> >
> > + /* Reserve the memory preserved in KHO radix tree in memblock */
> > + memblock_reserve(pa, sz);
> > + memblock_reserved_mark_noinit(pa, sz);
> > + page->private = order;
> > }
> >
> > static void __init kho_mem_deserialize(const void *fdt)
>
> Well, it's not deserialize anymore.
Yeah, I did think about this name too... How about kho_mem_retrieve() ?
>
> > {
> > @@ -599,6 +657,35 @@ static int kho_debugfs_fdt_add(struct list_head *list, struct dentry *dir,
> > return 0;
> > }
>
> It seems that a lot of the changes below are not related to changing the
> mechanism for tracking the preserved memory.
>
> I'm stopping here.
>
> --
> Sincerely yours,
> Mike.
Thanks again, Mike.
Powered by blists - more mailing lists