Message-ID: <CAHN2nPK+Z5cvQ_waTWyPZiEoeSc9o7e3YnQLLjRzNzrb7VhAqQ@mail.gmail.com>
Date: Thu, 18 Sep 2025 23:49:06 -0700
From: Jason Miu <jasonmiu@...gle.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, Alexander Graf <graf@...zon.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Baoquan He <bhe@...hat.com>, 
	Changyuan Lyu <changyuanl@...gle.com>, David Matlack <dmatlack@...gle.com>, 
	David Rientjes <rientjes@...gle.com>, Joel Granados <joel.granados@...nel.org>, 
	Marcos Paulo de Souza <mpdesouza@...e.com>, Mario Limonciello <mario.limonciello@....com>, 
	Mike Rapoport <rppt@...nel.org>, Petr Mladek <pmladek@...e.com>, 
	"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>, Steven Chen <chenste@...ux.microsoft.com>, 
	Yan Zhao <yan.y.zhao@...el.com>, kexec@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC v1 1/4] kho: Introduce KHO page table data structures

Hi Jason,

On Wed, Sep 17, 2025 at 9:32 AM Jason Gunthorpe <jgg@...dia.com> wrote:
>
> On Wed, Sep 17, 2025 at 12:18:39PM -0400, Pasha Tatashin wrote:
> > On Wed, Sep 17, 2025 at 8:22 AM Jason Gunthorpe <jgg@...dia.com> wrote:
> > >
> > > On Tue, Sep 16, 2025 at 07:50:16PM -0700, Jason Miu wrote:
> > > > + * kho_order_table
> > > > + * +-------------------------------+--------------------+
> > > > + * | 0 order| 1 order| 2 order ... | HUGETLB_PAGE_ORDER |
> > > > + * ++------------------------------+--------------------+
> > > > + *  |
> > > > + *  |
> > > > + *  v
> > > > + * ++------+
> > > > + * |  Lv6  | kho_page_table
> > > > + * ++------+
> > >
> > > I seem to remember suggesting this could be simplified without the
> > > special case 7th level table for order.
> > >
> > > Encode the phys address as:
> > >
> > > (order << 51) | (phys >> (PAGE_SHIFT + order))
> >
> > Why 51 and not 52, this limits to 63bit address space, is it not?
>
> Yeah, might have got the math off
>
> > I like the idea, but I'm trying to find the benefits compared to the
> > current per-order tree approach.
>
> It is probably about half the code compared to what I see here because
> everything is aggressively simplified.

Thank you very much for the feedback. I think this is a very smart
idea.

> > 3. It slightly complicates the logic in the new kernel. Instead of
> > simply iterating a known tree for a specific order, the boot-time
> > walker would need to reconstruct the per-order subtrees, and walk
> > them.
>
> The core walker just runs over a range, it is easy to compute the
> range.

I believe the "range" here refers to the portion of the tree that is
relevant to the `target_order` being restored, where `target_order`
runs from 0 to MAX_PAGE_ORDER as the new kernel walks the tree.

My current understanding of the walker for a given `target_order`:

  1. Find the `start_level` from the `target_order` (for example,
     target_order = 10 gives start_level = 4).
  2. The path from the root down to the level above `start_level` is
     fixed (index 0 at each of these levels).
  3. At `start_level`, the index is also fixed, given by
     (1 << (63 - PAGE_SHIFT - target_order)) within that level's
     9-bit slice.
  4. Then, for all levels *below* `start_level`, the walker iterates
     through all 512 table entries, down to the bitmap level.

So the "range" is the set of subtrees under `start_level`. Is my
understanding correct?
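
The arithmetic behind steps 1 and 3 can be sketched roughly as below.
This is only an illustration under assumed parameters, not code from
the patch: it assumes PAGE_SHIFT = 12, a bitmap leaf that covers 15
key bits (one 4 KiB page of bits), 9 key bits per table level, and
level numbering that counts the bitmap as level 1. All `sk_*` names
are hypothetical. These assumptions were picked because they reproduce
the "target_order = 10, start_level = 4" example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed parameters, for illustration only (not from the patch). */
#define SK_PAGE_SHIFT   12u	/* 4 KiB pages */
#define SK_BITMAP_BITS  15u	/* one 4 KiB bitmap page holds 2^15 bits */
#define SK_TABLE_BITS    9u	/* 512 entries per table level */
#define SK_BITMAP_LEVEL  1u	/* the bitmap is counted as level 1 */

/* Bit position of the order marker: 63 - PAGE_SHIFT - order. */
static unsigned int sk_marker_bit(unsigned int order)
{
	/* The marker must land in a table level, above the bitmap bits. */
	assert(order <= 63u - SK_PAGE_SHIFT - SK_BITMAP_BITS);
	return 63u - SK_PAGE_SHIFT - order;
}

/* Key = order marker bit | physical address in units of the order. */
static uint64_t sk_encode_key(uint64_t phys, unsigned int order)
{
	uint64_t marker = 1ULL << sk_marker_bit(order);
	uint64_t pfn = phys >> (SK_PAGE_SHIFT + order);

	/* The shifted address must stay below the marker bit. */
	assert(pfn < marker);
	return marker | pfn;
}

/* Step 1: the level whose 9-bit slice contains the marker bit. */
static unsigned int sk_start_level(unsigned int order)
{
	unsigned int above_bitmap = sk_marker_bit(order) - SK_BITMAP_BITS;

	return SK_BITMAP_LEVEL + 1u + above_bitmap / SK_TABLE_BITS;
}

/* Step 3: the value the marker contributes to the index at start_level. */
static unsigned int sk_marker_index(unsigned int order)
{
	unsigned int above_bitmap = sk_marker_bit(order) - SK_BITMAP_BITS;

	return 1u << (above_bitmap % SK_TABLE_BITS);
}
```

With these assumptions, order 10 puts the marker at bit 41, which
falls in the level 4 slice and contributes index 256 there, while
order 0 puts the marker at bit 51, in the level 6 (root) slice. Under
this encoding every key of a given order lies in
[1 << bit, 1 << (bit + 1)), which may be one way to read "the range".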

--
Jason Miu
