Message-ID: <68d6df3f410de_1052010059@dwillia2-mobl4.notmuch>
Date: Fri, 26 Sep 2025 11:45:19 -0700
From: <dan.j.williams@...el.com>
To: Michał Cłapiński <mclapinski@...gle.com>,
	<dan.j.williams@...el.com>
CC: Mike Rapoport <rppt@...nel.org>, Ira Weiny <ira.weiny@...el.com>, "Dave
 Jiang" <dave.jiang@...el.com>, Vishal Verma <vishal.l.verma@...el.com>,
	<jane.chu@...cle.com>, Pasha Tatashin <pasha.tatashin@...een.com>, "Tyler
 Hicks" <code@...icks.com>, <linux-kernel@...r.kernel.org>,
	<nvdimm@...ts.linux.dev>
Subject: Re: [PATCH 1/1] nvdimm: allow exposing RAM carveouts as NVDIMM DIMM
 devices

Michał Cłapiński wrote:
[..]
> > As Mike says you would lose 128K at the end, but that indeed becomes
> > losing that 1GB given alignment constraints.
> >
> > However, I think that could be solved by just separately vmalloc'ing the
> > label space for this. Then instead of kernel parameters to sub-divide a
> > region, you just have an initramfs script to do the same.
> >
> > Does that meet your needs?
> 
> Sorry, I'm having trouble imagining this.
> If I wanted 500 1GB chunks, I would request a region of 500GB+space
> for the label? Or is that a label and info-blocks?

You would specify a memmap= range of 500GB+128K*.
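
As a rough sketch of the arithmetic, the snippet below builds a memmap= argument sized for 500GB of namespace capacity plus 128K of label space. It uses the nn[KMG]!ss[KMG] form of memmap=, which marks a range as persistent memory. The start address is a hypothetical placeholder, and alignment requirements on real hardware may call for a different size or offset.

```shell
#!/bin/sh
# Sketch only: size a memmap= range for 500GB of capacity plus 128K of labels.
region_gib=500          # capacity to be divided into namespaces
label_kib=128           # label space requested at the end of the range
start="0x100000000"     # hypothetical start address; choose a range that is
                        # actually free RAM on the target system

# Express everything in KiB so the 128K tail is not lost to rounding.
size_kib=$(( region_gib * 1024 * 1024 + label_kib ))

memmap_arg="memmap=${size_kib}K!${start}"
echo "$memmap_arg"
```

The resulting string would be appended to the kernel command line.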

Force attach that range to Mike's RAMDAX driver.

[ modprobe -r nd_e820, don't build nd_e820, or have modprobe policy block nd_e820 ]
echo ramdax > /sys/bus/platform/devices/e820_pmem/driver_override
echo e820_pmem > /sys/bus/platform/drivers/ramdax/bind

* forget what I said about vmalloc() previously, not needed

> Then on each boot the kernel would check if there is an actual
> label/info-blocks in that space and if yes, it would recreate my
> devices (including the fsdax/devdax type)?

Right, if that range is persistent the kernel would automatically parse
the label space each boot and divide up the 500GB region space into
namespaces.

128K of label space gives you 509 potential namespaces.
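
One plausible accounting for the 509 figure is sketched below. The 256-byte label size, the two 256-byte index blocks, and the one slot held free for label updates are all assumptions made for illustration; they are not stated in this thread.

```shell
#!/bin/sh
# Sketch only: one way 128K of label space could yield 509 namespaces.
config_size=$(( 128 * 1024 ))   # label area size in bytes
label_size=256                  # assumption: bytes per namespace label
index_size=256                  # assumption: bytes per index block (two blocks)

# Slots left once the two index blocks are carved out of the label area.
slots=$(( (config_size - 2 * index_size) / label_size ))

# Assumption: one slot stays free so a label can be rewritten safely.
namespaces=$(( slots - 1 ))
echo "$namespaces"
```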

> One of the requirements for live update is that the kexec reboot has
> to be fast. My solution introduced a delay of tens of milliseconds
> since the actual device creation is asynchronous. Manually dividing a
> region into thousands of devices from userspace would be very slow but

Wait, 500GB Region / 1GB Namespace = thousands of Namespaces?

> I would have to do that only on the first boot, right?

Yes, the expectation is to incur that overhead only once. It also allows
VMs to look up their capacity by name. So you do not need a separate
mapping of 1GB Namespace blocks to VMs. Just give some VMs bigger
Namespaces than others by name.
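
A first-boot initramfs script along these lines might look like the sketch below. It only prints the ndctl commands rather than running them. The region name, VM names, and sizes are hypothetical, and whether ndctl works unchanged against a RAMDAX-backed region is an assumption here.

```shell
#!/bin/sh
# Sketch only: plan one named devdax namespace per VM on first boot.
# Each argument after the region is a hypothetical "name:size" pair.
plan_namespaces() {
    region="$1"
    shift
    for spec in "$@"; do
        name=${spec%%:*}    # text before the first colon
        size=${spec#*:}     # text after the first colon
        echo "ndctl create-namespace --region=$region --mode=devdax --size=$size --name=$name"
    done
}

# Dry run: review the printed commands before piping them to sh.
plan_namespaces region0 vm-small:1G vm-large:4G
```

On later boots a VM launcher could, under the same assumptions, find its device with something like `ndctl list --namespaces` and match on the name field of the JSON output.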
