Message-ID: <0342fbda-9901-4293-afa7-ba6085eb1688@landley.net>
Date: Mon, 15 Sep 2025 11:43:50 -0500
From: Rob Landley <rob@...dley.net>
To: Askar Safin <safinaskar@...omail.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Christian Brauner <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
Jens Axboe <axboe@...nel.dk>, Andy Shevchenko <andy.shevchenko@...il.com>,
Aleksa Sarai <cyphar@...har.com>,
Thomas Weißschuh <thomas.weissschuh@...utronix.de>,
Julian Stecklina <julian.stecklina@...erus-technology.de>,
Gao Xiang <hsiangkao@...ux.alibaba.com>, Art Nikpal <email2tema@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, Eric Curtin <ecurtin@...hat.com>,
Alexander Graf <graf@...zon.com>, Lennart Poettering <mzxreary@...inter.de>,
linux-arch@...r.kernel.org, linux-alpha@...r.kernel.org,
linux-snps-arc@...ts.infradead.org, linux-arm-kernel@...ts.infradead.org,
linux-csky@...r.kernel.org, linux-hexagon@...r.kernel.org,
loongarch@...ts.linux.dev, linux-m68k@...ts.linux-m68k.org,
linux-mips@...r.kernel.org, linux-openrisc@...r.kernel.org,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-riscv@...ts.infradead.org, linux-s390@...r.kernel.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
linux-um@...ts.infradead.org, x86@...nel.org, Ingo Molnar
<mingo@...hat.com>, linux-block@...r.kernel.org, initramfs@...r.kernel.org,
linux-api@...r.kernel.org, linux-doc@...r.kernel.org,
linux-efi@...r.kernel.org, linux-ext4@...r.kernel.org,
"Theodore Y . Ts'o" <tytso@....edu>, linux-acpi@...r.kernel.org,
Michal Simek <monstr@...str.eu>, devicetree@...r.kernel.org,
Luis Chamberlain <mcgrof@...nel.org>, Kees Cook <kees@...nel.org>,
Thorsten Blum <thorsten.blum@...ux.dev>, Heiko Carstens <hca@...ux.ibm.com>,
patches@...ts.linux.dev
Subject: Re: [PATCH 00/62] initrd: remove classic initrd support
On 9/12/25 17:38, Askar Safin wrote:
> Intro
> ====
> This patchset removes classic initrd (initial RAM disk) support,
> which was deprecated in 2020.
Still useful for embedded systems that can memory map flash, but it's
getting harder to find embedded developers who consider new kernels an
improvement over older ones...
> Initramfs still stays, and RAM disk itself (brd) still stays, too.
While you're at it, could you fix static/builtin initramfs so PID 1 has
a valid stdin/stdout/stderr?
A static initramfs won't create /dev/console if the embedded initramfs
image doesn't contain it, which a non-root build can't mknod, so the
kernel plumbing won't see the device node in the directory we point it at unless
we build with root access. This means the open("/dev/console") fails, so
init starts with no error reporting and we have to get far enough to
mount our own devtmpfs or similar and open our own stdout/stderr before
we can see any error output from init, which is kinda brittle.
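For what it's worth, the node doesn't need mknod to exist inside an archive: a cpio "newc" record just stores the mode and the major/minor numbers as ASCII hex. A minimal Python sketch, assuming the standard newc layout and the usual char 5,1 numbers for /dev/console. The function names are made up for illustration, and how the result gets combined with the rest of a built-in image is not something this shows:

```python
import stat


def newc_entry(name, mode, rdev_major=0, rdev_minor=0, data=b"", ino=0):
    """Build one cpio "newc" record: 110-byte ASCII header, name, data."""
    name_bytes = name.encode() + b"\0"
    fields = [
        ino,                                 # inode number
        mode,                                # file type and permission bits
        0,                                   # uid (root)
        0,                                   # gid (root)
        2 if stat.S_ISDIR(mode) else 1,      # link count
        0,                                   # mtime
        len(data),                           # file size
        0, 0,                                # major/minor of containing device
        rdev_major, rdev_minor,              # major/minor for device nodes
        len(name_bytes),                     # name size, including the NUL
        0,                                   # checksum (unused by newc)
    ]
    record = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    record += b"\0" * (-len(record) % 4)     # pad header+name to 4 bytes
    record += data
    record += b"\0" * (-len(record) % 4)     # pad data to 4 bytes
    return record


def console_cpio():
    """Archive holding only dev/ and dev/console (assumed char 5,1)."""
    dev_dir = newc_entry("dev", stat.S_IFDIR | 0o755, ino=1)
    console = newc_entry("dev/console", stat.S_IFCHR | 0o600, 5, 1, ino=2)
    return dev_dir + console + newc_entry("TRAILER!!!", 0)
```

Writing console_cpio() out to a file needs no root access, since nothing ever touches a real device node.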
I posted various patches to make CONFIG_DEVTMPFS_MOUNT work for initmpfs
repeatedly since 2017, which also addressed it, but the kernel
community's been hermetically sealed against outside intrusion for a
while now...
https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html
https://lkml.iu.edu/2302.2/05597.html
> init/do_mounts* and init/*initramfs* are listed in VFS entry in
> MAINTAINERS, so I think this patchset should go through VFS tree.
> This patchset touchs every subdirectory in arch/, so I tested it
> on 8 (!!!) archs in Qemu (see details below).
Oh hey, somebody using mkroot. Cool. :)
My current "passes basic automated smoketests" list for 6.16 is:
aarch64 armv4l armv5l armv7l i486 i686 m68k mips64 mipsel mips powerpc
powerpc64le powerpc64 riscv32 riscv64 s390x sh4 x86_64
I'm assuming that's your 8: arm, x86, m68k, mips, ppc, riscv, s390x,
superh. (The variants are mostly 32/64 bit and big/little endian, couple
architecture generations in there. The old ones go out of patent first,
you can always tell patents are about to expire and get generic clones
when corporate shills start insisting that support for something REALLY
NEEDS TO GO AWAY RIGHT NOW...)
The or1k, microblaze, and sh4eb targets mostly work: sh4eb has broken
ethernet (never tracked down whether it's kernel or qemu that's wrong, I
just know they disagree), or1k doesn't know how to exit ala
https://lists.gnu.org/archive/html/qemu-devel/2024-11/msg04522.html and
microblaze never wired up -hda to their hard drive emulation
https://lists.nongnu.org/archive/html/qemu-devel/2025-01/msg01149.html
but I haven't had the spoons to argue with IBM Hat developers about
procedure compliance auditing.
I need to track down a decent qemu emulation for armv7m, last time I
tried with vanilla was https://landley.net/notes-2023.html#23-02-2023
which was not promising, I downloaded a pic32 qemu fork last week, but
haven't had the spoons to follow up on that either. Or to ship a new
toybox/mkroot release: I've had 6.16 kernel patches since the week it
came out, unbreaking powerpc and adding fdpic support to sh4-mmu, but
hobbyist friendly this community ain't. Sigh, I should get back on the
(beating a dead) horse...
I had hexagon userspace working for a while ("qemu-hexagon ls -l") but
no kernel for it: Taylor Simpson said he was going to post a
qemu-system-hexagon patchset with a comet board emulation, but that
architecture has no gcc support (there was a gcc fork on code aurora but
they abandoned it when the FSF went gplv3) so it needs an llvm-only
toolchain build with a non-vanilla musl libc fork... Honestly the
problem is compiler-rt sucks rocks: I should cycle back around to
https://landley.net/notes-2021.html#28-07-2021 but just haven't.
Although part of the "Just haven't" is that I posted a patch to lkml
making generic $CROSS_COMPILE prefixes automatically work whether your
toolchain was gcc or llvm, and the response was literally "we decided to
manually specify LLVM= on the command line so you must always do that
and we're refusing your two line fix to NOT need to do that". No really:
https://lkml.iu.edu/2302.2/08170.html
> Warning: this patchset renames CONFIG_BLK_DEV_INITRD (!!!) to CONFIG_INITRAMFS
> and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_* (for example,
> CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP).
> If you still use initrd, see below for workaround.
Which will break existing configs for what benefit?
I'm not convinced the churn improves matters. Presumably the kernel
command line parameter is still rdinit= and grub still uses the "initrd"
command to load an external cpio.gz.
But I bisect to find breakage like that every release so I assume the
other embedded linux developers... are mostly shipping 10+ year old
kernels that use half the memory of today's.
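For anyone who does end up carrying a config across the rename, the mapping in the quoted warning is mechanical enough to script. A minimal Python sketch, assuming only the two renames the cover letter states (CONFIG_BLK_DEV_INITRD and the CONFIG_RD_* family). The function name is hypothetical and nothing here is part of the patchset:

```python
import re


def migrate_config(text):
    """Rewrite old symbol names in .config text to the proposed new ones."""
    # CONFIG_BLK_DEV_INITRD -> CONFIG_INITRAMFS
    text = re.sub(r"\bCONFIG_BLK_DEV_INITRD\b", "CONFIG_INITRAMFS", text)
    # CONFIG_RD_GZIP -> CONFIG_INITRAMFS_DECOMPRESS_GZIP, and so on
    text = re.sub(r"\bCONFIG_RD_([A-Z0-9]+)\b",
                  r"CONFIG_INITRAMFS_DECOMPRESS_\1", text)
    return text
```

This handles both "CONFIG_FOO=y" and "# CONFIG_FOO is not set" lines, because it only rewrites the symbol name and leaves the rest of each line alone.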
> Details
> ====
> I not only removed initrd, I also removed a lot of code, which
> became dead, including a lot of code in arch/.
>
> Still I think the only two architectures I touched in non-trivial
> way are sh and 32-bit arm.
>
> Also I renamed some files, functions and variables (which became misnomers) to proper names,
> moved some code around, removed a lot of mentions of initrd
> in code and comments. Also I cleaned up some docs.
Now that lkml.iu.edu is back up (yay!) all the links in
ramfs-rootfs-initramfs.txt can theoretically be fixed just by switching
the domain name.
> For example, I renamed the following global variables:
>
> __initramfs_start
> __initramfs_size
That already said initramfs, and you renamed it.
> phys_initrd_start
> phys_initrd_size
> initrd_start
> initrd_end
Which is data delivered through grub's "initrd" command. Here's how I've
been explaining it to people for years:
1) initrd is the external blob from the bootloader's initrd= option.
2) initramfs is the extractor plumbing, _init code that gets discarded.
3) rootfs is (for some reason) the name of the mounted filesystem in
/proc/mounts (because letting it say "ramfs" or "tmpfs" like normal in
/proc/mounts would be consistent and immediately understandable, so they
couldn't have that).
(No I don't know why it's called rootfs. Having things like df not show
overmounted filesystems isn't special case logic, why...? The argument
to special case this because you can't unmount it is like saying PID 1
shouldn't have a number because it can't exit. I would happily call the
whole thing initramfs... but it's already not.)
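The third name in that list is the one a running system actually reports. A small Python sketch that pulls the entries mounted on "/" out of /proc/mounts-style text; the helper name and the sample line in the usage note are illustrative assumptions, not output captured from a real boot:

```python
def root_mounts(mounts_text):
    """Return (device, fstype) for every entry mounted on "/", in order."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, ...
        if len(fields) >= 3 and fields[1] == "/":
            found.append((fields[0], fields[2]))
    return found
```

Fed a line such as "rootfs / rootfs rw 0 0", it reports the filesystem type as "rootfs" rather than "ramfs" or "tmpfs", which is the naming oddity described above.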
> to:
>
> __builtin_initramfs_start
> __builtin_initramfs_size
> phys_external_initramfs_start
> phys_external_initramfs_size
> virt_external_initramfs_start
> virt_external_initramfs_end
Do you believe people will understand what the slightly longer names are
without looking them up?
I'm all for removing obsolete code, but a partial cleanup that still
leaves various sharp edges around isn't necessarily a net improvement.
Did you remove the NFS mount code from init/do_mounts.c? Part of the
initramfs justification back in 2005 was "you can have a tiny initramfs
set up our root filesystem so most of the init special casing can go"...
and then they added CONFIG_DEVTMPFS_MOUNT but made it ONLY apply to the
fallback root after the system has decided NOT to stay on rootfs, and
ignored my patches to at least make it consistent.
The one config symbol that really seems to bite people in this area is
BLK_DEV_INITRD because a common thing people running from initramfs want
to do is yank the block layer entirely (CONFIG_BLOCK=n) and use
initramfs instead, and needing to enable CONFIG_BLK_DEV_INITRD while
And the INSANE part is they generally want a static initrd to do it so
they're not using the external loader, but Kconfig has INITRAMFS_SOURCE
under CONFIG_BLK_DEV_INITRD and it's a mess. Renaming THAT symbol would
be good.
But then, CONFIG_BLOCK is hidden under CONFIG_EXPERT which selects
DEBUG_KERNEL (INCREASING KERNEL SIZE!!!) and thus everybody who does
this patches the kconfig plumbing to be less stupid anyway. So the
problem isn't JUST renaming the symbol...
(Oh CONFIG_EXPERT is SO STUPID. It's got a menu under it, but
CONFIG_BLOCK isn't in that menu, it's at the top of menuconfig between
loadable module support and executable file formats, just invisible
unless you go down into a menu and switch on a setting and then back out
to go find it. WHY WOULD YOU DO THAT?)
> New names precisely capture meaning of these variables.
To you. I'm not entirely sure what virt_external means. (Yes I could go
read the code. No I don't want to. I EXPECT to need context and
refreshing stuff, but having it change out from under me since the LAST
time I did that is annoying when it's "same thing, new name, because".)
It makes more sense to YOU because you changed it to smell like you.
Meanwhile 35 years of installed base expertise in other people's heads
has been discarded and developed version skew for anyone maintaining an
existing system. (That's not a "never do this", that's a "be aware
humans consistently have the wrong weightings in our heads for this".)
Personally I usually have to look it up either way. And am spending more
and more of my time poking at older kernels instead, because newer stuff
has either removed support for things I need or grown dependencies. (And
because there's 20 years of installed base still in various stages of
use, I'm personally likely to spend more time looking at the old names
than the new ones.)
> This will break all configs out there (update your configs!).
> Still I think this is okay,
Because you don't have to clean up after it.
> because config names never were part of stable API.
I can forward everyone who asks me questions to you, or just agree when
they tell me it's yet another reason not to upgrade.
> Other user-visible changes:
>
> - Removed kernel command line parameters "load_ramdisk" and
> "prompt_ramdisk", which did nothing and were deprecated
Sure.
> - Removed kernel command line parameter "ramdisk_start",
> which was used for initrd only (not for initramfs)
Some bootloaders appended that to the kernel command line to specify
where in memory they've loaded the initrd image, which could be a
cpio.gz once upon a time. No idea what regressions happened since though.
(Last new bootloader I was involved with that had to make it work used
some horrible hack editing a dtb at a fixed offset, like the old "rdev"
trick but more brittle. Because "device tree better" than human readable
textual mechanism. Fixing ramdisk_start to work right sounded like a
more sane approach to me, but...)
> I tested my patchset on many architectures in Qemu using my Rust
> program, heavily based on mkroot [1].
You rewrote a 400 line bash script in rust.
Yeah, that's a rust developer. (And it smells like you now...)
> I used the following cross-compilers:
>
> aarch64-linux-musleabi
> armv4l-linux-musleabihf
> armv5l-linux-musleabihf
> armv7l-linux-musleabihf
> i486-linux-musl
> i686-linux-musl
> mips-linux-musl
> mips64-linux-musl
> mipsel-linux-musl
> powerpc-linux-musl
> powerpc64-linux-musl
> powerpc64le-linux-musl
> riscv32-linux-musl
> riscv64-linux-musl
> s390x-linux-musl
> sh4-linux-musl
> sh4eb-linux-musl
> x86_64-linux-musl
or1k and microblaze work, they just don't pass the full smoketest for
reasons that shouldn't affect initramfs testing.
I'm still waiting for Rich to ship the next musl release to do new
toolchains...
https://www.openwall.com/lists/musl/2025/08/04/1
> Workaround
> ====
> If "retain_initrd" is passed to kernel, then initramfs/initrd,
> passed by bootloader, is retained and becomes available after boot
> as read-only magic file /sys/firmware/initrd [3].
Common use case for eg romfs is memory mapped flash or rom, so the
address range in question isn't actually ram anyway. Mostly on mmu
systems you just don't want the mapping to go away, so the kernel can
still reach out and read it.
> This is even better than classic initrd, because:
> - You can use fs not supported by classic initrd, for example erofs
Network block device was the most recent one I saw used, but it had a
tiny initramfs to set up and switch_root into it...
(Network block device != network filesystem. I have a todo item to
integrate nbd-server into mkroot/testroot.sh but "-hda works" is one of
the things it's testing...)
> - One copy is involved (from /sys/firmware/initrd to some file in /)
> as opposed to two when using classic initrd
Embedded developers have always been reaching out and using mappable
flash directly. Vitaly Wool's ELC talk in 2015 (about running Linux in
256k of sram, yes one quarter of one megabyte) described the process:
https://elinux.org/images/9/90/Linux_for_Microcontrollers-_From_Marginal_to_Mainstream.pdf
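For comparison, the single copy in the quoted workaround (from /sys/firmware/initrd to some file in /) could look roughly like this in Python. It assumes the system was booted with retain_initrd so the file exists; the function name and the destination path are hypothetical:

```python
import os
import shutil


def save_retained_image(dest, source="/sys/firmware/initrd"):
    """Copy the retained boot image to dest and return the byte count."""
    with open(source, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)   # one streaming copy, no temp file
    return os.path.getsize(dest)
```

The memory mapped flash case above avoids even this copy, since the kernel reads the image where it already sits.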
Rob