Message-ID: <Zqla4Tw7YSi1pv7h@makrotopia.org>
Date: Tue, 30 Jul 2024 22:28:01 +0100
From: Daniel Golle <daniel@...rotopia.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
Conor Dooley <conor+dt@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christian Brauner <brauner@...nel.org>,
Al Viro <viro@...iv.linux.org.uk>,
Li Lingfeng <lilingfeng3@...wei.com>,
Ming Lei <ming.lei@...hat.com>,
Christian Heusel <christian@...sel.eu>,
Rafał Miłecki <rafal@...ecki.pl>,
Felix Fietkau <nbd@....name>, John Crispin <john@...ozen.org>,
Chad Monroe <chad.monroe@...ran.com>,
Yangyu Chen <cyy@...self.name>,
Tianling Shen <cnsztl@...ortalwrt.org>,
Chuanhong Guo <gch981213@...il.com>,
Chen Minqiang <ptpt52@...il.com>, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: [PATCH v5 3/4] block: add support for notifications
On Tue, Jul 30, 2024 at 12:36:59PM -0700, Christoph Hellwig wrote:
> Same NAK as last time. Random modules should not be able to hook
> directly into block device / partition probing.
Would using delayed_work be indirect enough for your taste?
If so, that would of course be rather easy to implement.
>
> What you want to do can be done trivially in userspace in initramfs,
> please do that as recommended multiple times before.
>
While the average desktop or server **general purpose** Linux
distribution uses an initramfs, often generated dynamically on the
target system during installation or kernel updates, this is NOT how
things work in the embedded Linux world, and for OpenWrt
specifically.
For the OpenWrt community, the great thing is that the Linux Kernel, and
even an identical userland can run on embedded devices with as little as
8 megabytes of NOR flash as well as on much more resourceful systems
with a large eMMC or even NVMe disks, but almost always just exactly one
single non-volatile storage device. All of those devices come without
complex boot firmware, so no ACPI, no UEFI, ... just U-Boot and a DT
blob which gets glued to the kernel in one way or another. And it would
of course be nice if they would all wake up with correct MAC addresses
and working WiFi, even if they come with larger (typically
block-oriented) storage. In terms of hardware such boards are often just
two or three IC packages: SoC (sometimes including RAM) and some sort
of non-volatile memory big enough to store a Linux-based firmware,
factory data (MAC addresses, WiFi calibration, serial number) and
user settings.
The same Linux Kernel source tree is also used to build kernels running
on countless large servers (and a comparatively small number of desktop
systems) with complex (proprietary) boot firmware and typically a
handful of flashes and EEPROMs on the motherboard alone. On such systems,
Ethernet NICs are dedicated chips or even PCIe cards, sometimes with
their own dedicated EEPROMs storing their MAC addresses. Or they are
virtual machines, where the host takes care of all of that.
Coexistence of all those different scales, without forcing the ways of
large systems onto the small ones (and vice versa) has been a huge
strength in my opinion.
When it comes to the small (sub $100, often much less) boards for
plastic-case network appliances such as routers and access points, often
the exact same board can be bought either with on-board SPI-NAND
(used with UBI) or an eMMC. Of course, the vendors keep things as
similar as possible, so the layout used for the NVMEM bits is often
identical, just that in one case those (typically less than a memory
page full of) bits are stored on an MTD partition or directly inside a
UBI volume, and in the other case they are stored either at a fixed
offset on the mmcblk0boot[01] device or inside a GPT partition. This is
simply what reality already looks like for this class of devices today.
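To make the "identical layout, different backing storage" point concrete,
here is a minimal sketch and not code from this series. The function name
read_mac, the offset of 4 and the temporary file standing in for an MTD
partition, UBI volume or mmcblk0boot0 node are all assumptions made up for
illustration; real boards use vendor-specific offsets.

```python
import os
import tempfile

MAC_LEN = 6  # an Ethernet MAC address is 6 bytes long


def read_mac(path, offset):
    """Read a MAC address stored at a fixed byte offset of `path`.

    `path` may be any readable file or device node; the parsing is the
    same no matter which kind of storage backs it.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(MAC_LEN)
    if len(raw) != MAC_LEN:
        raise ValueError("short read: factory data is truncated")
    return ":".join(f"{b:02x}" for b in raw)


def demo():
    # Hypothetical factory blob: 4 header bytes, the MAC, then padding.
    blob = bytes(4) + bytes.fromhex("021122334455") + bytes(6)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(blob)
        path = tmp.name
    try:
        print(read_mac(path, 4))  # prints 02:11:22:33:44:55
    finally:
        os.unlink(path)


if __name__ == "__main__":
    demo()
```

Only the path passed to such a reader differs between the SPI-NAND and
the eMMC variant of a board; the offset and the format stay the same.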
In previous iterations of the series I've provided multiple examples of
mainstream device vendors (Adtran, ASUS, GL.iNet, ...) to illustrate
that.
Hence I fail to understand why different rules should apply for block
devices than for EEPROMs, e-fuses, raw or SPI-connected NOR or NAND
flashes, or UBI. Especially as this is about something completely
optional, and disabled by default.
Effectively, if an interface to reference and access block-oriented
storage devices as NVMEM providers in the same way as MTD, UBI, ... is
rejected by the Linux kernel, it just means we will have to carry that
as a downstream patch in OpenWrt in order to support those devices in a
decent way. Generating a device-specific initramfs for each and every
device would not be decent imho. Carrying information about all devices
in the filesystem used on every device is also not decent. Our goal is
exactly to get rid of the board-specific switch-case Shell script
madness in userspace instead of having more of it...
Traversing DT in userspace (via /sys/firmware/) would of course be
possible, but it's often simply too late (i.e. after rootfs has been
mounted, and that includes initramfs) for many use-cases (e.g. nfsroot),
and it would be a redundant implementation of things which are already
implemented in the kernel. We don't like to repeat ourselves, nor do we
like to deal with board-specific details in userland.
Having a complex do-it-all initramfs like on the common x86-centric
desktop or server distribution is also not an option: it would never fit
into the storage of lower-end devices with only a few megabytes of NOR
flash. You'd need two copies of libc and busybox (one in initramfs and
one in the actual rootfs), and even the extreme case of a single static
ELF binary used as initrd would still occupy hundreds of kilobytes of
storage, and be a hell to maintain. If that sounds like very little to
you, that means you haven't been dealing with that class of devices.
Thank you for your consideration
Daniel