[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <7558AC97-FFFC-4593-B946-9F8FBD8D62FA@mac.com>
Date: Thu, 20 Jun 2024 22:27:55 -0600
From: Gagan Sidhu <broly@....com>
To: Zhihao Cheng <chengzhihao1@...wei.com>
Cc: Daniel Golle <daniel@...rotopia.org>,
Richard Weinberger <richard@....at>,
ZhaoLong Wang <wangzhaolong1@...wei.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-mtd <linux-mtd@...ts.infradead.org>,
Miquel Raynal <miquel.raynal@...tlin.com>,
Vignesh Raghavendra <vigneshr@...com>,
yangerkun <yangerkun@...wei.com>,
yi zhang <yi.zhang@...wei.com>
Subject: Re: [PATCH v2] ubi: gluebi: Fix NULL pointer dereference caused by
ftl notifier
> On Jun 20, 2024, at 9:03 PM, Zhihao Cheng <chengzhihao1@...wei.com> wrote:
>
> 在 2024/6/21 10:09, Gagan Sidhu 写道:
>> Thanks,
>> Gagan
>>> On Jun 20, 2024, at 7:59 PM, Zhihao Cheng <chengzhihao1@...wei.com> wrote:
>>>
>>> 在 2024/6/21 6:06, Gagan Sidhu 写道:
>>>> hi zhihao,
>>>> so i assume my crude paraphrase is correct? that i may have unintentionally pointed the finger at you, but the real issue is GLUEBI existing with BLOCK on the same volume?
>>>
>>> Uhhh, I don't think I mean this. We will make it clear after getting the layers' information about your device.
>>> Everything goes well from you guys talking, this patch did reject the mtdblock loading from a gluebi device, which may lead booting failed if your rootfs depends on a mtdblock(which is generated from the gluebi device).
>>> From your description 'spoke to a user, gave him a build without MTD_GLUEBI, restoring changes made by (HAHAHA you are! huawei), it booted fine'. One thing I'm curious about, if the device boots from the mtdblock(which is generated from gluebi device), the gluebi device won't be generated because you have turned off the CONFIG_MTD_UBI_GLUEBI, then the device boots successfully, which means that your rootfs is most likely loaded from ubiblock0_0. My questions are:
>>> Q1. According to previous talking, the booting configuration is
>>> https://github.com/torvalds/linux/blob/master/drivers/mtd/mtdcore.c#L774, why the device could boot from ubiblock0_0?(it looks like that the device uses config from https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/generic/pending-5.15/493-ubi-set-ROOT_DEV-to-ubiblock-rootfs-if-unset.patch;h=266a6331c2acc0f7c17d9ac72f54659d31b56249;hb=HEAD [pointed by Richard]). I'm unfamilar with openwrt(Maybe you and Daniel know much), is that possible the device could automatically choose boot device in the openwrt framework?
>> yes, that patch will, effectively, allow an “autoselect” of the ubi partition because it looks for the string “rootfs” in the ubi-formatted file.
>> once it finds the “rootfs” partition, it will rename it to “ubi” and that will be set as the boot/root partition
>
> Oh, maybe I know what has happened. According to the configuration of https://github.com/torvalds/linux/blob/master/drivers/mtd/mtdcore.c#L774, openwrt decides to boot rootfs from mtdblock(which is generated from the gluebi device), because the mtd char device (which is genertated from gluebi device) is found by openwrt. However, this patch stops generating mtdblock from gluebi device, so the mounting failed by missed mtdblock. After disabling the CONFIG_MTD_UBI_GLUEBI, the gluebi device is not generated, so openwrt decides to boot rootfs from ubiblock0_0, then your device booted successfully.
> The key is the rootfs device judgement in openwrt, openwrt chooses the rootfs device according to the existence of mtd char device, openwrt not check whether the corresponding mtdblock exists before mouting it. Should openwrt check the existence of mtdblock beforing using it? Or maybe openwrt could turn to use ubiblock if the mtdblock device is not found?
as i understand it, the openwrt patch requires the mtdblock device to exist before finding it. it does not rely on gluebi to generate anything.
the mtd char device is not generated from gluebi. openwrt does not use gluebi at all. i was just being paranoid and had too many options enabled.
-however, my paranoia has illuminated an issue with regards to GLUEBI and UBI_BLOCK’s coexistence.
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/generic/pending-5.15/400-mtd-mtdsplit-support.patch;h=46ef15d127dfb686e4458fd5838c3eaec8aa2cd7;hb=HEAD
openwrt relies on device tree, in this case with attribute “fixed-partition” and additional parameter
openwrt’s split_rootfs_dev merely requires a partition with the label rootfs to be used, and it will automatically create the rootfs (if the splitting criteria) are satisfied.
split_rootfs_dev requires, as a prerequisite, that there is a partition on the flash that can be “split”. after that, it will rely on ubi to create rootfs for boot.
-i’m a little lazy right now on explaining the details, but trust me it doesn’t need gluebi to create the block device.
>>> Q2. Take a step back, if the gluebi device and ubiblock are generated from the same UBI volume, there will be error message like 'cannot open device %d, volume %d, error -16' if we open gluebi device and ubiblock at the same time, but I cannot find the error message from the log.
>> it’s in the very first email about this topic. i posted a full bootlog. look there, it will show you that’s what happened, but i think it said error -6 not -16
>> [ 6.282125] VFS: Cannot open root device "(null)" or unknown-block(31,12): error -6
>>> Q3. If the Q1's answer is 'yes', which means that your device could automatically choose to boot rootfs from ubiblock0_0 when it cannot dectect a mtdblock(which is generated from the gluebi device), why it can fail booting before reverting this patch? I mean, the mtdblock won't be generated before reverting this patch, the device should automatically choose to boot rootfs from ubiblock0_0, but it does not, and the error message in
>> because it seems to have to do with gluebi and ubi_block doing something to the same partition.
>> config_mtd_ubi_block is not dependent on config_mtd_ubi_gluebi. i was confused about this too. i thought since both are about block devices, there is a dependency, but there isn’t.
>> effectively config_mtd_ubi_block can work without config_mtd_ubi_gluebi, but it’s only for read-only filesystems (like squashfs, which is what i’m using)
>>> Q2 is not shown in your log.
>> first email. it’s there.
>>>
>>> So, we will answer above questions if we have the layers' information about your device('mtdinfo -a' and 'ubinfo -a' after booting the device). There are two situations:
>>> 1. the layers' information after reverting this patch
>>> 2. the layers' information after disable CONFIG_MTD_UBI_GLUEBI
>>>
>> i personally think there’s some kind of interplay between CONFIG_MTD_UBI_GLUEBI and CONFIG_MTD_UBI_BLOCK. it seemed to me that gluebi re-labeled the UBI_BLOCK partition, or added the MTD_UBIVOLUME flag when it did not need to.
>> i think your question is more applicable outside of the read-only file systems situation. in my specific case, CONFIG_MTD_UBI_BLOCK suffices because of the niche filesystem i am using (squashfs, read only). i think if it was something else (like jffs2, or i am writing to this partition), then your question becomes more relevant.
>> i suspect you’ll do some thorough evaluation and provide us some much-appreciated insight, because i cannot how CONFIG_MTD_UBI_BLOCK enabled and CONFIG_MTD_UBI_GLUEBI disabled worked.
>>>> i was just joking about you being a spy by the way. it was post-fix euphoria ;)
>>>
>>> It's nothing, I knew your are not serious.
>>>> master rich, shepherd weinberger, what say ye?
>>>> your wisdom and insight would be greatly appreciated as closure and maybe even a patch to reflect the beauty of collaborating over current ;)
>>>> Thanks,
>>>> Gagan
>>>>> On Jun 17, 2024, at 10:03 PM, Zhihao Cheng <chengzhihao1@...wei.com> wrote:
>>>>>
>>>>> 在 2024/6/18 6:13, Gagan Sidhu 写道:
>>>>> Hi,
>>>>>> spoke to a user, gave him a build without MTD_GLUEBI, restoring changes made by (HAHAHA you are! huawei), it booted fine.
>>>>>
>>>>> May I have the layers' information about mtd/ubi, you can get it by 'mtdinfo -a' and 'ubinfo -a' after booting the device.
>>>>> I guess your device boots from ubiblock0_0. There are two ways loading booting device:
>>>>> 1. mtd(nand)
>>>>> ubi(holds volume ubi0_0)
>>>>> mtd12 (gluebi)
>>>>> mtdblock12 (This way is cut by this patch, so mtdblock12 is not generated, just like Daniel&Richard analyzed)
>>>>> 2. mtd(nand)
>>>>> ubi(holds volume ubi0_0)
>>>>> ubiblock0_0
>>>>>
>>>>>> so we need to think about either deprecating GLUEBI or setting an option in the Kconfig that ensures they are mutually exclusive.
>>>>>> gluebi is definitely highjacking the block device created by UBI_BLOCK and adding the MTD_UBIVOLUME flag to it.
>>>>>
>>>>> The gluebi(mtd) and ubiblock could exist on the same UBI volume at the same time, but they cannot be opened at the same time. Here is an example I tested on the local machine:
>>>>>
>>>>> ↗ ubiblock0_0
>>>>> mtd0(nandsim) -> ubi0 (holds volume ubi0_0)
>>>>> ↘ gluebi(mtd1)
>>>>>
>>>>> [root@...alhost ~]# ubinfo -a
>>>>> UBI version: 1
>>>>> Count of UBI devices: 1
>>>>> UBI control device major/minor: 10:61
>>>>> Present UBI devices: ubi0
>>>>>
>>>>> ubi0
>>>>> Volumes count: 1
>>>>> Logical eraseblock size: 126976 bytes, 124.0 KiB
>>>>> Total amount of logical eraseblocks: 8192 (1040187392 bytes, 992.0 MiB)
>>>>> Amount of available logical eraseblocks: 0 (0 bytes)
>>>>> Maximum count of volumes 128
>>>>> Count of bad physical eraseblocks: 0
>>>>> Count of reserved physical eraseblocks: 160
>>>>> Current maximum erase counter value: 2
>>>>> Minimum input/output unit size: 2048 bytes
>>>>> Character device major/minor: 246:0
>>>>> Present volumes: 0
>>>>>
>>>>> Volume ID: 0 (on ubi0)
>>>>> Type: dynamic
>>>>> Alignment: 1
>>>>> Size: 8026 LEBs (1019109376 bytes, 971.8 MiB)
>>>>> State: OK
>>>>> Name: vol_a
>>>>> Character device major/minor: 246:1
>>>>> [root@...alhost ~]# mtdinfo -a
>>>>> Count of MTD devices: 2
>>>>> Present MTD devices: mtd0, mtd1
>>>>> Sysfs interface supported: yes
>>>>>
>>>>> mtd0
>>>>> Name: NAND simulator partition 0
>>>>> Type: nand
>>>>> Eraseblock size: 131072 bytes, 128.0 KiB
>>>>> Amount of eraseblocks: 8192 (1073741824 bytes, 1024.0 MiB)
>>>>> Minimum input/output unit size: 2048 bytes
>>>>> Sub-page size: 512 bytes
>>>>> OOB size: 64 bytes
>>>>> Character device major/minor: 90:0
>>>>> Bad blocks are allowed: true
>>>>> Device is writable: true
>>>>>
>>>>> mtd1
>>>>> Name: vol_a
>>>>> Type: ubi
>>>>> Eraseblock size: 126976 bytes, 124.0 KiB
>>>>> Amount of eraseblocks: 8026 (1019109376 bytes, 971.8 MiB)
>>>>> Minimum input/output unit size: 2048 bytes
>>>>> Sub-page size: 2048 bytes
>>>>> Character device major/minor: 90:2
>>>>> Bad blocks are allowed: false
>>>>> Device is writable: true
>>>>>
>>>>> [root@...alhost ~]# lsblk | grep ubi
>>>>> ubiblock0_0 251:0 0 971.9M 0 disk
>>>>>
>>>>>> there is no other explanation.
>>>>>> looks like this was an absolutely amazing exchange that even furthered our understanding of wtf is going on.
>>>>>> thanks for being a great moderator for MTD rich
>>>>>> Thanks,
>>>>>> Gagan
>>>>>
>>>> .
>>>
>> .
>
Powered by blists - more mailing lists