Message-ID: <CAAsfc_oTmE2E8pMctiLSwMngVUbtJa4G=KAozzAfztMMc_RMOQ@mail.gmail.com>
Date: Tue, 19 Nov 2024 12:19:38 +0800
From: liequan che <liequanche@...il.com>
To: Coly Li <colyli@...e.de>
Cc: "mingzhe.zou@...ystack.cn" <mingzhe.zou@...ystack.cn>, Kent Overstreet <kent.overstreet@...il.com>,
linux-bcache <linux-bcache@...r.kernel.org>, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] bcache:fix oops in cache_set_flush
Hi Coly,
>> The same operation was performed on three servers, replacing the 12T
>> disk with a 16T disk. Only one server triggered the bug. The on-site
>What do you mean “replacing 12T disk with a 16T disk” ?
I replaced the 12T SATA disk with a different 16T SATA disk.
The plan was to use the 16T hard disk and the original NVMe disk to recreate bcache.
>> No bcache data clearing operation was performed
>What is the “bcache data clearing operation” here?
Nothing was done at that point. My plan was to erase the superblock
after partitioning the NVMe disk, that is, to discard the original
NVMe disk data by erasing the superblock and using the --wipe-bcache
option.
>> 3. Replace the 12T SATA disk with a 16T SATA disk
>> After shutting down, unplug the 12T hard disk and replace it with a
>> 16T hard disk.
>It seems you did something bcache doesn’t support. Replace the backing device...
You are right. I may have done something that bcache does not support.
But I hope that a wrong operation will not cause the system to panic.
The consequence I can accept is that creating the bcache device fails.
The bcache module could then give me a chance to erase the superblock again,
instead of forcing me to boot into CD rescue mode to erase it.
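To illustrate the kind of behavior I am hoping for, here is a minimal
user-space sketch. It is not the kernel code and not the patch itself:
struct btree, struct cache_set and cache_set_flush_sketch() below are
simplified, hypothetical stand-ins. The only idea it shows is checking
for a missing root before touching it and reporting an error instead
of dereferencing a NULL pointer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct btree {
    int dirty;              /* 1 if the node still has to be written out */
};

struct cache_set {
    struct btree *root;     /* may be NULL if registration failed early */
};

/*
 * Sketch of a flush path that tolerates a missing root.
 * Returns 0 after a normal flush, -1 if there was no root to flush.
 */
int cache_set_flush_sketch(struct cache_set *c)
{
    if (c->root == NULL) {
        /* Fail gracefully instead of dereferencing a NULL pointer. */
        fprintf(stderr, "cache set has no btree root, skipping flush\n");
        return -1;
    }

    /* Stand-in for writing the root node out. */
    c->root->dirty = 0;
    return 0;
}
```

With a cache set whose root is NULL, the function returns -1 and the
caller can report a failed device creation; with a valid root it marks
the node clean and returns 0.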
>> 7. Repartition again, triggering kernel panic again.
>> parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
>> The same operation was performed on the other two servers, and no
>> panic was triggered.
>I guess this is another undefined operation. I assume the cache device is still referenced somewhere. A reboot should follow the wipefs.
Your guess is correct. In addition, after erasing the superblock
information in CD rescue mode,
I rebooted into the system running the kernel that had originally panicked.
>> The server with the problem was able to enter the system normally
>> after the root of the cache_set structure was determined to be empty.
>> I updated the description of the problem in the link below.
>No, if you clean up the partition, no cache device will exist. Cache registration won’t treat it as a bcache device.
>OK, from the above description, I see you replace the backing device (and I don’t know where the previous data was), then you extend the cache device size. They are all unsupported operations.
The behavior here is a bit strange. After partitioning, I may have
recreated the bcache device with the command below,
which triggered the bcache register operation. Then the kernel panicked again.
>make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache
Thanks.