Message-ID: <2a9ab583-07fc-4147-949e-7c68feda82f2@linux.alibaba.com>
Date: Fri, 31 Oct 2025 18:12:05 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>,
Jan Kara <jack@...e.cz>, Christian Brauner <brauner@...nel.org>,
"Michael S. Tsirkin" <mst@...hat.com>, Jason Wang <jasowang@...hat.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Luis Chamberlain <mcgrof@...nel.org>, linux-block@...r.kernel.org,
Joseph Qi <joseph.qi@...ux.alibaba.com>, guanghuifeng@...ux.alibaba.com,
zongyong.wzy@...baba-inc.com, zyfjeff@...ux.alibaba.com,
"Rafael J. Wysocki" <rafael@...nel.org>, Danilo Krummrich <dakr@...nel.org>,
linux-kernel@...r.kernel.org
Subject: Re: question about bd_inode hashing against device_add() // Re:
[PATCH 03/11] block: call bdev_add later in device_add_disk
Hi Greg,
On 2025/10/31 17:58, Greg Kroah-Hartman wrote:
> On Fri, Oct 31, 2025 at 05:54:10PM +0800, Gao Xiang wrote:
>>
>>
>> On 2025/10/31 17:45, Christoph Hellwig wrote:
>>> On Fri, Oct 31, 2025 at 05:36:45PM +0800, Gao Xiang wrote:
>>>> Right, sorry yes, disk_uevent(KOBJ_ADD) is in the end.
>>>>
>>>>> Do you see that earlier, or do you have
>>>>> code busy polling for a node?
>>>>
>>>> Personally I think it will break many userspace programs
>>>> (although I also don't think it's a correct expectation.)
>>>
>>> We've had this behavior for a few years, and this is the first report
>>> I've seen.
>>>
>>>> After rechecking internally, the userspace program logic is:
>>>> - stat /dev/vdX;
>>>> - if it exists, mount directly;
>>>> - if it does not exist, listen for the disk add uevent instead.
>>>>
>>>> Previously, for devtmpfs blkdev files, this stat/mount
>>>> assumption was always valid.
>>>
>>> That assumption doesn't seem wrong.
>>
>> ;-) I had thought that UNIX mknod doesn't imply the device is
>> ready or valid in any case (dev files in devtmpfs might be
>> an exception, but I didn't find any formal wording)...
>> so uevent is clearly the right way, but..
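
For illustration only, the uevent side of the stat-then-wait logic
discussed in this thread could be sketched as below. It assumes the
raw kernel uevent layout seen on a NETLINK_KOBJECT_UEVENT socket (an
"action@devpath" header followed by NUL-separated KEY=VALUE fields,
with DEVNAME relative to /dev). The helper names uevent_get() and
uevent_is_disk_add() are hypothetical and are not taken from the
reporter's program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a buffer of NUL-separated "KEY=VALUE" fields.
 * Returns a pointer to the NUL-terminated value, or NULL if the key
 * is absent or the buffer ends in an unterminated field.
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
    size_t klen, off = 0;

    assert(buf && key);
    klen = strlen(key);

    while (off < len) {
        const char *field = buf + off;
        const char *end = memchr(field, '\0', len - off);
        size_t flen;

        if (!end)
            return NULL;    /* trailing field is not terminated */
        flen = (size_t)(end - field);

        if (flen > klen && field[klen] == '=' &&
            memcmp(field, key, klen) == 0)
            return field + klen + 1;

        off += flen + 1;    /* skip the field and its terminator */
    }
    return NULL;
}

/* True if the message is an "add" event for whole disk `devname`. */
bool uevent_is_disk_add(const char *buf, size_t len, const char *devname)
{
    const char *action = uevent_get(buf, len, "ACTION");
    const char *subsys = uevent_get(buf, len, "SUBSYSTEM");
    const char *devtype = uevent_get(buf, len, "DEVTYPE");
    const char *name = uevent_get(buf, len, "DEVNAME");

    return action && subsys && devtype && name &&
           strcmp(action, "add") == 0 &&
           strcmp(subsys, "block") == 0 &&
           strcmp(devtype, "disk") == 0 &&
           strcmp(name, devname) == 0;
}
```

A caller would pass each received netlink payload together with the
bare device name (for example "vdb", not "/dev/vdb") and attempt the
mount only after this returns true.
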
>
> Yes, anyone can do a mknod and attempt to open a device that isn't
> present.
>
> when devtmpfs creates the device node, it should be there. Unless it
> gets removed, and then added back, so you could race with userspace, but
> that's not normal.
>
>>> But why does the device node
>>> get created earlier? My assumption was that it would only be
>>> created by the KOBJ_ADD uevent. Adding the device model maintainers
>>> as my little dig through the core drivers/base/ code doesn't find
>>> anything to the contrary, but maybe I don't fully understand it.
>>
>> AFAIK, device_add() is what triggers devtmpfs file
>> creation, and this can be observed by frequently
>> hotplugging a device in the VM and mounting it. I
>> currently don't have a time slot to build an easy
>> reproducer, but I think it's a real issue anyway.
>
> As I say above, that's not normal, and you have to be root to do this,
Speaking as an ordinary reporter: I can report the
original symptom now because we are hitting it, but
everyone has their own internal work to attend to, and
some reporters may have limited kernel expertise. In any
case, there should be no expectation that a reporter is
rushed into building a clean reproducer.

Nevertheless, I will spend time on a reproducer; I think
adding an artificial delay just after device_add()
should be enough. I will try anyway, but no rush.
> so I don't understand what you are trying to prevent happening? What is
The original report was
https://lore.kernel.org/r/43375218-2a80-4a7a-b8bb-465f6419b595@linux.alibaba.com/
> the bug and why is it just showing up now (i.e. what changed to cause
> it?)
I don't know. I think it is just because 6.6 is a
relatively new kernel, and most userspace programs have
retry logic that covers this up.
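
As a rough sketch of what such retry logic could look like, the
function below retries open() a bounded number of times. The name
open_blkdev_retry() is hypothetical, and treating ENOENT and ENXIO
as the transient "node not usable yet" errors is an assumption; the
thread does not say which error the failing programs actually see.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/*
 * Try to open `path` up to `attempts` times, sleeping `delay_ms`
 * between tries. Returns an open fd, or -1 with errno set by the
 * last failing open().
 */
int open_blkdev_retry(const char *path, int attempts, long delay_ms)
{
    struct timespec delay;

    assert(path);
    if (attempts <= 0 || delay_ms < 0) {
        errno = EINVAL;
        return -1;
    }
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (int i = 0; i < attempts; i++) {
        int fd = open(path, O_RDONLY | O_CLOEXEC);

        if (fd >= 0)
            return fd;
        /* Assumed transient errors; anything else fails at once. */
        if (errno != ENOENT && errno != ENXIO)
            return -1;
        /* Do not sleep after the final try, so errno is preserved. */
        if (i + 1 < attempts)
            nanosleep(&delay, NULL);
    }
    return -1;
}
```

Retrying like this only hides the window between node creation and
the device becoming openable; it does not replace waiting for the
uevent.
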
Thanks,
Gao Xiang
>
> thanks,
>
> greg k-h