[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e26d37bc-0f09-426a-ef25-57bdbd716ae9@huaweicloud.com>
Date: Tue, 30 May 2023 10:07:54 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Yu Kuai <yukuai1@...weicloud.com>, Christoph Hellwig <hch@....de>
Cc: ming.lei@...hat.com, axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com,
yangerkun@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next] block: fix blktrace debugfs entries leak
Hi, Christoph
在 2023/05/12 15:14, Yu Kuai 写道:
> Hi,
>
> 在 2023/05/11 23:28, Christoph Hellwig 写道:
>> On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
>>> From: Yu Kuai <yukuai3@...wei.com>
>>>
>>> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
>>> blk_unregister_queue") moves blk_trace_shutdown() from
>>> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
>>> is created through sysfs, however, there are some regression in corner
>>> cases:
>>>
>>> 1) for scsi, passthrough io can still be issued after del_gendisk, and
>>> blktrace debugfs entries will be removed immediately after
>>> del_gendisk(), therefor passthrough io can't be tracked and blktrace
>>> will complain:
>>>
>>> failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output
>>> error
>>
>> But that is the right thing. The only thing that has a name is the
>> gendisk and it is gone at this point. Leaking the debugfs entries
>> that are named after, and ultimatively associated with the gendisk
>> (even if the code is still a bit confused about this) will create a lot
>> of trouble for us.
>>
>>> 2) blktrace can still be enabled after del_gendisk() through ioctl if
>>> the
>>> disk is opened before del_gendisk(), and if blktrace is not shutdown
>>> through ioctl before closing the disk, debugfs entries will be
>>> leaked.
>>
>> Yes.
>>
>>> It seems 1) is not important, while 2) needs to be fixed apparently.
>>>
>>> Fix this problem by shutdown blktrace in blk_free_queue(),
>>> disk_release() is not used because scsi sg support blktrace without
>>> gendisk, and this is safe because queue is not freed yet, and
>>> blk_trace_shutdown() is reentrant.
>>
>> I think disk_release is the right place for "normal" blktrace. The
>> odd cdev based blktrace for /dev/sg will need separate handling.
>> To be honest I'm not even sure how /dev/sg based passthrough is
>> even supposed to work in practice, but I'll need to spend some more
>> time to familarize myself with it.
>
> I'm not sure how to specail hanlde /dev/sg* for now, however,
> If we don't care about blktrace for passthrough io after del_gendisk(),
> and /dev/sg* has separate handling, I think it's better just to check
> QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
> in the first place.
Any suggestions about this problem? Should we use separate handling for
/dev/sd? Or just free blktrace in blk_free_queue().
>
> Thanks,
> Kuai
>
> .
>
Powered by blists - more mailing lists