[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18db3894-d128-7857-4c11-25b59d82ff54@huaweicloud.com>
Date: Fri, 12 May 2023 15:14:23 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Christoph Hellwig <hch@....de>, Yu Kuai <yukuai1@...weicloud.com>
Cc: ming.lei@...hat.com, axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com,
yangerkun@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next] block: fix blktrace debugfs entries leak
Hi,
在 2023/05/11 23:28, Christoph Hellwig 写道:
> On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
>> blk_unregister_queue") moves blk_trace_shutdown() from
>> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
>> is created through sysfs, however, there are some regression in corner
>> cases:
>>
>> 1) for scsi, passthrough io can still be issued after del_gendisk, and
>> blktrace debugfs entries will be removed immediately after
>> del_gendisk(), therefor passthrough io can't be tracked and blktrace
>> will complain:
>>
>> failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output error
>
> But that is the right thing. The only thing that has a name is the
> gendisk and it is gone at this point. Leaking the debugfs entries
> that are named after, and ultimatively associated with the gendisk
> (even if the code is still a bit confused about this) will create a lot
> of trouble for us.
>
>> 2) blktrace can still be enabled after del_gendisk() through ioctl if the
>> disk is opened before del_gendisk(), and if blktrace is not shutdown
>> through ioctl before closing the disk, debugfs entries will be
>> leaked.
>
> Yes.
>
>> It seems 1) is not important, while 2) needs to be fixed apparently.
>>
>> Fix this problem by shutdown blktrace in blk_free_queue(),
>> disk_release() is not used because scsi sg support blktrace without
>> gendisk, and this is safe because queue is not freed yet, and
>> blk_trace_shutdown() is reentrant.
>
> I think disk_release is the right place for "normal" blktrace. The
> odd cdev based blktrace for /dev/sg will need separate handling.
> To be honest I'm not even sure how /dev/sg based passthrough is
> even supposed to work in practice, but I'll need to spend some more
> time to familarize myself with it.
I'm not sure how to specail hanlde /dev/sg* for now, however,
If we don't care about blktrace for passthrough io after del_gendisk(),
and /dev/sg* has separate handling, I think it's better just to check
QUEUE_FLAG_REGISTERED in blk_trace_setup(), and don't enable blktrace
in the first place.
Thanks,
Kuai
Powered by blists - more mailing lists