[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c91cb4df-6736-bc0f-9d33-ecc3e2f4f510@huawei.com>
Date: Fri, 29 Jan 2021 09:18:10 +0800
From: Chao Leng <lengchao@...wei.com>
To: Hannes Reinecke <hare@...e.de>, Daniel Wagner <dwagner@...e.de>
CC: <linux-nvme@...ts.infradead.org>, Sagi Grimberg <sagi@...mberg.me>,
<linux-kernel@...r.kernel.org>, Jens Axboe <axboe@...com>,
Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>
Subject: Re: [PATCH v2] nvme-multipath: Early exit if no path is available
On 2021/1/28 17:23, Hannes Reinecke wrote:
> On 1/28/21 10:18 AM, Chao Leng wrote:
>>
>>
>> On 2021/1/28 15:58, Daniel Wagner wrote:
>>> On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:
>>>>> --- a/drivers/nvme/host/multipath.c
>>>>> +++ b/drivers/nvme/host/multipath.c
>>>>> @@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
>>>>> }
>>>>> for (ns = nvme_next_ns(head, old);
>>>>> - ns != old;
>>>>> + ns && ns != old;
>>>> nvme_round_robin_path just be called when !"old".
>>>> nvme_next_ns should not return NULL when !"old".
>>>> It seems unnecessary to add checking "ns".
>>>
>>> The problem is when we enter nvme_round_robin_path() and there is no
>>> path available. In this case the initialization ns = nvme_next_ns(head,
>>> old) could return a NULL pointer."old" should not be NULL, so there is at least one path that is "old".
>> It is impossible to return NULL for nvme_next_ns(head, old).
>
> No. list_next_or_null_rcu()/list_first_or_null_rcu() will return NULL when then end of the list is reached.
Although list_next_or_null_rcu()/list_first_or_null_rcu() may return
NULL, but nvme_next_ns(head, old) assume that the "old" is in the "head",
so nvme_next_ns(head, old) should not return NULL. If the "old" is not
in the "head", nvme_next_ns(head, old) will run abnormal.
So there is other bug which cause nvme_next_ns(head, old).
I review the code about head->list and head->current_path, I find 2 bugs
may cause nvme_next_ns(head, old) abnormal:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengchao@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.
>
> Cheers,
>
> Hannes
Powered by blists - more mailing lists