lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d6845b29-dcce-4aaf-0093-b658bf6cdf32@huawei.com>
Date:   Wed, 9 Nov 2022 15:41:07 +0800
From:   Chao Leng <lengchao@...wei.com>
To:     Sagi Grimberg <sagi@...mberg.me>, Christoph Hellwig <hch@....de>,
        Gerd Bayer <gbayer@...ux.ibm.com>
CC:     Jens Axboe <axboe@...com>,
        Niklas Schnelle <schnelle@...ux.ibm.com>,
        <linux-kernel@...r.kernel.org>, <linux-next@...r.kernel.org>,
        <linux-nvme@...ts.infradead.org>
Subject: Re: nvme-pci: NULL pointer dereference in nvme_dev_disable() on
 linux-next

The check "if (!ctrl->tagset)" is just reduce the probability.

The real reason is the race of probe and remove.
It is similar with TCP and RDMA transport.
Israel has tried to fix it.
The detail:
https://github.com/torvalds/linux/commit/ce1518139e6976cf19c133b555083354fdb629b8
Unfortunately, this patch was reverted.

If it is in the process of "probe", remove should not be called.
Maybe we can move pci_set_drvdata to the end of nvme_probe.
Of course, the removal may not take effect if it is in the process of "probe".
This is why the patch of Israel is reverted.

Perhaps the better option would be that "remove" wait for the "probe" to complete,
and then do the real remove.
This requires additional mechanism to implement this.

On 2022/11/9 10:54, Sagi Grimberg wrote:
> 
>> Below is the minimal fix.  I'll see if I sort out the mess that is
>> probe/reset failure vs ->remove a bit better, though.
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index f94b05c585cbc..577bacdcfee08 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -5160,6 +5160,8 @@ EXPORT_SYMBOL_GPL(nvme_start_freeze);
>>   void nvme_stop_queues(struct nvme_ctrl *ctrl)
>>   {
>> +    if (!ctrl->tagset)
>> +        return;
>>       if (!test_and_set_bit(NVME_CTRL_STOPPED, &ctrl->flags))
>>           blk_mq_quiesce_tagset(ctrl->tagset);
>>       else
>> @@ -5169,6 +5171,8 @@ EXPORT_SYMBOL_GPL(nvme_stop_queues);
>>   void nvme_start_queues(struct nvme_ctrl *ctrl)
>>   {
>> +    if (!ctrl->tagset)
>> +        return;
>>       if (test_and_clear_bit(NVME_CTRL_STOPPED, &ctrl->flags))
>>           blk_mq_unquiesce_tagset(ctrl->tagset);
>>   }
> 
> Can we do that in the pci driver and not here?
> 
> .

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ