[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7e75e292-3073-f1f9-287a-badf92c8e4e8@grimberg.me>
Date: Sat, 25 Sep 2021 23:34:31 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Keith Busch <kbusch@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Aditya Garg <gargaditya08@...e.com>, "axboe@...com" <axboe@...com>,
"hch@....de" <hch@....de>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
"james.smart@...adcom.com" <james.smart@...adcom.com>,
"chaitanya.kulkarni@....com" <chaitanya.kulkarni@....com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"trivial@...nel.org" <trivial@...nel.org>
Subject: Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
>>> From: Aditya Garg <gargaditya08@...e.com>
>>> Date: Fri, 24 Sep 2021 15:36:45 +0530
>>> Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
>>> Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
>>
>> I think we need to hear more about the problem than just revert a
>> commit like this randomly. That commit has already been picked up for
>> -stable,
>>
>> What are the exact symptoms, and which Apple SSD is this?
>>
>> I do find this:
>>
>> https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
>>
>> which instead of a revert has an actual patch. Can you try that one?
>>
>> Keith Busch replied to that one, saying that the Apple SSD might not
>> be spec compliant, but hey, what else is new? If we start demanding
>> that hardware comply with specs, we'd have to scrap the whole notion
>> of working in the real world. Plus it would be very hypocritical of
>> us, since we ignore all specs when we deem them too limiting (whether
>> they be language specs, POSIX OS specs, or whatever).
>
> Right, we have a lot of quirks for the apple controllers, what's one
> more? :)
>
> Could the following patch be tried? I'm basing this off the 'lspci' from
> Orlando, but I'm assuming the previous model has the same limitation,
> too.
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7efb31b87f37..f0787233557f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
> blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
> {
> struct nvme_command *cmd = nvme_req(req)->cmd;
> + struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> blk_status_t ret = BLK_STS_OK;
>
> if (!(req->rq_flags & RQF_DONTPREP)) {
> @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
> return BLK_STS_IOERR;
> }
>
> - nvme_req(req)->genctr++;
> + if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> + nvme_req(req)->genctr++;
> cmd->common.command_id = nvme_cid(req);
> trace_nvme_setup_cmd(req, cmd);
> return ret;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9871c0c9374c..b49761d30df7 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -86,6 +86,12 @@ enum nvme_quirks {
> */
> NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),
>
> + /*
> + * The controller requires the command_id value be be limited to the
> + * queue depth.
> + */
> + NVME_QUIRK_SKIP_CID_GEN = (1 << 6),
> +
> /*
> * Set MEDIUM priority on SQ creation
> */
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b82492cd7503..d9f22ed68185 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
> .driver_data = NVME_QUIRK_SINGLE_VECTOR |
> NVME_QUIRK_128_BYTES_SQES |
> - NVME_QUIRK_SHARED_TAGS },
> + NVME_QUIRK_SHARED_TAGS ,
> + NVME_QUIRK_SKIP_CID_GEN },
> + { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> + .driver_data = NVME_QUIRK_SKIP_CID_GEN },
>
> { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
> { 0, }
> --
>
Exactly the patch that I was going to propose. The only downside
is that we need to access the controller in the hot-path...
Powered by blists - more mailing lists