lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210925203857.GC116968@dhcp-10-100-145-180.wdc.com>
Date:   Sat, 25 Sep 2021 13:38:57 -0700
From:   Keith Busch <kbusch@...nel.org>
To:     Sven Peter <sven@...npeter.dev>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Aditya Garg <gargaditya08@...e.com>,
        "axboe@...com" <axboe@...com>, "hch@....de" <hch@....de>,
        "sagi@...mberg.me" <sagi@...mberg.me>,
        "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "james.smart@...adcom.com" <james.smart@...adcom.com>,
        "chaitanya.kulkarni@....com" <chaitanya.kulkarni@....com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "trivial@...nel.org" <trivial@...nel.org>
Subject: Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.

On Sat, Sep 25, 2021 at 10:08:53PM +0200, Sven Peter wrote:
> I actually ran into a similar issue while adding support for the NVMe
> controller found on the M1 and assumed it was only present there.
> 
> Some background why this happens: ANS2 is a co-processor that emulates
> an NVMe MMIO interface and uses the tag as an index to an internal data
> structure. 

Thanks for confirming the behavior. The patch should restore the
command_id values to when everything was working. I'll just need to
update the quirk description to better align with the actual limitation
if the patch is successful. 

> On the M1 we can directly talk to ANS2 and while we can submit
> commands with a higher index it'll just ignore the upper bits and only
> return the lowest eight IIRC in the completion queue.
> I guess whatever software is running on the T2 actually has an assert to
> ensure that the tag is within the limits before forwarding the command
> to ANS2.

Is the PCI Device ID for the M1 the same as reported for the T2? Either
0x2005 or 0x2003 should make this quirk apply.

Side note, I mistakenly added an entry for device ID 0x2006, but that
was from me misreading the report.

> Best,
> 
> Sven
> 
> 
> On Sat, Sep 25, 2021, at 21:54, Keith Busch wrote:
> > On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
> >> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@...e.com> wrote:
> >> >
> >> > From: Aditya Garg <gargaditya08@...e.com>
> >> > Date: Fri, 24 Sep 2021 15:36:45 +0530
> >> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
> >> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
> >> 
> >> I think we need to hear more about the problem than just revert a
> >> commit like this randomly. That commit has already been picked up for
> >> -stable,
> >> 
> >> What are the exact symptoms, and which Apple SSD is this?
> >> 
> >> I do find this:
> >> 
> >>   https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
> >> 
> >> which instead of a revert has an actual patch. Can you try that one?
> >> 
> >> Keith Busch replied to that one, saying that the Apple SSD might not
> >> be spec compliant, but hey, what else is new? If we start demanding
> >> that hardware comply with specs, we'd have to scrap the whole notion
> >> of working in the real world. Plus it would be very hypocritical of
> >> us, since we ignore all specs when we deem them too limiting (whether
> >> they be language specs, POSIX OS specs, or whatever).
> >
> > Right, we have a lot of quirks for the apple controllers, what's one
> > more? :)
> >
> > Could the following patch be tried? I'm basing this off the 'lspci' from
> > Orlando, but I'm assuming the previous model has the same limitation,
> > too.
> >
> > ---
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 7efb31b87f37..f0787233557f 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
> >  blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
> >  {
> >  	struct nvme_command *cmd = nvme_req(req)->cmd;
> > +	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> >  	blk_status_t ret = BLK_STS_OK;
> > 
> >  	if (!(req->rq_flags & RQF_DONTPREP)) {
> > @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, 
> > struct request *req)
> >  		return BLK_STS_IOERR;
> >  	}
> > 
> > -	nvme_req(req)->genctr++;
> > +	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> > +		nvme_req(req)->genctr++;
> >  	cmd->common.command_id = nvme_cid(req);
> >  	trace_nvme_setup_cmd(req, cmd);
> >  	return ret;
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index 9871c0c9374c..b49761d30df7 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -86,6 +86,12 @@ enum nvme_quirks {
> >  	 */
> >  	NVME_QUIRK_NO_DEEPEST_PS		= (1 << 5),
> > 
> > +	/*
> > +	 * The controller requires the command_id value be be limited to the
> > +	 * queue depth.
> > +	 */
> > +	NVME_QUIRK_SKIP_CID_GEN			= (1 << 6),
> > +
> >  	/*
> >  	 * Set MEDIUM priority on SQ creation
> >  	 */
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index b82492cd7503..d9f22ed68185 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
> >  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
> >  		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
> >  				NVME_QUIRK_128_BYTES_SQES |
> > -				NVME_QUIRK_SHARED_TAGS },
> > +				NVME_QUIRK_SHARED_TAGS ,
> > +				NVME_QUIRK_SKIP_CID_GEN },
> > +	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> > +		.driver_data = NVME_QUIRK_SKIP_CID_GEN },
> > 
> >  	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
> >  	{ 0, }
> > --
> 
> -- 
> Sven Peter

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ