Message-ID: <87b925cb-6393-ca9b-6549-b3ba85ad54fc@redhat.com>
Date: Tue, 11 Jul 2023 18:07:46 -0400
From: John Meneghini <jmeneghi@...hat.com>
To: Keith Busch <kbusch@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christoph Hellwig <hch@....de>,
Linux regressions mailing list <regressions@...ts.linux.dev>,
Pankaj Raghav <p.raghav@...sung.com>,
Bagas Sanjaya <bagasdotme@...il.com>,
Jens Axboe <axboe@...nel.dk>, Sagi Grimberg <sagi@...mberg.me>,
"Clemens S." <cspringsguth@...il.com>,
Martin Belanger <martin.belanger@...l.com>,
Chaitanya Kulkarni <kch@...dia.com>,
Hannes Reinecke <hare@...e.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux NVMe <linux-nvme@...ts.infradead.org>,
Kanchan Joshi <joshi.k@...sung.com>,
Javier Gonzalez <javier.gonz@...sung.com>,
박진환 <jh.i.park@...sung.com>
Subject: Re: Fwd: Need NVME QUIRK BOGUS for SAMSUNG MZ1WV480HCGL-000MV
(Samsung SM-953 Datacenter SSD)
Yes, this is what I thought. This is all the result of the duplicate NID check added to deal with TP4034 Dispersed Namespaces.
One suggestion I have would be to limit this check to nvme-of subsystems only. These are the only devices I am aware of out
there which support TP4034. Moreover, all nvme-of devices report a valid NID. It's required with NVMe Over Fabrics. The PCIe
devices, I expect, don't care. You don't really need a valid NID with a private namespace - which is what most PCIe devices are.
I'll wager that if you change nvme_global_check_duplicate_ids() to check only nvme-of subsystems, and simply continue with PCIe
subsystems, 90% of these nvme quirks can be removed.
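To make the suggestion concrete, here is a rough userspace sketch of the idea, not the actual kernel code. The types `struct subsystem`, `struct ns_ids` and `enum transport`, and the function `global_check_duplicate_ids()`, are simplified hypothetical stand-ins; the real nvme_global_check_duplicate_ids() takes different arguments and walks the driver's own subsystem list. It shows one reading of "check only nvme-of subsystems, and simply continue with PCIe subsystems": a PCIe namespace is never rejected, and PCIe subsystems are skipped when looking for a clash.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
enum transport { TRANSPORT_PCIE, TRANSPORT_FABRICS };

struct ns_ids {
    unsigned char nguid[16];
    unsigned char eui64[8];
};

struct subsystem {
    enum transport transport;
    const struct ns_ids *ids;   /* IDs of namespaces already registered */
    size_t nr_ids;
};

static bool ids_equal(const struct ns_ids *a, const struct ns_ids *b)
{
    return memcmp(a->nguid, b->nguid, sizeof(a->nguid)) == 0 &&
           memcmp(a->eui64, b->eui64, sizeof(a->eui64)) == 0;
}

/*
 * Return -EINVAL if `ids` is already used by a namespace in another
 * fabrics subsystem, 0 otherwise. PCIe subsystems are never checked.
 */
static int global_check_duplicate_ids(const struct subsystem *self,
                                      const struct ns_ids *ids,
                                      const struct subsystem *all,
                                      size_t nr_subsys)
{
    /* Assumption: private PCIe namespaces do not need a unique NID. */
    if (self->transport != TRANSPORT_FABRICS)
        return 0;

    for (size_t i = 0; i < nr_subsys; i++) {
        const struct subsystem *s = &all[i];

        /* Skip ourselves and simply continue past PCIe subsystems. */
        if (s == self || s->transport != TRANSPORT_FABRICS)
            continue;

        for (size_t j = 0; j < s->nr_ids; j++)
            if (ids_equal(&s->ids[j], ids))
                return -EINVAL;
    }
    return 0;
}
```

With this shape, a PCIe device that reports a bogus or duplicate NID would still come up, while a duplicate seen across two nvme-of subsystems would still be refused.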
John Meneghini
Senior Principal Platform Storage Engineer
RHEL SST - Platform Storage Group
jmeneghi@...hat.com
On 7/11/23 13:21, Keith Busch wrote:
> On Tue, Jul 11, 2023 at 09:47:00AM -0700, Linus Torvalds wrote:
>> On Tue, 11 Jul 2023 at 05:06, Christoph Hellwig <hch@....de> wrote:
>> For example, we have this completely unacceptable garbage:
>>
>>     ret = nvme_global_check_duplicate_ids(ctrl->subsys, &info->ids);
>>     if (ret) {
>>             dev_err(ctrl->device,
>>                     "globally duplicate IDs for nsid %d\n", info->nsid);
>>             nvme_print_device_info(ctrl);
>>             return ret;
>>     }
>>
>> iow, the code even checks for and *notices* that there are duplicate
>> IDs, and what does it do? It then errors out.
>
> This check came from a recent half-baked spec feature called "Dispersed
> Namespaces" that caused breakage and data corruption when used in Linux.
> Rather than attempt to support that mostly vendor specific feature, the
> driver attempted to fence that off as unmaintainable. This check wasn't
> aimed at enforcing "correctness", but it certainly found a lot of that
> as collateral damage. Let's see if we can find a better way to detect
> the difference with a sane fallback as you suggest.
>