Message-ID: <20250306000348.GA1233@lst.de>
Date: Thu, 6 Mar 2025 01:03:48 +0100
From: Christoph Hellwig <hch@....de>
To: Keith Busch <kbusch@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Hannes Reinecke <hare@...e.de>,
Sagi Grimberg <sagi@...mberg.me>,
Nilay Shroff <nilay@...ux.ibm.com>,
John Meneghini <jmeneghi@...hat.com>, bmarzins@...hat.com,
Bryan Gurney <bgurney@...hat.com>, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org, Marco Patalano <mpatalan@...hat.com>,
axboe@...nel.dk
Subject: Re: [PATCH] nvme: remove multipath module parameter
On Wed, Mar 05, 2025 at 04:57:44PM -0700, Keith Busch wrote:
> > > Obviously he's not talking about multiported PCIe.
> >
> > Why is that obvious?
>
> No one here would think a multiported device *wouldn't* report CMIC.
I hope so.
> The
> fact Hannes thinks that's a questionable feature for his device gives
> away that it is single ported.
Well, his quote reads like he doesn't know about multiport PCIe devices.
But maybe he just meant to say "despite being single-ported".
> > At least based on the stated words he talks about
> > PCIe and not about multi-port. The only not multiported devices I've
> > seen that report NMIC and CMIC are a specific firmware so that the
> > customer would get multipath behavior, which is a great workaround for
> > unstable heavily switched fabrics. Note that multiported isn't always
> > obvious as there are quite a few hacks using lane splitting around that
> > a normal host can't really see.
>
> In my experience, it's left enabled because of SRIOV, which many of
> these devices end up shipping without supporting in PCI space anyway.
If a device supports SR-IOV, setting CMIC and NMIC is correct, but I've
seen surprisingly few production controllers actually supporting
SR-IOV despite what the datasheets say.
>
> > > And he's right, the
> > > behavior of a PCIe hot plug is very different and often undesirable when
> > > it's under native multipath.
> >
> > If you do actual hotplug and expect the device to go away it's indeed
> > not desirable. If you want the same device to come back after switched
> > fabric issues it is so desirable that people hack the devices to get it.
> > People talked about adding a queue_if_no_path-like parameter to control
> > keeping the multipath node alive a lot, but no one has ever invested
> > work into actually implementing it.
>
> Not quite the same thing, but kind of related: I proposed this device
> missing debounce thing about a year ago:
>
> https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@kbusch-mbp.dhcp.thefacebook.com/
Yes, that somehow fell off the cliff.