Message-ID: <f084e3bc-77af-4268-c882-7b0737e45f3b@grimberg.me>
Date: Thu, 31 May 2018 11:37:20 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Mike Snitzer <snitzer@...hat.com>
Cc: Christoph Hellwig <hch@....de>,
Johannes Thumshirn <jthumshirn@...e.de>,
Keith Busch <keith.busch@...el.com>,
Hannes Reinecke <hare@...e.de>,
Laurence Oberman <loberman@...hat.com>,
Ewan Milne <emilne@...hat.com>,
James Smart <james.smart@...adcom.com>,
Linux Kernel Mailinglist <linux-kernel@...r.kernel.org>,
Linux NVMe Mailinglist <linux-nvme@...ts.infradead.org>,
"Martin K . Petersen" <martin.petersen@...cle.com>,
Martin George <marting@...app.com>,
John Meneghini <John.Meneghini@...app.com>
Subject: Re: [PATCH 0/3] Provide more fine grained control over multipathing
> Wouldn't expect you guys to nurture this 'mpath_personality' knob. So
> when features like "dispersed namespaces" land a negative check would
> need to be added in the code to prevent switching from "native".
>
> And once something like "dispersed namespaces" lands we'd then have to
> see about a more sophisticated switch that operates at a different
> granularity. Could also be that switching one subsystem that is part of
> "dispersed namespaces" would then cascade to all other associated
> subsystems? Not that dissimilar from the 3rd patch in this series that
> allows a 'device' switch to be done in terms of the subsystem.
Which I think is broken by allowing this personality to be changed on
the fly.
>
> Anyway, I don't know the end from the beginning on something you just
> told me about ;) But we're all in this together. And can take it as it
> comes.
I agree but this will be exposed to user-space and we will need to live
with it for a long long time...
> I'm merely trying to bridge the gap from old dm-multipath while
> native NVMe multipath gets its legs.
>
> In time I really do have aspirations to contribute more to NVMe
> multipathing. I think Christoph's NVMe multipath implementation of
> bio-based device on top of NVMe core's blk-mq device(s) is very clever
> and effective (blk_steal_bios() hack and all).
That's great.
>> Don't get me wrong, I do support your cause, and I think nvme should try
>> to help, I just think that subsystem granularity is not the correct
>> approach going forward.
>
> I understand there will be limits to this 'mpath_personality' knob's
> utility and it'll need to evolve over time. But the burden of making
> more advanced NVMe multipath features accessible outside of native NVMe
> isn't intended to be on any of the NVMe maintainers (other than maybe
> remembering to disallow the switch where it makes sense in the future).
I would expect that any "advanced multipath features" would be properly
brought up with the NVMe TWG as a ratified standard and find its way
to nvme. So I don't think this is a particularly valid argument.
>> As I said, I've been off the grid, can you remind me why a global knob is
>> not sufficient?
>
> Because once nvme_core.multipath=N is set: native NVMe multipath is then
> not accessible from the same host. The goal of this patchset is to give
> users choice. But not limit them to _only_ using dm-multipath if they
> just have some legacy needs.
>
> Tough to be convincing with hypotheticals but I could imagine a very
> obvious use case for native NVMe multipathing being PCI-based embedded NVMe
> "fabrics" (especially if/when the numa-based path selector lands). But
> the same host with PCI NVMe could be connected to a FC network that has
> historically always been managed via dm-multipath.. but say that
> FC-based infrastructure gets updated to use NVMe (to leverage a wider
> NVMe investment, whatever?) -- but maybe admins would still prefer to
> use dm-multipath for the NVMe over FC.
You are referring to an array exposing media via nvmf and scsi
simultaneously? I'm not sure that there is a clean definition of
how that is supposed to work (ANA/ALUA, reservations, etc..)
>> This might sound stupid to you, but can't users that desperately must
>> keep using dm-multipath (for its mature toolset or what-not) just
>> stack it on the multipath nvme device? (I might be completely off on
>> this so feel free to correct my ignorance).
>
> We could certainly pursue adding multipath-tools support for native NVMe
> multipathing. Not opposed to it (even if just reporting topology and
> state). But given the extensive lengths NVMe multipath goes to hide
> devices we'd need some way to pierce through the opaque nvme device
> that native NVMe multipath exposes. But that really is a tangent
> relative to this patchset. Since that kind of visibility would also
> benefit the nvme cli... otherwise how are users to even be able to trust
> but verify native NVMe multipathing did what it expected it to?
Can you explain what is missing for multipath-tools to resolve topology?
nvme list-subsys is doing just that, isn't it? It lists the subsys-ctrl
topology, but that is sort of the important information, as controllers
are the real paths.
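
A minimal sketch of that subsystem-to-controller walk, for illustration
only. It is not how nvme-cli or multipath-tools implement it. It assumes
the kernel exposes each subsystem as a directory under
/sys/class/nvme-subsystem holding a subsysnqn attribute and one nvmeN
entry per controller, and that each controller has transport, address
and state attributes. The helper names read_attr and list_subsys are
hypothetical, and any attribute that is absent is reported as None.

```python
import os
import re

# Controller entries look like "nvme0"; namespace entries such as
# "nvme0n1" are skipped because only controllers are treated as paths.
CTRL_RE = re.compile(r"^nvme\d+$")


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def list_subsys(root="/sys/class/nvme-subsystem"):
    """Map each subsystem name to its NQN and its controller paths.

    The sysfs layout is an assumption (see the text above); pass a
    different root to run this against a copy of the tree.
    """
    topology = {}
    if not os.path.isdir(root):
        return topology
    for subsys in sorted(os.listdir(root)):
        subsys_dir = os.path.join(root, subsys)
        if not os.path.isdir(subsys_dir):
            continue
        paths = []
        for entry in sorted(os.listdir(subsys_dir)):
            if not CTRL_RE.match(entry):
                continue
            ctrl_dir = os.path.join(subsys_dir, entry)
            paths.append({
                "ctrl": entry,
                "transport": read_attr(os.path.join(ctrl_dir, "transport")),
                "address": read_attr(os.path.join(ctrl_dir, "address")),
                "state": read_attr(os.path.join(ctrl_dir, "state")),
            })
        topology[subsys] = {
            "nqn": read_attr(os.path.join(subsys_dir, "subsysnqn")),
            "paths": paths,
        }
    return topology


if __name__ == "__main__":
    for name, info in list_subsys().items():
        print(f"{name} - NQN={info['nqn']}")
        for p in info["paths"]:
            print(f"  +- {p['ctrl']} {p['transport']} {p['address']} {p['state']}")
```

On a host without the nvme driver loaded, list_subsys() simply returns
an empty mapping.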