Message-ID: <71b42e5b613628642abeba5bd1e61089ca59c643.camel@suse.com>
Date: Thu, 15 May 2025 16:50:38 +0200
From: Martin Wilck <mwilck@...e.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Benjamin Marzinski <bmarzins@...hat.com>, Christoph Hellwig
<hch@...radead.org>, Kevin Wolf <kwolf@...hat.com>,
dm-devel@...ts.linux.dev, Hanna Czenczek <hreitz@...hat.com>, Mikulas
Patocka <mpatocka@...hat.com>, snitzer@...nel.org, "Kernel Mailing List,
Linux" <linux-kernel@...r.kernel.org>, Hannes Reinecke <hare@...e.com>
Subject: Re: [PATCH 0/2] dm mpath: Interface for explicit probing of active
paths
On Thu, 2025-05-15 at 12:51 +0200, Paolo Bonzini wrote:
> On Thu, May 15, 2025 at 12:34 PM Martin Wilck <mwilck@...e.com>
> wrote:
> >
> > Thanks for mentioning this. However, I suppose that depends on the
> > permissions with which the qemu process is started, no? Wouldn't
> > qemu need CAP_SYS_RAWIO for PCI passthrough as well?
>
> Generally you want to assume that the VM is hostile and run QEMU with
> as few privileges as possible (not just capabilities, but also in
> separate namespaces and with restrictions from device cgroups,
> SELinux, etc.). PCI passthrough is not an issue, it only needs access
> to the VFIO inodes and you can do it by setting the appropriate file
> permissions without extra capabilities. The actual privileged part is
> binding the device to VFIO, which is done outside QEMU anyway.
Thanks for the clarification.
> > I admit that I'm confused by the many indirections in qemu's
> > scsi-block code flow. AFAICS qemu forwards everything except
> > PRIN/PROUT to the kernel block device in "scsi-block" mode. Correct
> > me if I'm wrong.
>
> Yes, that's correct. The code for PRIN/PROUT calls out to a separate
> privileged process (in scsi/qemu-pr-helper.c if you're curious) which
> is aware of multipath and can be extended if needed.
Sure, I was aware of the helper. I just wasn't 100% clear about how it
gets called. Found the code in the meantime [1].
[1] https://github.com/qemu/qemu/blob/864813878951b44e964eb4c012d832fd21f8cc0c/block/file-posix.c#L4286
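For readers following along, the split discussed above can be sketched
roughly as follows. This is an illustration only, not QEMU's actual code:
route_cdb() and the "pr-helper"/"kernel" labels are hypothetical names.
The two opcode values are the standard SCSI ones for PERSISTENT RESERVE
IN and PERSISTENT RESERVE OUT.

```python
# Hypothetical sketch of the routing described above; not QEMU's code.
PERSISTENT_RESERVE_IN = 0x5E   # standard SCSI opcode (PRIN)
PERSISTENT_RESERVE_OUT = 0x5F  # standard SCSI opcode (PROUT)


def route_cdb(cdb: bytes) -> str:
    """Return "pr-helper" for PRIN/PROUT and "kernel" for everything else.

    The return values are illustrative labels, not real QEMU identifiers.
    """
    if not cdb:
        raise ValueError("empty CDB")
    # The opcode is the first byte of the command descriptor block.
    if cdb[0] in (PERSISTENT_RESERVE_IN, PERSISTENT_RESERVE_OUT):
        return "pr-helper"
    return "kernel"
```

With this, a 6-byte TEST UNIT READY CDB (opcode 0x00) would be labelled
"kernel", while a 10-byte CDB starting with 0x5F would be labelled
"pr-helper".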
> > > Of the ones that aren't simple I/O, mode parameters and TUR are
> > > the important cases. A TUR failure would be handled by the ioctl
> > > that Kevin proposed here by forcing a path switch. Mode parameters
> > > might not be shared(*) and would need to be sent down all the
> > > paths in that case; that can be fixed in userspace if necessary.
> >
> > Passing TUR from a multipath device to a random member doesn't make
> > much sense to me. qemu would need to implement some logic to
> > determine whether the map has any usable paths.
>
> As long as one path replies to a TUR and the host is able to
> (eventually, somehow) steer I/O transparently to that path, that
> should be good enough. If the one path that the kernel tries is down,
> QEMU can probe which paths are up and retry. That seems consistent
> with what you want from TUR but maybe I'm missing something.
It's ok-ish, in particular in combination with Kevin's patch. But using
an equivalent of "multipath -C" would be closer to the real thing for
TUR.
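To make the "as long as one path replies to a TUR" condition concrete,
here is a minimal sketch. It assumes a caller-supplied tur_ok() callback
that reports whether a TUR on a given path succeeded; that callback, the
function names and the path names in the usage note are all hypothetical,
and nothing here reflects how QEMU or multipath-tools actually implement
the check.

```python
from typing import Callable, Iterable, Optional


def first_ready_path(paths: Iterable[str],
                     tur_ok: Callable[[str], bool]) -> Optional[str]:
    """Return the first path whose TUR succeeds, or None if none does.

    tur_ok is a hypothetical callback; a real one would issue
    TEST UNIT READY on the path and report the outcome.
    """
    for path in paths:
        if tur_ok(path):
            return path
    return None


def map_is_ready(paths: Iterable[str],
                 tur_ok: Callable[[str], bool]) -> bool:
    """Treat the map as ready if at least one path answers the TUR."""
    return first_ready_path(paths, tur_ok) is not None
```

For example, with two made-up paths "sda" (down) and "sdb" (up),
first_ready_path() would return "sdb" and map_is_ready() would be true;
with only "sda", the map would be reported as not ready.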
Regards
Martin