Message-ID: <c1c8bfa5-8ba4-c365-1663-535f656ca353@suse.de>
Date: Mon, 5 Dec 2022 11:11:17 +0100
From: Thomas Zimmermann <tzimmermann@...e.de>
To: "mb@....how" <mb@....how>
Cc: kvm@...r.kernel.org, airlied@...ux.ie,
linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
Alex Williamson <alex.williamson@...hat.com>,
kraxel@...hat.com, lersek@...hat.com
Subject: Re: [PATCH 2/2] vfio/pci: Remove console drivers
Hi
On 05.12.22 at 10:32, mb@....how wrote:
> I have an RTX 3070 and a 3090; I am absolutely sure I am binding vfio-pci
> to the 3090 and not the 3070.
>
> I have bound the driver in two different ways, first by passing the IDs
> to the module and alternatively by manipulating the system interface and
> using the override (this is what I originally had to do when I used two
> 1080s, so I know it works).
>
> While the 3090 doesn't show a console, there's a remnant from rEFInd
> (and grub previously) there.
>
> The assessment Alex made previously, where
> aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB)
> instead of the device seems correct, but it could also be a quirk
> of how EFIFB is implemented. I recall reading a long time ago that EFIFB
> is a special device and once it detects changes it would simply give up.
> There was also no way to attach a device to it again as it depends on
> being preloaded outside the kernel; once something takes over the buffer
> reinitializing is "impossible". I never went deeper to try and
> understand it.
We recently reworked fbdev's interaction with the aperture helpers. [1]
All devices should now be removed iff a driver has been bound to them
(which should be the case here). The patches went into a v6.1-rc.
Could you try the most recent v6.1-rc and report if this fixes the problem?
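For such a report it may help to record, before and after binding vfio-pci, which driver each GPU is bound to and which framebuffer devices are still registered. The following is only a minimal sketch, not part of the patches: the function names display_devices() and framebuffers() are made up for illustration, and it assumes the usual sysfs layout (a "class" file and a "driver" symlink under /sys/bus/pci/devices/<address>/) plus the /proc/fb listing.

```python
import os


def display_devices(sysfs_root="/sys"):
    """Return [(pci_address, driver_name_or_None)] for display-class PCI devices.

    Hypothetical helper for illustration; sysfs_root is a parameter so the
    function can be pointed at a copy of the tree.
    """
    base = os.path.join(sysfs_root, "bus", "pci", "devices")
    try:
        addresses = sorted(os.listdir(base))
    except OSError:
        return []
    result = []
    for addr in addresses:
        dev = os.path.join(base, addr)
        try:
            with open(os.path.join(dev, "class")) as f:
                pci_class = int(f.read().strip(), 16)  # e.g. "0x030000"
        except (OSError, ValueError):
            continue
        # PCI base class 0x03 is "display controller" (VGA, 3D, ...).
        if pci_class >> 16 != 0x03:
            continue
        link = os.path.join(dev, "driver")
        # The "driver" symlink only exists while a driver is bound.
        driver = os.path.basename(os.readlink(link)) if os.path.islink(link) else None
        result.append((addr, driver))
    return result


def framebuffers(proc_root="/proc"):
    """Return the non-empty lines of <proc_root>/fb, or [] if it is unreadable."""
    try:
        with open(os.path.join(proc_root, "fb")) as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []
```

Printing the results of display_devices() and framebuffers() once before and once after the bind would show whether the efifb entry disappears from /proc/fb at the moment the second GPU switches to vfio-pci.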
Best regards
Thomas
[1] https://patchwork.freedesktop.org/series/106040/
>
>
> On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
>
> Hi
>
> On 05.12.22 at 01:51, Alex Williamson wrote:
> > On Sat, 3 Dec 2022 17:12:38 -0700
> > "mb@....how" <mb@....how> wrote:
> >
> >> Hi,
> >>
> >> I hope it is ok to reply to this old thread.
> >
> > It is, but the only relic of the thread is the subject. For reference,
> > the latest version of this posted is here:
> >
> > https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@suse.de/
> >
> > Which is committed as:
> >
> > d17378062079 ("vfio/pci: Remove console drivers")
> >
> >> Unfortunately, I found a
> >> problem only now after upgrading to 6.0.
> >>
> >> My setup has multiple GPUs (2), and I depend on EFIFB to have a
> >> working console.
>
> Which GPUs do you have?
>
> >> pre-patch behavior, when I bind the vfio-pci to my secondary GPU both
> >> the passthrough and the EFIFB keep working fine.
> >> post-patch behavior, when I bind the vfio-pci to the secondary GPU,
> >> the EFIFB disappears from the system, binding the console to the
> >> "dummy console".
>
> The efifb would likely use the first GPU. And vfio-pci should only
> remove the generic driver from the second device. Are you sure that
> you're not somehow using the first GPU with vfio-pci?
>
> >> Whenever you try to access the terminal, you have the screen stuck in
> >> whatever was the last buffer content, which gives the impression of
> >> "freezing," but I can still type.
> >> Everything else works, including the passthrough.
> >
> > This sounds like the call to aperture_remove_conflicting_pci_devices()
> > is removing the conflicting driver itself rather than removing the
> > device from the driver. Is it not possible to unbind the GPU from
> > efifb before binding the GPU to vfio-pci to effectively nullify the
> > added call?
> >
> >> I can only think about a few options:
> >>
> >> - Is there a way to have EFIFB show up again? After all it looks like
> >> the kernel has just abandoned it, but the buffer is still there. I
> >> can't find a single message about the secondary card and EFIFB in
> >> dmesg, but there's a message for the primary card and EFIFB.
> >> - Can we have a boolean controlling the behavior of vfio-pci
> >> altogether or at least controlling the behavior of vfio-pci for that
> >> specific ID? I know there's already some option for vfio-pci and VGA
> >> cards, would it be appropriate to attach this behavior to that option?
> >
> > I suppose we could have an opt-out module option on vfio-pci to skip
> > the above call, but clearly it would be better if things worked by
> > default. We cannot make full use of GPUs with vfio-pci if they're
> > still in use by host console drivers. The intention was certainly to
> > unbind the device from any low level drivers rather than disable use of
> > a console driver entirely. DRM/GPU folks, is that possibly an
> > interface we could implement? Thanks,
>
> When vfio-pci gives the GPU device to the guest, which driver is
> bound to it?
>
> Best regards
> Thomas
>
> >
> > Alex
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev