Message-ID: <c1c8bfa5-8ba4-c365-1663-535f656ca353@suse.de>
Date:   Mon, 5 Dec 2022 11:11:17 +0100
From:   Thomas Zimmermann <tzimmermann@...e.de>
To:     "mb@....how" <mb@....how>
Cc:     kvm@...r.kernel.org, airlied@...ux.ie,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        Alex Williamson <alex.williamson@...hat.com>,
        kraxel@...hat.com, lersek@...hat.com
Subject: Re: [PATCH 2/2] vfio/pci: Remove console drivers

Hi

On 05.12.22 at 10:32, mb@....how wrote:
> I have an RTX 3070 and a 3090, and I am absolutely sure I am binding
> vfio-pci to the 3090 and not the 3070.
> 
> I have bound the driver in two different ways, first by passing the IDs 
> to the module and alternatively by manipulating the system interface and 
> using the override (this is what I originally had to do when I used two 
> 1080s, so I know it works).
> 
> While the 3090 doesn't show a console, there's a remnant from rEFInd 
> (and GRUB previously) there.
> 
> The assessment Alex made previously, where 
> aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB) 
> instead of the device seems correct, but it could also be a quirk 
> of how EFIFB is implemented. I recall reading a long time ago that EFIFB 
> is a special device and once it detects changes it would simply give up. 
> There was also no way to attach a device to it again as it depends on 
> being preloaded outside the kernel; once something takes over the buffer 
> reinitializing is "impossible". I never went deeper to try and 
> understand it.
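
For readers following along, the second binding method described in the quoted text (the sysfs override) might look roughly like the sketch below. It assumes the standard PCI sysfs files driver_override, unbind and drivers_probe; the function name bind_with_override and the sysfs_root parameter are made up for illustration and are not from this thread. The first method corresponds to passing vendor:device IDs through the ids= parameter of the vfio-pci module.

```python
import os


def bind_with_override(bdf, driver="vfio-pci", sysfs_root="/sys"):
    """Bind one PCI device to `driver` via driver_override.

    `bdf` is the PCI address (domain:bus:device.function). On a real
    system this needs root. `sysfs_root` is a hypothetical parameter that
    only exists so the sketch can be tried against a fake directory tree.
    """
    pci = os.path.join(sysfs_root, "bus", "pci")
    dev = os.path.join(pci, "devices", bdf)

    def write(path, value):
        with open(path, "w") as f:
            f.write(value)

    # 1. Restrict driver matching for this device to the chosen driver.
    write(os.path.join(dev, "driver_override"), driver)

    # 2. Detach whatever PCI driver currently owns the device, if any.
    if os.path.exists(os.path.join(dev, "driver")):
        write(os.path.join(dev, "driver", "unbind"), bdf)

    # 3. Ask the PCI core to probe the device again; because of the
    #    override, only `driver` is considered.
    write(os.path.join(pci, "drivers_probe"), bdf)
```

A call would be bind_with_override("0000:0b:00.0"), where the address is a placeholder for the secondary GPU. Note that this only handles the PCI driver binding; efifb is not a PCI driver, so it is not detached by step 2.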

We recently reworked fbdev's interaction with the aperture helpers. [1] 
A device should now be removed iff the driver has been bound to it 
(which should be the case here). The patches went into a v6.1-rc.

Could you try the most recent v6.1-rc and report if this fixes the problem?
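
When reporting back, it may help to include which driver each GPU is bound to and which framebuffers are still registered. A minimal sketch for collecting that, assuming the usual sysfs layout (a class file and a driver symlink per PCI device) and /proc/fb; the function names and the sysfs_root/proc_fb parameters are hypothetical and only there for illustration:

```python
import os


def display_devices(sysfs_root="/sys"):
    """Return {PCI address: driver name or None} for display-class devices."""
    base = os.path.join(sysfs_root, "bus", "pci", "devices")
    result = {}
    for bdf in sorted(os.listdir(base)):
        dev = os.path.join(base, bdf)
        with open(os.path.join(dev, "class")) as f:
            pci_class = int(f.read().strip(), 16)
        # The top byte of the class code is the base class;
        # 0x03 means display controller.
        if pci_class >> 16 != 0x03:
            continue
        link = os.path.join(dev, "driver")
        # The driver symlink is absent when no driver is bound.
        if os.path.exists(link):
            result[bdf] = os.path.basename(os.path.realpath(link))
        else:
            result[bdf] = None
    return result


def framebuffers(proc_fb="/proc/fb"):
    """Return the non-empty lines of /proc/fb, one per registered framebuffer."""
    with open(proc_fb) as f:
        return [line.strip() for line in f if line.strip()]
```

Printing display_devices() and framebuffers() before and after binding vfio-pci would show whether the framebuffer entry disappears together with the binding of the second GPU.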

Best regards
Thomas

[1] https://patchwork.freedesktop.org/series/106040/

> 
> 
> On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
> 
>     Hi
> 
>     On 05.12.22 at 01:51, Alex Williamson wrote:
>      > On Sat, 3 Dec 2022 17:12:38 -0700
>      > "mb@....how" <mb@....how> wrote:
>      >
>      >> Hi,
>      >>
>      >> I hope it is ok to reply to this old thread.
>      >
>      > It is, but the only relic of the thread is the subject.  For
>      > reference, the latest version of this posted is here:
>      >
>      > https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@suse.de/
>      >
>      > Which is committed as:
>      >
>      > d17378062079 ("vfio/pci: Remove console drivers")
>      >
>      >> Unfortunately, I found a
>      >> problem only now after upgrading to 6.0.
>      >>
>      >> My setup has multiple GPUs (2), and I depend on EFIFB to have a
>      >> working console.
> 
>     Which GPUs do you have?
> 
>      >> Pre-patch behavior: when I bind vfio-pci to my secondary GPU,
>      >> both the passthrough and the EFIFB keep working fine.
>      >> Post-patch behavior: when I bind vfio-pci to the secondary GPU,
>      >> the EFIFB disappears from the system, binding the console to the
>      >> "dummy console".
> 
>     The efifb would likely use the first GPU. And vfio-pci should only
>     remove the generic driver from the second device. Are you sure that
>     you're not somehow using the first GPU with vfio-pci?
> 
>      >> Whenever you try to access the terminal, you have the screen
>      >> stuck in whatever was the last buffer content, which gives the
>      >> impression of "freezing," but I can still type.
>      >> Everything else works, including the passthrough.
>      >
>      > This sounds like the call to
>      > aperture_remove_conflicting_pci_devices()
>      > is removing the conflicting driver itself rather than removing the
>      > device from the driver.  Is it not possible to unbind the GPU from
>      > efifb before binding the GPU to vfio-pci to effectively nullify the
>      > added call?
>      >
>      >> I can only think about a few options:
>      >>
>      >> - Is there a way to have EFIFB show up again? After all it looks
>      >> like the kernel has just abandoned it, but the buffer is still there. I
>      >> can't find a single message about the secondary card and EFIFB in
>      >> dmesg, but there's a message for the primary card and EFIFB.
>      >> - Can we have a boolean controlling the behavior of vfio-pci
>      >> altogether or at least controlling the behavior of vfio-pci for that
>      >> specific ID? I know there's already some option for vfio-pci and VGA
>      >> cards, would it be appropriate to attach this behavior to that
>      >> option?
>      >
>      > I suppose we could have an opt-out module option on vfio-pci to skip
>      > the above call, but clearly it would be better if things worked by
>      > default.  We cannot make full use of GPUs with vfio-pci if they're
>      > still in use by host console drivers.  The intention was certainly to
>      > unbind the device from any low level drivers rather than disable
>      > use of a console driver entirely.  DRM/GPU folks, is that possibly an
>      > interface we could implement?  Thanks,
> 
>     When vfio-pci gives the GPU device to the guest, which driver is
>     bound to it?
> 
>     Best regards
>     Thomas
> 
>      >
>      > Alex
>      >
> 
>     -- 
>     Thomas Zimmermann
>     Graphics Driver Developer
>     SUSE Software Solutions Germany GmbH
>     Maxfeldstr. 5, 90409 Nürnberg, Germany
>     (HRB 36809, AG Nürnberg)
>     Geschäftsführer: Ivo Totev
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
