Message-ID: <CAEdEoBaixaTEuNfQGv1das7TwHKV9MiRMKQM0kLspveJmipzyg@mail.gmail.com>
Date:   Mon, 5 Dec 2022 14:50:52 -0700
From:   "mb@....how" <mb@....how>
To:     Thomas Zimmermann <tzimmermann@...e.de>
Cc:     kvm@...r.kernel.org, airlied@...ux.ie,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        Alex Williamson <alex.williamson@...hat.com>,
        kraxel@...hat.com, lersek@...hat.com
Subject: Re: [PATCH 2/2] vfio/pci: Remove console drivers

Hi Thomas,

On Mon, Dec 5, 2022 at 3:11 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
>
> Hi
>
> Am 05.12.22 um 10:32 schrieb mb@....how:
> > I have an RTX 3070 and a 3090, and I am absolutely sure I am binding
> > vfio-pci to the 3090 and not the 3070.
> >
> > I have bound the driver in two different ways: first by passing the IDs
> > to the module, and alternatively by manipulating the system interface and
> > using the override (this is what I originally had to do when I used two
> > 1080s, so I know it works).
> >
> > While the 3090 doesn't show a console, there's a remnant from rEFInd
> > (and GRUB previously) there.
> >
> > The assessment Alex made previously, where
> > aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB)
> > instead of the device seems correct, but it could also be a quirk
> > of how EFIFB is implemented. I recall reading a long time ago that EFIFB
> > is a special device and once it detects changes it would simply give up.
> > There was also no way to attach a device to it again as it depends on
> > being preloaded outside the kernel; once something takes over the buffer
> > reinitializing is "impossible". I never went deeper to try and
> > understand it.
>
> We recently reworked fbdev's interaction with the aperture helpers. [1]
> All devices should now be removed iff the driver has been bound to it
> (which should be the case here). The patches went into a v6.1-rc.
>
> Could you try the most recent v6.1-rc and report if this fixes the problem?

I just tried the latest one, v6.1-rc8, and I can see all the commits
for the series you mentioned there.

The same freeze behavior happens when I load vfio-pci:

[ 6.525463] VFIO - User Level meta-driver version: 0.3
[ 6.528231] Console: switching to colour dummy device 320x90
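
For reference, the override-based binding mentioned in the quoted text can be
sketched roughly as below. This is a minimal illustration under assumptions,
not the exact commands used here: the PCI address 0000:0a:00.0 is a
placeholder, and the helper names override_bind_plan() and apply_plan() are
made up for the example. Only the sysfs files (driver/unbind, driver_override,
drivers_probe) are the standard kernel interface. By default it only prints
what it would write.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci")


def override_bind_plan(bdf, driver="vfio-pci"):
    """Return the ordered (path, value) sysfs writes for a driver_override bind.

    bdf is the full PCI address (domain:bus:device.function); the value used
    in the example below is a placeholder, not a real device from this thread.
    """
    dev = SYSFS_PCI / "devices" / bdf
    return [
        (dev / "driver" / "unbind", bdf),    # detach whatever driver holds the device
        (dev / "driver_override", driver),   # restrict the device to the chosen driver
        (SYSFS_PCI / "drivers_probe", bdf),  # ask the PCI core to probe the device again
    ]


def apply_plan(plan, dry_run=True):
    """Print the equivalent shell commands, or perform the writes (needs root)."""
    for path, value in plan:
        if dry_run:
            print(f"echo {value} > {path}")
        elif path.exists():  # driver/unbind is absent when no driver is bound
            path.write_text(value)


if __name__ == "__main__":
    apply_plan(override_bind_plan("0000:0a:00.0"))  # hypothetical address
```

Keeping the plan separate from the writes makes it easy to check the sequence
before touching a live system.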

--
Carlos

>
> Best regards
> Thomas
>
> [1] https://patchwork.freedesktop.org/series/106040/
>
> >
> >
> > On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
> >
> >     Hi
> >
> >     Am 05.12.22 um 01:51 schrieb Alex Williamson:
> >      > On Sat, 3 Dec 2022 17:12:38 -0700
> >      > "mb@....how" <mb@....how> wrote:
> >      >
> >      >> Hi,
> >      >>
> >      >> I hope it is ok to reply to this old thread.
> >      >
> >      > It is, but the only relic of the thread is the subject.  For
> >     reference,
> >      > the latest version of this posted is here:
> >      >
> >      >
> >     https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@suse.de/
> >      >
> >      > Which is committed as:
> >      >
> >      > d17378062079 ("vfio/pci: Remove console drivers")
> >      >
> >      >> Unfortunately, I found a
> >      >> problem only now after upgrading to 6.0.
> >      >>
> >      >> My setup has multiple GPUs (2), and I depend on EFIFB to have a
> >     working console.
> >
> >     Which GPUs do you have?
> >
> >      >> pre-patch behavior, when I bind the vfio-pci to my secondary GPU
> >     both
> >      >> the passthrough and the EFIFB keep working fine.
> >      >> post-patch behavior, when I bind the vfio-pci to the secondary GPU,
> >      >> the EFIFB disappears from the system, binding the console to the
> >      >> "dummy console".
> >
> >     The efifb would likely use the first GPU. And vfio-pci should only
> >     remove the generic driver from the second device. Are you sure that
> >     you're not somehow using the first GPU with vfio-pci?
> >
> >      >> Whenever you try to access the terminal, you have the screen
> >     stuck in
> >      >> whatever was the last buffer content, which gives the impression of
> >      >> "freezing," but I can still type.
> >      >> Everything else works, including the passthrough.
> >      >
> >      > This sounds like the call to
> >     aperture_remove_conflicting_pci_devices()
> >      > is removing the conflicting driver itself rather than removing the
> >      > device from the driver.  Is it not possible to unbind the GPU from
> >      > efifb before binding the GPU to vfio-pci to effectively nullify the
> >      > added call?
> >      >
> >      >> I can only think about a few options:
> >      >>
> >      >> - Is there a way to have EFIFB show up again? After all it looks
> >     like
> >      >> the kernel has just abandoned it, but the buffer is still there. I
> >      >> can't find a single message about the secondary card and EFIFB in
> >      >> dmesg, but there's a message for the primary card and EFIFB.
> >      >> - Can we have a boolean controlling the behavior of vfio-pci
> >      >> altogether or at least controlling the behavior of vfio-pci for that
> >      >> specific ID? I know there's already some option for vfio-pci and VGA
> >      >> cards, would it be appropriate to attach this behavior to that
> >     option?
> >      >
> >      > I suppose we could have an opt-out module option on vfio-pci to skip
> >      > the above call, but clearly it would be better if things worked by
> >      > default.  We cannot make full use of GPUs with vfio-pci if they're
> >      > still in use by host console drivers.  The intention was certainly to
> >      > unbind the device from any low level drivers rather than disable
> >     use of
> >      > a console driver entirely.  DRM/GPU folks, is that possibly an
> >      > interface we could implement?  Thanks,
> >
> >     When vfio-pci gives the GPU device to the guest, which driver is
> >     bound to it?
> >
> >     Best regards
> >     Thomas
> >
> >      >
> >      > Alex
> >      >
> >
> >     --
> >     Thomas Zimmermann
> >     Graphics Driver Developer
> >     SUSE Software Solutions Germany GmbH
> >     Maxfeldstr. 5, 90409 Nürnberg, Germany
> >     (HRB 36809, AG Nürnberg)
> >     Geschäftsführer: Ivo Totev
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev
