Message-ID: <8ae98c9c-9cea-4c31-b888-9e3fcda42d86@amd.com>
Date: Mon, 19 Jan 2026 11:03:00 +0100
From: Christian König <christian.koenig@....com>
To: Zack Rusin <zack.rusin@...adcom.com>,
 Thomas Zimmermann <tzimmermann@...e.de>
Cc: dri-devel@...ts.freedesktop.org, Alex Deucher
 <alexander.deucher@....com>, amd-gfx@...ts.freedesktop.org,
 Ard Biesheuvel <ardb@...nel.org>, Ce Sun <cesun102@....com>,
 Chia-I Wu <olvaffe@...il.com>, Danilo Krummrich <dakr@...nel.org>,
 Dave Airlie <airlied@...hat.com>, Deepak Rawat <drawat.floss@...il.com>,
 Dmitry Osipenko <dmitry.osipenko@...labora.com>,
 Gerd Hoffmann <kraxel@...hat.com>,
 Gurchetan Singh <gurchetansingh@...omium.org>,
 Hans de Goede <hansg@...nel.org>, Hawking Zhang <Hawking.Zhang@....com>,
 Helge Deller <deller@....de>, intel-gfx@...ts.freedesktop.org,
 intel-xe@...ts.freedesktop.org, Jani Nikula <jani.nikula@...ux.intel.com>,
 Javier Martinez Canillas <javierm@...hat.com>,
 Jocelyn Falempe <jfalempe@...hat.com>,
 Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
 Lijo Lazar <lijo.lazar@....com>, linux-efi@...r.kernel.org,
 linux-fbdev@...r.kernel.org, linux-hyperv@...r.kernel.org,
 linux-kernel@...r.kernel.org, Lucas De Marchi <lucas.demarchi@...el.com>,
 Lyude Paul <lyude@...hat.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 "Mario Limonciello (AMD)" <superm1@...nel.org>,
 Mario Limonciello <mario.limonciello@....com>,
 Maxime Ripard <mripard@...nel.org>, nouveau@...ts.freedesktop.org,
 Rodrigo Vivi <rodrigo.vivi@...el.com>, Simona Vetter <simona@...ll.ch>,
 spice-devel@...ts.freedesktop.org,
 Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
 Timur Kristóf <timur.kristof@...il.com>,
 Tvrtko Ursulin <tursulin@...ulin.net>, virtualization@...ts.linux.dev,
 Vitaly Prosyak <vitaly.prosyak@....com>
Subject: Re: [PATCH 00/12] Recover sysfb after DRM probe failure

On 1/17/26 07:02, Zack Rusin wrote:
> On Fri, Jan 16, 2026 at 2:58 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
>>
>> Hi
>>
>> Am 16.01.26 um 04:59 schrieb Zack Rusin:
>>> On Thu, Jan 15, 2026 at 6:02 AM Thomas Zimmermann <tzimmermann@...e.de> wrote:
>>>> That's really not going to work. For example, in the current series, you
>>>> invoke devm_aperture_remove_conflicting_pci_devices_done() after
>>>> drm_mode_reset(), drm_dev_register() and drm_client_setup().
>>> That's perfectly fine;
>>> devm_aperture_remove_conflicting_pci_devices_done only removes the
>>> reload behavior, it doesn't do anything else.
>>>
>>> This series, essentially, just adds a "defer" statement to
>>> aperture_remove_conflicting_pci_devices that says
>>>
>>> "reload sysfb if this driver unloads".
>>>
>>> devm_aperture_remove_conflicting_pci_devices_done just cancels that defer.
>>
>> Exactly. And if that reload happens after the hardware state has been
>> changed, the result is undefined.
> 
> This is all predicated on drivers actually cleaning up after
> themselves. I don't think any amount of goodwill or API design is
> going to fix device-specific state mismatches.
> 
>> The current recovery/reload is not reliable in any case. A number of
>> high-profile devs have also said that it doesn't work with their driver.
>> The same is true for ast. So the current approach is not going to happen.
>>
>>> There also might be the case of some crazy behavior, e.g. a PCI BAR
>>> resize in the driver makes the VGA hardware crash or something, in
>>> which case, yes, we should definitely skip this patch, at least until
>>> those drivers properly clean up on exit.
>>
>> There's nothing crazy here. It's standard probing code.
>>
>> If you want to move forward, my suggestion is to look at the proposal
>> with the aperture_funcs callbacks that control sysfb device access. And
>> from there, build a full prototype with one or two drivers.
> 
> I don't think that approach is going to work. I don't think there's
> anything that can be done if drivers don't clean up everything they've
> done that might have broken sysfb on unload. I'm going to drop it
> then. It's obviously a shame, because it works fine with virtualized
> drivers, and they're the ones that would likely profit from this the
> most, but I'm sceptical that I could do a full system state reset in a
> generalized fashion for hw drivers, or that the work required would be
> worth the payoff.

Well, at least for PCI devices you could try doing a function-level reset (FLR) to get the HW back into some usable state.

This does *not* work for AMD HW since we have HW/FW bugs, but at least for your virtualized use case it might work.

All you need then is an EFI, VESA or int10 call to re-init the HW to the pre-driver-load setup.

I know that is not the easiest thing to do, but still better than a black screen.
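
As a rough illustration of the reset step only, here is a minimal user-space sketch that asks the kernel to reset a PCI function through the sysfs "reset" attribute. The helper names pci_reset_path() and pci_request_reset() are made up for this example, and the device address would be whatever your device actually uses. Note the assumption: the kernel chooses the reset method, so this is not guaranteed to be an FLR, and it does nothing about re-initializing the display afterwards.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<root>/<bdf>/reset".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int pci_reset_path(char *buf, size_t len, const char *root, const char *bdf)
{
    int n = snprintf(buf, len, "%s/%s/reset", root, bdf);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write "1" to the device's sysfs reset attribute.
 * The attribute only exists if the kernel knows a reset method for the
 * device; which method is used (FLR or something else) is up to the kernel.
 * Returns 0 on success, -1 on any failure.
 */
int pci_request_reset(const char *root, const char *bdf)
{
    char path[256];
    FILE *f;
    int ok;

    if (pci_reset_path(path, sizeof(path), root, bdf) < 0)
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;      /* no reset attribute, or no permission */

    ok = fputs("1", f) >= 0;
    if (fclose(f) != 0) /* the write is flushed on close */
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling pci_request_reset("/sys/bus/pci/devices", "<domain:bus:dev.fn>") as root would trigger the reset; the driver should not be bound to the device at that point.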

Regards,
Christian.

> 
> z
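
For readers following the thread, the "defer" semantics described in the quoted text (arm a sysfb reload when conflicting devices are removed, cancel it once the driver calls the _done function, otherwise run it on unload) can be modeled with a small stand-alone sketch. This is not the code from the series: struct deferred_reload and all function names below are hypothetical, and the reload itself is reduced to a callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a "reload sysfb if this driver unloads" action. */
struct deferred_reload {
    void (*reload)(void *ctx); /* what to run on release */
    void *ctx;
    bool armed;                /* false once cancelled or already run */
};

/* Models the point where the conflicting sysfb device gets removed. */
void deferred_reload_arm(struct deferred_reload *d,
                         void (*reload)(void *ctx), void *ctx)
{
    d->reload = reload;
    d->ctx = ctx;
    d->armed = true;
}

/* Models the "_done" call: the driver probed fine, so drop the reload. */
void deferred_reload_cancel(struct deferred_reload *d)
{
    d->armed = false;
}

/* Models driver release: run the reload only if it is still armed. */
void deferred_reload_release(struct deferred_reload *d)
{
    if (d->armed && d->reload) {
        d->armed = false; /* never run twice */
        d->reload(d->ctx);
    }
}

/* Example callback for experiments: counts how often a reload ran. */
void count_reload(void *ctx)
{
    ++*(int *)ctx;
}
```

The objection raised in the thread maps onto the window between arming and cancelling: anything the driver does to the hardware in that window is still in effect when the callback runs.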
