Message-ID: <f5bf590a-5d3f-03f2-531c-057cf8760000@amd.com>
Date:   Tue, 25 Apr 2023 14:44:23 +0200
From:   Christian König <christian.koenig@....com>
To:     Michel Dänzer <michel.daenzer@...lbox.org>,
        Marek Olšák <maraeo@...il.com>
Cc:     Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
        André Almeida <andrealmeid@...lia.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        "Tuikov, Luben" <Luben.Tuikov@....com>,
        amd-gfx mailing list <amd-gfx@...ts.freedesktop.org>,
        kernel-dev@...lia.com,
        "Deucher, Alexander" <alexander.deucher@....com>
Subject: Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

Am 25.04.23 um 14:14 schrieb Michel Dänzer:
> On 4/25/23 14:08, Christian König wrote:
>> Well signaling that something happened is not the question. We do this for both soft as well as hard resets.
>>
>> The question is if errors result in blocking further submissions with the same context or not.
>>
>> In case of a hard reset and potential loss of state we have to kill the context, otherwise a follow-up submission would just lock up the hardware once more.
>>
>> In case of a soft reset I think we can keep the context alive, this way even applications without robustness handling can keep working.
>>
>> You potentially still get some corruption, but at least your compositor does not get killed.
> Right, and if there is corruption, the user can restart the session.
>
>
> Maybe a possible compromise could be making soft resets fatal if user space enabled robustness for the context, and non-fatal if not.

Well that should already be mostly the case. If an application has 
enabled robustness it should notice that something went wrong and act 
appropriately.

The only case we need to handle is an application without robustness 
after a hard reset, because otherwise it will trigger a reset over and 
over again.
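
To make the proposed policy concrete, here is a minimal sketch in plain C. 
It is not amdgpu code: the names gpu_ctx, reset_type, ctx_handle_reset and 
ctx_may_submit are hypothetical and exist only for this illustration. It 
assumes every reset is signaled through a counter that user space can 
query, and that only a hard reset blocks further submissions.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not the amdgpu driver's own. */
enum reset_type { RESET_SOFT, RESET_HARD };

struct gpu_ctx {
	bool guilty;              /* when set, further submissions are rejected */
	unsigned int reset_count; /* bumped on every reset, soft or hard */
};

/* Record a reset that hit this context. */
void ctx_handle_reset(struct gpu_ctx *ctx, enum reset_type type)
{
	/* Both soft and hard resets are signaled to user space. */
	ctx->reset_count++;

	/*
	 * Only a hard reset (potential loss of state) kills the context,
	 * so a follow-up submission cannot lock up the hardware again.
	 * After a soft reset the context stays usable.
	 */
	if (type == RESET_HARD)
		ctx->guilty = true;
}

/* Submission check: a guilty context may not submit more work. */
bool ctx_may_submit(const struct gpu_ctx *ctx)
{
	return !ctx->guilty;
}
```

With this, a robust application sees reset_count change after a soft 
reset and can recreate its context itself, while a non-robust one just 
keeps submitting.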

Christian.

>
>
>> Am 25.04.23 um 13:07 schrieb Marek Olšák:
>>> That supposedly depends on the compositor. There may be compositors for very specific cases (e.g. Steam Deck) that handle resets very well, and those would like to be properly notified of all resets because that's how they get the best outcome, e.g. no corruption. A soft reset that is unhandled by userspace may result in persistent corruption.
>
