Message-ID: <02789f9b-ff16-b419-097f-b97b56afad57@igalia.com>
Date:   Thu, 29 Jun 2023 10:11:06 -0300
From:   André Almeida <andrealmeid@...lia.com>
To:     Christian König <ckoenig.leichtzumerken@...il.com>
Cc:     pierre-eric.pelloux-prayer@....com,
        Randy Dunlap <rdunlap@...radead.org>,
        Daniel Vetter <daniel@...ll.ch>,
        'Marek Olšák' <maraeo@...il.com>,
        Michel Dänzer <michel.daenzer@...lbox.org>,
        Simon Ser <contact@...rsion.fr>, linux-kernel@...r.kernel.org,
        dri-devel@...ts.freedesktop.org,
        Timur Kristóf <timur.kristof@...il.com>,
        amd-gfx@...ts.freedesktop.org,
        Pekka Paalanen <ppaalanen@...il.com>,
        Daniel Stone <daniel@...ishbar.org>,
        Rob Clark <robdclark@...il.com>,
        Samuel Pitoiset <samuel.pitoiset@...il.com>,
        kernel-dev@...lia.com, Bas Nieuwenhuizen <bas@...nieuwenhuizen.nl>,
        alexander.deucher@....com,
        Pekka Paalanen <pekka.paalanen@...labora.com>,
        Dave Airlie <airlied@...il.com>, christian.koenig@....com
Subject: Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations



On 27/06/2023 18:17, André Almeida wrote:
> On 27/06/2023 14:47, Christian König wrote:
>> On 27.06.23 at 15:23, André Almeida wrote:
>>> Create a section that specifies how to deal with DRM device resets for
>>> kernel and userspace drivers.
>>>
>>> Acked-by: Pekka Paalanen <pekka.paalanen@...labora.com>
>>> Signed-off-by: André Almeida <andrealmeid@...lia.com>
>>> ---
>>>
>>> v4: https://lore.kernel.org/lkml/20230626183347.55118-1-andrealmeid@igalia.com/
>>>
>>> Changes:
>>>   - Grammar fixes (Randy)
>>>
>>>   Documentation/gpu/drm-uapi.rst | 68 ++++++++++++++++++++++++++++++++++
>>>   1 file changed, 68 insertions(+)
>>>
>>> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
>>> index 65fb3036a580..3cbffa25ed93 100644
>>> --- a/Documentation/gpu/drm-uapi.rst
>>> +++ b/Documentation/gpu/drm-uapi.rst
>>> @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third handler for
>>>   mmapped regular files. Threads cause additional pain with signal
>>>   handling as well.
>>> +Device reset
>>> +============
>>> +
>>> +The GPU stack is really complex and is prone to errors, from hardware bugs,
>>> +faulty applications and everything in between the many layers. Some errors
>>> +require resetting the device in order to make the device usable again. This
>>> +section describes the expectations for DRM and usermode drivers when a
>>> +device resets and how to propagate the reset status.
>>> +
>>> +Kernel Mode Driver
>>> +------------------
>>> +
>>> +The KMD is responsible for checking if the device needs a reset, and to perform
>>> +it as needed. Usually a hang is detected when a job gets stuck executing. KMD
>>> +should keep track of resets, because userspace can query any time about the
>>> +reset stats for a specific context.
>>
>> Maybe drop the part "for a specific context". Essentially the reset 
>> query could use global counters instead and we won't need the context 
>> any more here.
>>
> 
> Right, I wrote it like this to reflect how it's currently implemented.
> 
> If I follow correctly what you meant, the KMD could always report the
> global count to the UMD, and we would move the responsibility for
> managing the reset counters to the UMD, right? This would also simplify
> my DRM_IOCTL_GET_RESET proposal. I'll apply your suggestion to the next
> doc version.
> 

Actually, if we drop the context identifier we would lose the ability to
track which context is the guilty one. The Vulkan API doesn't seem to care
about this, but OpenGL does.
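
To make the trade-off concrete, here is a minimal sketch in C of a reset
query that keeps both a global counter and a per-context guilty counter.
Every name in it (device_state, context_state, device_reset, query_reset)
is made up for illustration; it is not the current amdgpu code and not the
DRM_IOCTL_GET_RESET proposal. The three results loosely mirror what OpenGL
robustness reports through glGetGraphicsResetStatus() (no error, innocent,
guilty), whereas a Vulkan driver only needs to know that the device was lost.

```c
#include <stdint.h>

/* Hypothetical names throughout; not an existing DRM interface. */
enum reset_status {
    RESET_NONE,      /* no reset since this context last queried */
    RESET_INNOCENT,  /* device was reset, but another context caused it */
    RESET_GUILTY,    /* device was reset because of this context */
};

struct device_state {
    uint64_t global_resets;  /* incremented on every device reset */
};

struct context_state {
    uint64_t guilty_resets;  /* resets blamed on this context */
    uint64_t seen_global;    /* global count observed at the last query */
    uint64_t seen_guilty;    /* guilty count observed at the last query */
};

/* Record a reset; `guilty` may be NULL when no context can be blamed. */
static void device_reset(struct device_state *dev,
                         struct context_state *guilty)
{
    dev->global_resets++;
    if (guilty)
        guilty->guilty_resets++;
}

/* Report what happened since the previous query from this context. */
static enum reset_status query_reset(const struct device_state *dev,
                                     struct context_state *ctx)
{
    enum reset_status status = RESET_NONE;

    if (ctx->guilty_resets != ctx->seen_guilty)
        status = RESET_GUILTY;
    else if (dev->global_resets != ctx->seen_global)
        status = RESET_INNOCENT;

    /* Remember the counters so the next query starts clean. */
    ctx->seen_global = dev->global_resets;
    ctx->seen_guilty = ctx->guilty_resets;
    return status;
}
```

With only global_resets, query_reset() could still tell a context that a
reset happened, but the guilty/innocent split would be gone.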

>> Apart from that this sounds good to me, feel free to add my rb.
>>
>> Regards,
>> Christian.
>>
>>
