Message-Id: <20230501185747.33519-1-andrealmeid@igalia.com>
Date:   Mon,  1 May 2023 15:57:46 -0300
From:   André Almeida <andrealmeid@...lia.com>
To:     dri-devel@...ts.freedesktop.org, amd-gfx@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org
Cc:     kernel-dev@...lia.com, alexander.deucher@....com,
        christian.koenig@....com, pierre-eric.pelloux-prayer@....com,
        'Marek Olšák' <maraeo@...il.com>,
        Samuel Pitoiset <samuel.pitoiset@...il.com>,
        Bas Nieuwenhuizen <bas@...nieuwenhuizen.nl>,
        Timur Kristóf <timur.kristof@...il.com>,
        michel.daenzer@...lbox.org,
        André Almeida <andrealmeid@...lia.com>
Subject: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

Currently, the UMD has little information about what went wrong during a GPU
reset. To help with that, this patch proposes a new IOCTL that can be used to
query information about the resources that caused the hang.

The goal of this RFC is to gather feedback about this interface. The Mesa part
can be found at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22785

The current implementation is racy: if two resets happen (even on different
rings), the app will get the information for the most recent reset rather than
for the one it is looking for. Maybe this can be fixed with a ring_id parameter
to query the information for a specific ring, but this also requires an
interface to tell the UMD which ring caused the reset.
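
To illustrate the problem, here is a minimal userspace sketch, not the code
from the patch. It contrasts a single "last reset" slot, which a second reset
overwrites, with one slot per ring that can be queried by ring_id. The struct
layout, NUM_RINGS, and all function names are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RINGS 4 /* hypothetical ring count, for illustration only */

/* Hypothetical record of a reset; not the layout used by the patch. */
struct reset_info {
	uint32_t ring_id;
	uint64_t ib_addr; /* address of the guilty IB */
	int valid;
};

/* Single slot: each new reset overwrites the previous record. */
static struct reset_info last_reset;

static void record_reset_single(uint32_t ring, uint64_t ib)
{
	last_reset.ring_id = ring;
	last_reset.ib_addr = ib;
	last_reset.valid = 1;
}

/* One slot per ring: a reset on one ring leaves the others untouched. */
static struct reset_info per_ring[NUM_RINGS];

static void record_reset_per_ring(uint32_t ring, uint64_t ib)
{
	assert(ring < NUM_RINGS);
	per_ring[ring].ring_id = ring;
	per_ring[ring].ib_addr = ib;
	per_ring[ring].valid = 1;
}

/* Returns the record for the given ring, or NULL if none was recorded. */
static const struct reset_info *query_ring(uint32_t ring)
{
	if (ring >= NUM_RINGS || !per_ring[ring].valid)
		return NULL;
	return &per_ring[ring];
}
```

With the single slot, a reset on ring 0 followed by one on ring 1 leaves only
the ring 1 record visible. With per-ring slots both records survive, but the
caller still has to know which ring_id to ask for, which is the missing
interface mentioned above.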

I know that devcoredump is also used for this kind of information, but I believe
that an IOCTL is a better interface between Mesa and Linux than parsing a file
whose contents are subject to change.

André Almeida (1):
  drm/amdgpu: Add interface to dump guilty IB on GPU hang

 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  |  7 ++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 29 ++++++++++++++++++++++++
 include/uapi/drm/amdgpu_drm.h            |  7 ++++++
 7 files changed, 52 insertions(+), 1 deletion(-)

-- 
2.40.1
