Message-ID: <20240401212002.1191549-1-nunes.erico@gmail.com>
Date: Mon,  1 Apr 2024 23:20:00 +0200
From: Erico Nunes <nunes.erico@...il.com>
To: Qiang Yu <yuq825@...il.com>,
	anarsoul@...il.com,
	dri-devel@...ts.freedesktop.org,
	lima@...ts.freedesktop.org
Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	David Airlie <airlied@...il.com>,
	Daniel Vetter <daniel@...ll.ch>,
	christian.koenig@....com,
	megi@....cz,
	linux-kernel@...r.kernel.org,
	Erico Nunes <nunes.erico@...il.com>
Subject: [PATCH 0/2] drm/lima: fix devfreq refcount imbalance for job timeouts

This is a followup to https://patchwork.freedesktop.org/series/128856/

Patch 1 of rev1 of that series
https://patchwork.freedesktop.org/patch/574745/?series=128856&rev=1
was dropped because it needed a better solution for a race condition
between the irq and the timeout handler.
The solution proposed in that discussion is to mask the irqs while the
timeout handler runs, which is what this series does.
This bug is very hard to reproduce with regular applications, but I
found that it reproduces reliably with a program that submits many jobs
right at the timeout boundary, so that jobs still manage to complete
while the timeout handler runs.

With this series applied, I was no longer able to reproduce the bug.

At first I had only the pp and gp irqs masked and the problem never
reproduced again on Mali-400, but I still managed to reproduce it on
Mali-450 after hours of test time. After masking the pp bcast irq as
well I was not able to reproduce it anymore even on Mali-450, so I think
that was the missing bit for it.
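
For illustration only, the race and the effect of masking can be sketched
as a small user-space model. This is not driver code: struct fake_gpu,
job_start(), irq_handler(), timeout_handler() and simulate_timeout() are
all hypothetical names, and the assumption that both paths drop the same
busy reference is a simplification of how the imbalance arises.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from the lima driver. */
struct fake_gpu {
	bool irq_masked;	/* true while the completion irq is masked */
	int busy_count;		/* stands in for the devfreq busy refcount */
};

/* Submitting a job takes one busy reference. */
static void job_start(struct fake_gpu *g)
{
	g->busy_count++;
}

/* Completion irq: drops the reference taken at job start, unless masked. */
static void irq_handler(struct fake_gpu *g)
{
	if (g->irq_masked)
		return;
	g->busy_count--;
}

/* Timeout path: optionally mask irqs, then recover and drop the reference. */
static void timeout_handler(struct fake_gpu *g, bool mask_irqs)
{
	if (mask_irqs)
		g->irq_masked = true;

	/* The job completes late, while the timeout handler is still running. */
	irq_handler(g);

	/* Assumption: recovery drops the busy reference for the timed-out job. */
	g->busy_count--;

	/* Assumption: the reset at the end of recovery leaves irqs enabled again. */
	g->irq_masked = false;
}

/* Returns the final busy count; 0 means balanced, negative means imbalance. */
int simulate_timeout(bool mask_irqs)
{
	struct fake_gpu g = { .irq_masked = false, .busy_count = 0 };

	job_start(&g);
	timeout_handler(&g, mask_irqs);
	return g.busy_count;
}
```

In this model, simulate_timeout(false) ends with a negative count because
both paths drop the reference, while simulate_timeout(true) ends balanced
because the late irq is ignored.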

Erico Nunes (2):
  drm/lima: add mask irq callback to gp and pp
  drm/lima: mask irqs in timeout path before hard reset

 drivers/gpu/drm/lima/lima_bcast.c | 12 ++++++++++++
 drivers/gpu/drm/lima/lima_bcast.h |  3 +++
 drivers/gpu/drm/lima/lima_gp.c    |  8 ++++++++
 drivers/gpu/drm/lima/lima_pp.c    | 18 ++++++++++++++++++
 drivers/gpu/drm/lima/lima_sched.c |  9 +++++++++
 drivers/gpu/drm/lima/lima_sched.h |  1 +
 6 files changed, 51 insertions(+)

-- 
2.44.0

