lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240117031212.1104034-2-nunes.erico@gmail.com>
Date: Wed, 17 Jan 2024 04:12:07 +0100
From: Erico Nunes <nunes.erico@...il.com>
To: Qiang Yu <yuq825@...il.com>,
	dri-devel@...ts.freedesktop.org,
	lima@...ts.freedesktop.org
Cc: anarsoul@...il.com,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	David Airlie <airlied@...il.com>,
	Daniel Vetter <daniel@...ll.ch>,
	Sumit Semwal <sumit.semwal@...aro.org>,
	christian.koenig@....com,
	linux-kernel@...r.kernel.org,
	Erico Nunes <nunes.erico@...il.com>
Subject: [PATCH v1 1/6] drm/lima: fix devfreq refcount imbalance for job timeouts

In case a task manages to complete but it took just long enough to also
trigger the timeout handler, the current code results in a refcount
imbalance on lima_pm_idle.

While this can be a rare occurrence, when it happens it may fill user
logs with stack traces such as:

[10136.669170] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
..
[10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
..
[10136.669628] Call trace:
[10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
[10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
[10136.669656]  lima_gp_irq_handler+0xa8/0x120
[10136.669666]  __handle_irq_event_percpu+0x48/0x160
[10136.669679]  handle_irq_event+0x4c/0xc0

The imbalance happens because lima_sched_pipe_task_done() already calls
lima_pm_idle for this case if there was no error.
Check the error flag in the timeout handler to ensure we can never run
into this case.

Signed-off-by: Erico Nunes <nunes.erico@...il.com>
---
 drivers/gpu/drm/lima/lima_sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index c3bf8cda8498..66317296d831 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -427,7 +427,8 @@ static enum drm_gpu_sched_stat lima_sched_timedout_job(struct drm_sched_job *job
 	pipe->current_vm = NULL;
 	pipe->current_task = NULL;
 
-	lima_pm_idle(ldev);
+	if (pipe->error)
+		lima_pm_idle(ldev);
 
 	drm_sched_resubmit_jobs(&pipe->base);
 	drm_sched_start(&pipe->base, true);
-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ