Message-ID: <20250408095711.635185-1-jacek.lawrynowicz@linux.intel.com>
Date: Tue,  8 Apr 2025 11:57:11 +0200
From: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
To: linux-kernel@...r.kernel.org
Cc: stable@...r.kernel.org,
	Karol Wachowski <karol.wachowski@...el.com>,
	Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
Subject: [PATCH] accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW

From: Karol Wachowski <karol.wachowski@...el.com>

commit dad945c27a42dfadddff1049cf5ae417209a8996 upstream.

Trigger recovery of the NPU upon receiving HW context violation from
the firmware. The context violation error is a fatal error that prevents
any subsequent jobs from being executed. Without this fix it is
necessary to reload the driver to restore the NPU operational state.

This is a simplified version of the upstream commit, as the full
implementation would require all engine reset/resume logic to be backported.

Signed-off-by: Karol Wachowski <karol.wachowski@...el.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@...ux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-13-maciej.falkowski@linux.intel.com
Fixes: 0adff3b0ef12 ("accel/ivpu: Share NPU busy time in sysfs")
Cc: <stable@...r.kernel.org> # v6.11+
---
 drivers/accel/ivpu/ivpu_job.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index be2e2bf0f43f0..70b3676974407 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -482,6 +482,8 @@ static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *
 	return job;
 }
 
+#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
 	struct ivpu_job *job;
@@ -490,6 +492,9 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32
 	if (!job)
 		return -ENOENT;
 
+	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW)
+		ivpu_pm_trigger_recovery(vdev, "HW context violation");
+
 	if (job->file_priv->has_mmu_faults)
 		job_status = DRM_IVPU_JOB_STATUS_ABORTED;
 
-- 
2.45.1
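
The status check added by this patch can be tried outside the kernel with a small userspace sketch. This is not driver code: `struct mock_device`, `mock_trigger_recovery()` and `mock_handle_job_status()` are hypothetical stand-ins for `struct ivpu_device`, `ivpu_pm_trigger_recovery()` and the relevant part of `ivpu_job_signal_and_destroy()`. Only the status value `0xEU` and the reason string are taken from the diff above.

```c
#include <stddef.h>
#include <stdint.h>

/* Status value as defined in the patch above. */
#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU

/* Hypothetical stand-in for struct ivpu_device: it only records
 * how many times recovery was requested and the last reason given. */
struct mock_device {
	int recovery_requests;
	const char *last_reason;
};

/* Hypothetical stand-in for ivpu_pm_trigger_recovery(). The real
 * function schedules NPU recovery; this one only records the call. */
static void mock_trigger_recovery(struct mock_device *dev, const char *reason)
{
	dev->recovery_requests++;
	dev->last_reason = reason;
}

/* Mirrors only the check added by the patch: a HW context violation
 * reported by the firmware requests recovery, any other status does not. */
static void mock_handle_job_status(struct mock_device *dev, uint32_t job_status)
{
	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW)
		mock_trigger_recovery(dev, "HW context violation");
}
```

In this sketch, any status other than `0xEU` leaves the mock device untouched, matching the patch, where all other statuses continue through the existing completion path unchanged.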

