Message-ID: <8d96c75d-e8fb-446b-a85c-803a2b5212ed@linux.intel.com>
Date: Thu, 10 Apr 2025 09:49:37 +0200
From: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
To: linux-kernel@...r.kernel.org,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: stable@...r.kernel.org, Karol Wachowski <karol.wachowski@...el.com>
Subject: Re: [PATCH] accel/ivpu: Add handling of
 VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW

Hi,

This is an important patch for the Intel NPU.
Is anything missing that is needed for it to be included in stable?

Regards,
Jacek

On 4/8/2025 11:57 AM, Jacek Lawrynowicz wrote:
> From: Karol Wachowski <karol.wachowski@...el.com>
> 
> commit dad945c27a42dfadddff1049cf5ae417209a8996 upstream.
> 
> Trigger recovery of the NPU upon receiving HW context violation from
> the firmware. The context violation error is a fatal error that prevents
> any subsequent jobs from being executed. Without this fix it is
> necessary to reload the driver to restore the NPU operational state.
> 
> This is a simplified version of the upstream commit, as the full
> implementation would require all engine reset/resume logic to be backported.
> 
> Signed-off-by: Karol Wachowski <karol.wachowski@...el.com>
> Signed-off-by: Maciej Falkowski <maciej.falkowski@...ux.intel.com>
> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-13-maciej.falkowski@linux.intel.com
> Fixes: 0adff3b0ef12 ("accel/ivpu: Share NPU busy time in sysfs")
> Cc: <stable@...r.kernel.org> # v6.11+
> ---
>  drivers/accel/ivpu/ivpu_job.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
> index be2e2bf0f43f0..70b3676974407 100644
> --- a/drivers/accel/ivpu/ivpu_job.c
> +++ b/drivers/accel/ivpu/ivpu_job.c
> @@ -482,6 +482,8 @@ static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *
>  	return job;
>  }
>  
> +#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU
> +
>  static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
>  {
>  	struct ivpu_job *job;
> @@ -490,6 +492,9 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32
>  	if (!job)
>  		return -ENOENT;
>  
> +	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW)
> +		ivpu_pm_trigger_recovery(vdev, "HW context violation");
> +
>  	if (job->file_priv->has_mmu_faults)
>  		job_status = DRM_IVPU_JOB_STATUS_ABORTED;
>  
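
For readers without the driver tree at hand, the control flow the patch adds can be sketched in user space. This is only an illustration under stated assumptions: `struct mock_device`, `mock_trigger_recovery()` and `mock_handle_job_status()` are hypothetical stand-ins and not part of the ivpu driver. Only the status value 0xEU and the reason string are taken from the diff above. The real `ivpu_pm_trigger_recovery()` schedules an actual NPU recovery, while the mock merely records that one was requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status value as defined in the patch above. */
#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU

/* Hypothetical stand-in for struct ivpu_device (assumption, not driver code). */
struct mock_device {
	unsigned int recovery_requests;	/* how many recoveries were requested */
	const char *last_reason;	/* reason passed with the last request */
};

/* Hypothetical stand-in for ivpu_pm_trigger_recovery(): only records the call. */
static void mock_trigger_recovery(struct mock_device *dev, const char *reason)
{
	dev->recovery_requests++;
	dev->last_reason = reason;
}

/*
 * Mirrors the check added to ivpu_job_signal_and_destroy(): a HW context
 * violation reported by the firmware requests recovery, any other status
 * does not. Returns true if recovery was requested.
 */
static bool mock_handle_job_status(struct mock_device *dev, uint32_t job_status)
{
	assert(dev != NULL);

	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW) {
		mock_trigger_recovery(dev, "HW context violation");
		return true;
	}
	return false;
}
```

Calling `mock_handle_job_status()` with any status other than 0xE leaves the mock device untouched, which matches the patch: job completion handling is unchanged except for the fatal context violation case.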
