Message-ID: <aSCT2AuwZeiuP7N9@sultan-box>
Date: Fri, 21 Nov 2025 08:31:20 -0800
From: Sultan Alsawaf <sultan@...neltoast.com>
To: Mario Limonciello <mario.limonciello@....com>
Cc: "Du, Bin" <bin.du@....com>, mchehab@...nel.org, hverkuil@...all.nl,
	laurent.pinchart+renesas@...asonboard.com,
	bryan.odonoghue@...aro.org, sakari.ailus@...ux.intel.com,
	prabhakar.mahadev-lad.rj@...renesas.com,
	linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
	pratap.nirujogi@....com, benjamin.chan@....com, king.li@....com,
	gjorgji.rosikopulos@....com, Phil.Jawich@....com,
	Dominic.Antony@....com, richard.gong@....com, anson.tsao@....com
Subject: Re: [PATCH v5 0/7] Add AMD ISP4 driver

On Fri, Nov 21, 2025 at 09:46:41AM -0600, Mario Limonciello wrote:
> 
> 
> On 11/21/2025 9:39 AM, Sultan Alsawaf wrote:
> > On Fri, Nov 21, 2025 at 08:32:34AM -0600, Mario Limonciello wrote:
> > > 
> > > 
> > > On 11/21/2025 2:20 AM, Sultan Alsawaf wrote:
> > > > On Wed, Nov 19, 2025 at 06:14:17PM +0800, Du, Bin wrote:
> > > > > 
> > > > > 
> > > > > On 11/18/2025 3:35 PM, Sultan Alsawaf wrote:
> > > > > > On Wed, Nov 12, 2025 at 06:21:33PM +0800, Du, Bin wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 11/12/2025 3:38 PM, Sultan Alsawaf wrote:
> > > > > > > > On Tue, Nov 11, 2025 at 11:06:41PM -0800, Sultan Alsawaf wrote:
> > > > > > > > > On Wed, Nov 12, 2025 at 02:32:51PM +0800, Du, Bin wrote:
> > > > > > > > > > Thanks Sultan for your information.
> > > > > > > > > > 
> > > > > > > > > > On 11/12/2025 9:21 AM, Sultan Alsawaf wrote:
> > > > > > > > > > > On Tue, Nov 11, 2025 at 03:33:42PM -0800, Sultan Alsawaf wrote:
> > > > > > > > > > > > On Tue, Nov 11, 2025 at 05:58:10PM +0800, Du, Bin wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On 11/11/2025 5:05 PM, Sultan Alsawaf wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Mon, Nov 10, 2025 at 05:46:28PM +0800, Du, Bin wrote:
> > > > > > > > > > > > > > > Thank you, Sultan, for your time, big effort, and constant support.
> > > > > > > > > > > > > > > Apologies for my delayed reply for being occupied a little with other
> > > > > > > > > > > > > > > matters.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > On 11/10/2025 4:33 PM, Sultan Alsawaf wrote:
> > > > > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On Wed, Nov 05, 2025 at 01:25:58AM -0800, Sultan Alsawaf wrote:
> > > > > > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > To expedite review, I've attached a patch containing a bunch of fixes I've made
> > > > > > > > > > > > > > > > > on top of v5. Most of my changes should be self-explanatory; feel free to ask
> > > > > > > > > > > > > > > > > further about specific changes if you have any questions.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I haven't had a chance to review all of the v4 -> v5 changes yet, but I figured
> > > > > > > > > > > > > > > > > I should send what I've got so far.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > FYI, there is a regression in isp4if_dequeue_buffer() where the bufq lock isn't
> > > > > > > > > > > > > > > > > protecting the list_del() anymore. I also checked the compiler output when using
> > > > > > > > > > > > > > > > > guard() versus scoped_guard() in that function and there is no difference:
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > >          sha1sum:
> > > > > > > > > > > > > > > > >          5652a0306da22ea741d80a9e03a787d0f71758a8  isp4_interface.o // guard()
> > > > > > > > > > > > > > > > >          5652a0306da22ea741d80a9e03a787d0f71758a8  isp4_interface.o // scoped_guard()
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > So guard() should be used there again, which I've done in my patch.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I also have a few questions:
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 1. Does ISP FW provide a register interface to disable the IRQ? If so, it is
> > > > > > > > > > > > > > > > >           faster to use that than using disable_irq_nosync() to disable the IRQ from
> > > > > > > > > > > > > > > > >           the CPU's side.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 2. When the IRQ is re-enabled in isp4sd_fw_resp_func(), is there anything
> > > > > > > > > > > > > > > > >           beforehand to mask all pending interrupts from the ISP so that there isn't a
> > > > > > > > > > > > > > > > > >           bunch of stale interrupts firing as soon as the IRQ is re-enabled?
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 3. Is there any risk of deadlock due to the stream kthread racing with the ISP
> > > > > > > > > > > > > > > > >           when the ISP posts a new response _after_ the kthread determines there are no
> > > > > > > > > > > > > > > > >           more new responses but _before_ the enable_irq() in isp4sd_fw_resp_func()?
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 4. Why are some lines much longer than before? It seems inconsistent that now
> > > > > > > > > > > > > > > > >           there is a mix of several lines wrapped to 80 cols and many lines going
> > > > > > > > > > > > > > > > >           beyond 80 cols. And there are multiple places where code is wrapped before
> > > > > > > > > > > > > > > > >           reaching 80 cols even with lots of room left, specifically for cases where it
> > > > > > > > > > > > > > > > >           wouldn't hurt readability to put more characters onto each line.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > I've attached a new, complete patch of changes to apply on top of v5. You may
> > > > > > > > > > > > > > > > ignore the incomplete patch from my previous email and use the new one instead.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > I made many changes and also answered questions 1-3 myself.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Please apply this on top of v5 and let me know if you have any questions.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Sure, will review, apply and test your patch accordingly. Your contribution
> > > > > > > > > > > > > > > is greatly appreciated, will let you know if there is any question or
> > > > > > > > > > > > > > > problem.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > BTW, I noticed a strange regression in v5 even without any of my changes: every
> > > > > > > > > > > > > > > > time you use cheese after using it one time, the video will freeze after 30-60
> > > > > > > > > > > > > > > > seconds with this message printed to dmesg:
> > > > > > > > > > > > > > > >          [ 2032.716559] amd_isp_capture amd_isp_capture: -><- fail respid unknown respid(0x30002)
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > And the video never unfreezes. I haven't found the cause for the regression yet;
> > > > > > > > > > > > > > > > can you try to reproduce it?
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Really weird, we don't see this issue either in dev or QA test. Is it 100%
> > > > > > > > > > > > > > > reproducible and any other fail or err in the log?
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Yes, it's 100% reproducible. There's no other message in dmesg, only that one.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Sometimes there is a stop stream error when I close cheese after it froze:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > >         [  656.540307] amd_isp_capture amd_isp_capture: fail to disable stream
> > > > > > > > > > > > > >         [  657.046633] amd_isp_capture amd_isp_capture: fail to stop steam
> > > > > > > > > > > > > >         [  657.047224] amd_isp_capture amd_isp_capture: disabling streaming failed (-110)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > When I revert to v4 I cannot reproduce it at all. It seems to be something in
> > > > > > > > > > > > > > v4 -> v5 that is not fixed by any of my changes.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hi Sultan, could you please try following modifications on top of v5 to see
> > > > > > > > > > > > > if it helps?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > diff --git a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > > b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > > index 39c2265121f9..d571b3873edb 100644
> > > > > > > > > > > > > --- a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > > +++ b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > > @@ -97,7 +97,7 @@
> > > > > > > > > > > > > 
> > > > > > > > > > > > > #define ADDR_SPACE_TYPE_GPU_VA          4
> > > > > > > > > > > > > 
> > > > > > > > > > > > > -#define FW_MEMORY_POOL_SIZE             (200 * 1024 * 1024)
> > > > > > > > > > > > > +#define FW_MEMORY_POOL_SIZE             (100 * 1024 * 1024)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > /*
> > > > > > > > > > > > >        * standard ISP mipicsi=>isp
> > > > > > > > > > > > > diff --git a/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > > b/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > > index 248d10076ae8..acbc80aa709e 100644
> > > > > > > > > > > > > --- a/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > > +++ b/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > > @@ -697,7 +697,7 @@ static int isp4sd_stop_resp_proc_threads(struct
> > > > > > > > > > > > > isp4_subdev *isp_subdev)
> > > > > > > > > > > > >              return 0;
> > > > > > > > > > > > > }
> > > > > > > > > > > > > 
> > > > > > > > > > > > > -static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
> > > > > > > > > > > > > +static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev, bool
> > > > > > > > > > > > > irq_enabled)
> > > > > > > > > > > > > {
> > > > > > > > > > > > >              struct isp4sd_sensor_info *sensor_info = &isp_subdev->sensor_info;
> > > > > > > > > > > > >              unsigned int perf_state = ISP4SD_PERFORMANCE_STATE_LOW;
> > > > > > > > > > > > > @@ -716,8 +716,9 @@ static int isp4sd_pwroff_and_deinit(struct isp4_subdev
> > > > > > > > > > > > > *isp_subdev)
> > > > > > > > > > > > >                      return 0;
> > > > > > > > > > > > >              }
> > > > > > > > > > > > > 
> > > > > > > > > > > > > -       for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
> > > > > > > > > > > > > -               disable_irq(isp_subdev->irq[i]);
> > > > > > > > > > > > > +       if (irq_enabled)
> > > > > > > > > > > > > +               for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
> > > > > > > > > > > > > +                       disable_irq(isp_subdev->irq[i]);
> > > > > > > > > > > > > 
> > > > > > > > > > > > >              isp4sd_stop_resp_proc_threads(isp_subdev);
> > > > > > > > > > > > >              dev_dbg(dev, "isp_subdev stop resp proc streads suc");
> > > > > > > > > > > > > @@ -813,7 +814,7 @@ static int isp4sd_pwron_and_init(struct isp4_subdev
> > > > > > > > > > > > > *isp_subdev)
> > > > > > > > > > > > > 
> > > > > > > > > > > > >              return 0;
> > > > > > > > > > > > > err_unlock_and_close:
> > > > > > > > > > > > > -       isp4sd_pwroff_and_deinit(isp_subdev);
> > > > > > > > > > > > > +       isp4sd_pwroff_and_deinit(isp_subdev, false);
> > > > > > > > > > > > >              return -EINVAL;
> > > > > > > > > > > > > }
> > > > > > > > > > > > > 
> > > > > > > > > > > > > @@ -985,7 +986,7 @@ static int isp4sd_set_power(struct v4l2_subdev *sd, int
> > > > > > > > > > > > > on)
> > > > > > > > > > > > >              if (on)
> > > > > > > > > > > > >                      return isp4sd_pwron_and_init(isp_subdev);
> > > > > > > > > > > > >              else
> > > > > > > > > > > > > -               return isp4sd_pwroff_and_deinit(isp_subdev);
> > > > > > > > > > > > > +               return isp4sd_pwroff_and_deinit(isp_subdev, true);
> > > > > > > > > > > > > }
> > > > > > > > > > > > > 
> > > > > > > > > > > > > static const struct v4l2_subdev_core_ops isp4sd_core_ops = {
> > > > > > > > > > > > 
> > > > > > > > > > > > No difference sadly; same symptoms as before. FYI, your email client broke the
> > > > > > > > > > > > patch formatting so I couldn't apply it; it hard wrapped some lines to 80 cols,
> > > > > > > > > > > > replaced tabs with spaces, and removed leading spaces on each context line, so I
> > > > > > > > > > > > had to apply it manually. To confirm I applied it correctly, here is my diff:
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > index 39c2265121f9..d571b3873edb 100644
> > > > > > > > > > > > --- a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > +++ b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
> > > > > > > > > > > > @@ -97,7 +97,7 @@
> > > > > > > > > > > >       #define ADDR_SPACE_TYPE_GPU_VA          4
> > > > > > > > > > > > -#define FW_MEMORY_POOL_SIZE             (200 * 1024 * 1024)
> > > > > > > > > > > > +#define FW_MEMORY_POOL_SIZE             (100 * 1024 * 1024)
> > > > > > > > > > > >       /*
> > > > > > > > > > > >        * standard ISP mipicsi=>isp
> > > > > > > > > > > > diff --git a/drivers/media/platform/amd/isp4/isp4_subdev.c b/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > index 4bd2ebf0f694..500ef0af8a41 100644
> > > > > > > > > > > > --- a/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > +++ b/drivers/media/platform/amd/isp4/isp4_subdev.c
> > > > > > > > > > > > @@ -669,7 +669,7 @@ static int isp4sd_stop_resp_proc_threads(struct isp4_subdev *isp_subdev)
> > > > > > > > > > > >       	return 0;
> > > > > > > > > > > >       }
> > > > > > > > > > > > -static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
> > > > > > > > > > > > +static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev, bool irq_enabled)
> > > > > > > > > > > >       {
> > > > > > > > > > > >       	struct isp4sd_sensor_info *sensor_info = &isp_subdev->sensor_info;
> > > > > > > > > > > >       	unsigned int perf_state = ISP4SD_PERFORMANCE_STATE_LOW;
> > > > > > > > > > > > @@ -688,8 +688,9 @@ static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
> > > > > > > > > > > >       		return 0;
> > > > > > > > > > > >       	}
> > > > > > > > > > > > -	for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
> > > > > > > > > > > > -		disable_irq(isp_subdev->irq[i]);
> > > > > > > > > > > > +	if (irq_enabled)
> > > > > > > > > > > > +		for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
> > > > > > > > > > > > +			disable_irq(isp_subdev->irq[i]);
> > > > > > > > > > > >       	isp4sd_stop_resp_proc_threads(isp_subdev);
> > > > > > > > > > > >       	dev_dbg(dev, "isp_subdev stop resp proc streads suc");
> > > > > > > > > > > > @@ -785,7 +786,7 @@ static int isp4sd_pwron_and_init(struct isp4_subdev *isp_subdev)
> > > > > > > > > > > >       	return 0;
> > > > > > > > > > > >       err_unlock_and_close:
> > > > > > > > > > > > -	isp4sd_pwroff_and_deinit(isp_subdev);
> > > > > > > > > > > > +	isp4sd_pwroff_and_deinit(isp_subdev, false);
> > > > > > > > > > > >       	return -EINVAL;
> > > > > > > > > > > >       }
> > > > > > > > > > > > @@ -957,7 +958,7 @@ static int isp4sd_set_power(struct v4l2_subdev *sd, int on)
> > > > > > > > > > > >       	if (on)
> > > > > > > > > > > >       		return isp4sd_pwron_and_init(isp_subdev);
> > > > > > > > > > > >       	else
> > > > > > > > > > > > -		return isp4sd_pwroff_and_deinit(isp_subdev);
> > > > > > > > > > > > +		return isp4sd_pwroff_and_deinit(isp_subdev, true);
> > > > > > > > > > > >       }
> > > > > > > > > > > >       static const struct v4l2_subdev_core_ops isp4sd_core_ops = {
> > > > > > > > > > > > 
> > > > > > > > > > > > > On the other hand, please also add the patch in amdgpu which sets *bo to
> > > > > > > > > > > > > NULL in isp_kernel_buffer_alloc() which you mentioned in another thread.
> > > > > > > > > > > > 
> > > > > > > > > > > > Yep, I have been doing all v5 testing with that patch applied.
> > > > > > > > > > > > 
> > > > > > > > > > > > BTW, I have verified the IRQ changes are not the cause for the regression; I
> > > > > > > > > > > > tested with IRQs kept enabled all the time and the issue still occurs.
> > > > > > > > > > > > 
> > > > > > > > > > > > I did observe that ISP stops sending interrupts when the video stream freezes.
> > > > > > > > > > > > And, if I replicate the bug enough times, it seems to permanently break FW until
> > > > > > > > > > > > a full machine reboot. Reloading amd_capture with the v4 driver doesn't recover
> > > > > > > > > > > > the ISP when this happens.
> > > > > > > > > > > > 
> > > > > > > > > > > > As an improvement to the driver, can we do a hard reset of ISP on driver probe?
> > > > > > > > > > > > I am assuming hardware doesn't need too long to settle after hard reset, maybe
> > > > > > > > > > > > a few hundred milliseconds? This would ensure ISP FW is always in a working
> > > > > > > > > > > > state when the driver is loaded.
> > > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Actually, each time the camera is activated, the ISP driver powers on the
> > > > > > > > > > ISP and initiates its firmware from the beginning; when the camera is
> > > > > > > > > > closed, the ISP is powered off.
> > > > > > > > > 
> > > > > > > > > Hmm, well I am able to put the ISP in a completely unusable state that doesn't
> > > > > > > > > recover when I unload and reload amd_capture. Or maybe it is the sensor that is
> > > > > > > > > stuck in a broken state?
> > > > > > > > 
> > > > > > > > Here is the log output when I try to start cheese after unloading and reloading
> > > > > > > > amd_capture, where the ISP is in the broken state that requires rebooting the
> > > > > > > > laptop (annotated with notes of what I saw/did at each point in time):
> > > > > > > > 
> > > > > > > > ==> opened cheese
> > > > > > > > ==> cheese froze after a few seconds
> > > > > > > > ==> closed cheese
> > > > > > > >       [   34.570823] amd_isp_capture amd_isp_capture: fail to disable stream
> > > > > > > >       [   35.077503] amd_isp_capture amd_isp_capture: fail to stop steam
> > > > > > > >       [   35.077525] amd_isp_capture amd_isp_capture: disabling streaming failed (-110)
> > > > > > > > ==> unloaded amd_capture
> > > > > > > > ==> loaded amd_capture
> > > > > > > > ==> opened cheese
> > > > > > > >       [  308.039721] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
> > > > > > > >       [  308.043961] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
> > > > > > > >       [  308.044188] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044194] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044196] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044197] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044198] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044198] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044199] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044200] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044200] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044201] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.044202] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.065226] amd_isp_capture amd_isp_capture: power up isp fail -22
> > > > > > > >       [  308.174814] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
> > > > > > > >       [  308.177807] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
> > > > > > > >       [  308.178036] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178043] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178044] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178045] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178046] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178047] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178048] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178048] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178049] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178050] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.178050] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.198776] amd_isp_capture amd_isp_capture: power up isp fail -22
> > > > > > > >       [  308.306835] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
> > > > > > > >       [  308.311533] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
> > > > > > > >       [  308.311723] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311723] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311724] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311725] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311725] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
> > > > > > > >       [  308.335281] amd_isp_capture amd_isp_capture: power up isp fail -22
> > > > > > > > 
> > > > > > > 
> > > > > > > Thanks, Sultan, for the detailed info. I agree with you and tend to believe
> > > > > > > it may be related to the sensor. I will follow up with the FW team to
> > > > > > > investigate further.
> > > > > > > 
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Sultan
> > > > > > > > > > > 
> > > > > > > > > > > A small update: I reproduced the issue on v4, but it took several more cycles of
> > > > > > > > > > > closing/opening cheese and waiting 30s compared to v5.
> > > > > > > > > > > 
> > > > > > > > > > > Right now my best guess is that this is a timing issue with respect to FW that
> > > > > > > > > > > was exposed by the v5 changes. v5 introduced slight changes in code timing, like
> > > > > > > > > > > with the mutex locks getting replaced by spin locks.
> > > > > > > > > > > 
> > > > > > > > > > > I'll try to insert mdelays to see if I can expose the issue that way on v4.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Could you kindly provide the FW used?
> > > > > > > > > 
> > > > > > > > >       commit a89515d3ff79f12099f7a51b0f4932b881b7720e
> > > > > > > > >       Author: Pratap Nirujogi <pratap.nirujogi@....com>
> > > > > > > > >       Date:   Thu Aug 21 15:40:45 2025 -0400
> > > > > > > > > 
> > > > > > > > >           amdgpu: Update ISP FW for isp v4.1.1
> > > > > > > > >           From internal git commit:
> > > > > > > > >           24557b7326e539183b3bc44cf8e1496c74d383d6
> > > > > > > > >           Signed-off-by: Pratap Nirujogi <pratap.nirujogi@....com>
> > > > > > > > > 
> > > > > > > > > Sultan
> > > > > > > > 
> > > > > > > > Sultan
> > > > > > > 
> > > > > > > -- 
> > > > > > > Regards,
> > > > > > > Bin
> > > > > > > 
> > > > > > 
> > > > > > Thanks, Bin. I looked deeper at the code and didn't find any reason the issue
> > > > > > could be due to the driver. Also, the problem happens outside of cheese for me
> > > > > > (like in Chromium with Zoom meetings), so v5 of the driver is pretty much
> > > > > > unusable for me until this issue is fixed. :(
> > > > > > 
> > > > > 
> > > > > > Oh, really sad to hear that :( There must be some difference between our
> > > > > > platforms because we still can't reproduce the issue you mentioned. To help
> > > > > > with this, would you share more info like your Ubuntu version, kernel
> > > > > > version/source, ISP driver version, BIOS version, and the .config used to
> > > > > > build the kernel? FW: commit a89515d3ff79f12099f7a51b0f4932b881b7720e.
> > > > > > Also, if possible, could you provide the kernel image as well so we can
> > > > > > test on it directly? And the HW is not broken, right?
> > > > 
> > > > I figured out why you cannot reproduce the issue. You need to pass amd_iommu=off
> > > > on the kernel command line to trigger the issue.
> > > > 
> > > > The reason I am using amd_iommu=off is because this laptop otherwise doesn't
> > > > ever wake from suspend under Linux once it reaches the S0i3 state. The keyboard,
> > > > power button, and lid do not respond to wake up the laptop from suspend. This
> > > > happens 100% of the time once S0i3 is reached, and occurs on the OEM Ubuntu
> > > > image from HP as well. The only fix I have found is to pass amd_iommu=off, which
> > > > other owners of this laptop also found fixes the issue.
> > > 
> > > You're the first report I've heard of this.
> > > 
> > > Are you using storage encryption or a storage password of any kind (software
> > > or hardware) by chance?
> > > 
> > > If you are can you please describe it?
> > > 
> > > Also can you generate a report using amd-s2idle?  I don't think it's going
> > > to flag anything but I would like to confirm.
> > 
> > The issue is mentioned on a Reddit post [1]. A specific comment mentions
> > amd_iommu=off fixing the issue [2], which is where I got the idea to do that.
> > 
> > You will find conflicting reports about this issue online, with some people
> > saying it doesn't happen anymore after some kernel update [3], and others saying
> > it worked until something updated [4].
> > 
> > The reason for all the conflicting reports online is because this issue only
> > occurs when S0i3 is reached, and I suspect that no one realized there's a delay
> > before S0i3 is entered. Reaching S0i3 appears to take at least 60 seconds
> > *after* suspending the laptop. If S0i3 isn't entered, then you *can* wake the
> > laptop but there will always be this message indicating S0i3 wasn't hit:
> >    [  107.428836] amd_pmc AMDI000B:00: Last suspend didn't reach deepest state
> > 
> 
> It shouldn't take 60 seconds to enter s0i3.  It should be ~5 seconds. So are
> you saying that if you have IOMMU enabled and interrupt the suspend around
> 20 seconds later you get that you didn't reach deepest sleep state, and if
> you wait longer it hangs?
> 
> > I am using LUKS1 encryption on my storage (software encryption). However, I'm
> > not sure any of my configuration info is relevant because I reproduced the issue
> > from a live USB using HP's OEM Ubuntu image [5], with nothing else physically
> > plugged into the laptop and not connected to anything over WiFi or Bluetooth.
> 
> Is the SSD a SED?  Anything for storage password set in BIOS?
> 
> > 
> > I had the thought of generating a report using amd-s2idle a couple months ago...
> > except I have no way to wake the machine from suspend at all. I have to hold the
> > power button to do a hard shutdown. I tried using no_console_suspend but of
> > course userspace processes are frozen so systemd couldn't record anything for
> > me. I tried UART over USB and connected the output to another laptop but it
> > would only work for a few seconds right after booting up the laptop (could've
> > just been because I was using PL2303 serial converters, which aren't so great).
> > 
> > I have also tried several different combinations of settings toggled on/off in
> > the BIOS setup menu, as well as resetting to the factory default values, without
> > any change in behavior.
> > 
> > I'm at a loss on how I can retrieve some debug info for this issue. :/
> 
> You are on the latest BIOS presumably, right?
> 
> If you schedule a suspend with amd-s2idle for ~10 seconds, does it reproduce
> too?

Oh my God, I ran `amd-s2idle test` and got this:

  ❌ IOMMU is misconfigured: missing MSFT0201 ACPI device
  [...]
  🚫 Your system does not meet s2idle prerequisites!
  🗣 Explanations for your system
  🚦 Device MSFT0201 missing from ACPI tables
  The ACPI device MSFT0201 is required for suspend to work when the IOMMU is enabled. Please check your BIOS settings and if configured correctly, report a bug to your system vendor.
  For more information on this failure see:https://gitlab.freedesktop.org/drm/amd/-/issues/3738#note_2667140

So then I reenabled Pluton in the BIOS and waking from suspend works now!!!

This had slipped past my test with BIOS settings reset to factory defaults
because the BIOS has a separate button to reset *security settings* to factory
defaults. And Pluton is one of those security settings.
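
For anyone who wants to check for this condition by hand, here is a minimal
sketch in C. It assumes ACPI devices are listed under /sys/bus/acpi/devices
with names of the form <HID>:<instance> (e.g. MSFT0201:00). The helper names
acpi_name_matches_hid() and acpi_device_present() are hypothetical, and this
is not taken from amd-s2idle:

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a sysfs entry name such as "MSFT0201:00" belongs to the
 * given ACPI hardware ID. Assumes the "<HID>:<instance>" naming scheme.
 */
static bool acpi_name_matches_hid(const char *name, const char *hid)
{
	size_t len;

	assert(name && hid);
	len = strlen(hid);

	/* The HID must be followed by ':' so "MSFT02010:00" doesn't match. */
	return len > 0 && strncmp(name, hid, len) == 0 && name[len] == ':';
}

/*
 * Scan a directory (assumed to be /sys/bus/acpi/devices in practice) for an
 * entry with the given hardware ID.
 *
 * Returns 1 if found, 0 if not found, -1 if the directory can't be opened.
 */
static int acpi_device_present(const char *dir_path, const char *hid)
{
	struct dirent *ent;
	int found = 0;
	DIR *dir;

	dir = opendir(dir_path);
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		if (acpi_name_matches_hid(ent->d_name, hid)) {
			found = 1;
			break;
		}
	}

	closedir(dir);
	return found;
}
```

Under that assumption, acpi_device_present("/sys/bus/acpi/devices", "MSFT0201")
should return 0 on a machine in the state amd-s2idle complained about above,
and 1 once the device shows up in the ACPI tables again.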

When I had Pluton disabled, it always took at least 60 seconds to enter S0i3,
measured on a stopwatch. Now S0i3 entry takes much less time as you say.

Well, that fixes a bunch of struggles I had with this laptop. :) Thank you!

Also, since I have your attention on S0i3, there is always this warning splat
printed on resume from S0i3, both with and without IOMMU enabled:

  [  366.694362] ------------[ cut here ]------------
  [  366.694367] amdgpu 0000:c3:00.0: SMU uninitialized but power ungate requested for 16!
  [  366.694427] WARNING: CPU: 12 PID: 3122 at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:398 smu_dpm_set_power_gate+0x1d7/0x1f0 [amdgpu]
  [  366.694640] Modules linked in: ccm hid_sensor_gyro_3d hid_sensor_prox hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach joydev snd_acp_mach mousedev intel_rapl_msr snd_soc_nau8821 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_acp3x_rn amdgpu snd_acp70 snd_ctl_led snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_sof_pci snd_hda_codec_realtek_lib snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation mt7925e soundwire_bus
  [  366.694715]  snd_hda_codec_atihdmi mt7925_common snd_soc_sdca snd_hda_codec_hdmi mt792x_lib snd_soc_core mt76_connac_lib snd_compress drm_panel_backlight_quirks amdxcp btusb ac97_bus drm_buddy snd_hda_intel mt76 snd_pcm_dmaengine btrtl drm_exec snd_rpl_pci_acp6x drm_suballoc_helper snd_hda_codec btintel drm_ttm_helper btbcm mac80211 snd_hda_scodec_cs35l56_i2c snd_acp_pci snd_hda_core ttm btmtk ucsi_acpi snd_hda_scodec_cs35l56 snd_amd_acpi_mach libarc4 snd_intel_dspcfg snd_hda_cirrus_scodec i2c_algo_bit typec_ucsi snd_acp_legacy_common spd5118 snd_intel_sdw_acpi bluetooth drm_display_helper snd_soc_cs35l56_shared snd_pci_acp6x snd_hwdep snd_soc_cs_amp_lib typec hp_wmi cfg80211 cs_dsp cec kvm_amd snd_pci_acp5x snd_pcm hid_multitouch ecdh_generic roles sp5100_tco sparse_keymap wmi_bmof amd_pmf kvm snd_timer snd_rn_pci_acp3x i2c_hid_acpi snd_acp_config video amdtee serial_multi_instantiate i2c_hid irqbypass i2c_piix4 snd snd_soc_acpi amdxdna snd_pci_acp3x soundcore amd_sfh platform_profile wmi i2c_smbus rfkill
  [  366.694807]  wireless_hotkey thunderbolt amd_pmc gpu_sched rapl mac_hid i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core serio_raw nvme_keyring ccp nvme_auth
  [  366.694840] CPU: 12 UID: 0 PID: 3122 Comm: kworker/u129:47 Tainted: G        W           6.17.7 #1 PREEMPT 
  [  366.694846] Tainted: [W]=WARN
  [  366.694848] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.02 06/18/2025
  [  366.694852] Workqueue: async async_run_entry_fn
  [  366.694867] RIP: 0010:smu_dpm_set_power_gate+0x1d7/0x1f0 [amdgpu]
  [  366.694974] Code: 85 ed 75 03 48 8b 2f 89 74 24 04 e8 f3 85 da cb 44 8b 44 24 04 48 89 d9 48 89 ea 48 89 c6 48 c7 c7 48 80 fc c1 e8 c9 0d 63 cb <0f> 0b b8 a1 ff ff ff e9 a1 fe ff ff e9 3b b3 3b 00 e9 36 b3 3b 00
  [  366.694977] RSP: 0018:ffff8fad27387ce8 EFLAGS: 00010246
  [  366.694981] RAX: 0000000000000000 RBX: ffffffffc2006846 RCX: 0000000000000027
  [  366.694984] RDX: ffff8fcbde51abc8 RSI: 0000000000000001 RDI: ffff8fcbde51abc0
  [  366.694985] RBP: ffff8fad016afc80 R08: 0000000000000000 R09: 00000000ffffdfff
  [  366.694986] R10: ffffffff8e6d5da0 R11: ffff8fad27387b88 R12: ffff8fad25a80000
  [  366.694987] R13: ffff8fad25a96680 R14: 0000000000000001 R15: ffffffffc1e7ce80
  [  366.694989] FS:  0000000000000000(0000) GS:ffff8fcc4fe73000(0000) knlGS:0000000000000000
  [  366.694990] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  366.694992] CR2: 00007f58f6956408 CR3: 0000000116015000 CR4: 0000000000f50ef0
  [  366.694993] PKRU: 55555554
  [  366.694995] Call Trace:
  [  366.695003]  <TASK>
  [  366.695007]  amdgpu_dpm_set_powergating_by_smu+0xf1/0x110 [amdgpu]
  [  366.695136]  _genpd_power_on+0x83/0x120
  [  366.695149]  genpd_sync_power_on.part.0+0x66/0xc0
  [  366.695154]  genpd_finish_resume+0x6f/0xd0
  [  366.695157]  ? genpd_thaw_noirq+0x10/0x10
  [  366.695159]  dpm_run_callback.isra.0+0x28/0x90
  [  366.695166]  device_resume_noirq+0xc7/0x210
  [  366.695169]  async_resume_noirq+0x1c/0x30
  [  366.695171]  async_run_entry_fn+0x1f/0xa0
  [  366.695175]  process_one_work+0x173/0x270
  [  366.695183]  worker_thread+0x2d7/0x410
  [  366.695188]  ? rescuer_thread+0x4e0/0x4e0
  [  366.695191]  kthread+0xe6/0x1e0
  [  366.695196]  ? kthread_queue_delayed_work+0x80/0x80
  [  366.695199]  ? kthread_queue_delayed_work+0x80/0x80
  [  366.695202]  ret_from_fork+0xf0/0x110
  [  366.695211]  ? kthread_queue_delayed_work+0x80/0x80
  [  366.695214]  ? kthread_queue_delayed_work+0x80/0x80
  [  366.695217]  ret_from_fork_asm+0x11/0x20
  [  366.695224]  </TASK>
  [  366.695225] ---[ end trace 0000000000000000 ]---


> > 
> > [1] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/hp_zbook_ultra_g1a_ubuntu_fully_working_now_or/
> > [2] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/comment/nd4cldp/
> > [3] https://forum.level1techs.com/t/the-ultimate-arch-secureboot-guide-for-ryzen-ai-max-ft-hp-g1a-128gb-8060s-monster-laptop/230652#hibernate-suspend-and-kernel-versions-16
> > [4] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/comment/nd1xbtd/
> > [5] https://ftp.hp.com/pub/softpaq/sp158501-159000/stella-noble-oem-24.04b-20250422-107.iso
> > 
> > Sultan
> 

Sultan
