lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ccb0a045-fc00-49bc-9441-09d5a0b9aa51@amd.com>
Date: Fri, 21 Nov 2025 09:46:41 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Sultan Alsawaf <sultan@...neltoast.com>
Cc: "Du, Bin" <bin.du@....com>, mchehab@...nel.org, hverkuil@...all.nl,
 laurent.pinchart+renesas@...asonboard.com, bryan.odonoghue@...aro.org,
 sakari.ailus@...ux.intel.com, prabhakar.mahadev-lad.rj@...renesas.com,
 linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
 pratap.nirujogi@....com, benjamin.chan@....com, king.li@....com,
 gjorgji.rosikopulos@....com, Phil.Jawich@....com, Dominic.Antony@....com,
 richard.gong@....com, anson.tsao@....com
Subject: Re: [PATCH v5 0/7] Add AMD ISP4 driver



On 11/21/2025 9:39 AM, Sultan Alsawaf wrote:
> On Fri, Nov 21, 2025 at 08:32:34AM -0600, Mario Limonciello wrote:
>>
>>
>> On 11/21/2025 2:20 AM, Sultan Alsawaf wrote:
>>> On Wed, Nov 19, 2025 at 06:14:17PM +0800, Du, Bin wrote:
>>>>
>>>>
>>>> On 11/18/2025 3:35 PM, Sultan Alsawaf wrote:
>>>>> On Wed, Nov 12, 2025 at 06:21:33PM +0800, Du, Bin wrote:
>>>>>>
>>>>>>
>>>>>> On 11/12/2025 3:38 PM, Sultan Alsawaf wrote:
>>>>>>> On Tue, Nov 11, 2025 at 11:06:41PM -0800, Sultan Alsawaf wrote:
>>>>>>>> On Wed, Nov 12, 2025 at 02:32:51PM +0800, Du, Bin wrote:
>>>>>>>>> Thanks Sultan for your information.
>>>>>>>>>
>>>>>>>>> On 11/12/2025 9:21 AM, Sultan Alsawaf wrote:
>>>>>>>>>> On Tue, Nov 11, 2025 at 03:33:42PM -0800, Sultan Alsawaf wrote:
>>>>>>>>>>> On Tue, Nov 11, 2025 at 05:58:10PM +0800, Du, Bin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/11/2025 5:05 PM, Sultan Alsawaf wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 10, 2025 at 05:46:28PM +0800, Du, Bin wrote:
>>>>>>>>>>>>>> Thank you, Sultan, for your time, big effort, and constant support.
>>>>>>>>>>>>>> Apologies for my delayed reply for being occupied a little with other
>>>>>>>>>>>>>> matters.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/10/2025 4:33 PM, Sultan Alsawaf wrote:
>>>>>>>>>>>>>>> Hi Bin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Nov 05, 2025 at 01:25:58AM -0800, Sultan Alsawaf wrote:
>>>>>>>>>>>>>>>> Hi Bin,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To expedite review, I've attached a patch containing a bunch of fixes I've made
>>>>>>>>>>>>>>>> on top of v5. Most of my changes should be self-explanatory; feel free to ask
>>>>>>>>>>>>>>>> further about specific changes if you have any questions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I haven't had a chance to review all of the v4 -> v5 changes yet, but I figured
>>>>>>>>>>>>>>>> I should send what I've got so far.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FYI, there is a regression in isp4if_dequeue_buffer() where the bufq lock isn't
>>>>>>>>>>>>>>>> protecting the list_del() anymore. I also checked the compiler output when using
>>>>>>>>>>>>>>>> guard() versus scoped_guard() in that function and there is no difference:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>          sha1sum:
>>>>>>>>>>>>>>>>          5652a0306da22ea741d80a9e03a787d0f71758a8  isp4_interface.o // guard()
>>>>>>>>>>>>>>>>          5652a0306da22ea741d80a9e03a787d0f71758a8  isp4_interface.o // scoped_guard()
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So guard() should be used there again, which I've done in my patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also have a few questions:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Does ISP FW provide a register interface to disable the IRQ? If so, it is
>>>>>>>>>>>>>>>>           faster to use that than using disable_irq_nosync() to disable the IRQ from
>>>>>>>>>>>>>>>>           the CPU's side.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. When the IRQ is re-enabled in isp4sd_fw_resp_func(), is there anything
>>>>>>>>>>>>>>>>           beforehand to mask all pending interrupts from the ISP so that there isn't a
>>>>>>>>>>>>>>>>           bunch of stale interrupts firing as soon the IRQ is re-enabled?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. Is there any risk of deadlock due to the stream kthread racing with the ISP
>>>>>>>>>>>>>>>>           when the ISP posts a new response _after_ the kthread determines there are no
>>>>>>>>>>>>>>>>           more new responses but _before_ the enable_irq() in isp4sd_fw_resp_func()?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 4. Why are some lines much longer than before? It seems inconsistent that now
>>>>>>>>>>>>>>>>           there is a mix of several lines wrapped to 80 cols and many lines going
>>>>>>>>>>>>>>>>           beyond 80 cols. And there are multiple places where code is wrapped before
>>>>>>>>>>>>>>>>           reaching 80 cols even with lots of room left, specifically for cases where it
>>>>>>>>>>>>>>>>           wouldn't hurt readability to put more characters onto each line.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've attached a new, complete patch of changes to apply on top of v5. You may
>>>>>>>>>>>>>>> ignore the incomplete patch from my previous email and use the new one instead.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I made many changes and also answered questions 1-3 myself.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please apply this on top of v5 and let me know if you have any questions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sure, will review, apply and test your patch accordingly. Your contribution
>>>>>>>>>>>>>> is greatly appreciated, will let you know if there is any question or
>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> BTW, I noticed a strange regression in v5 even without any of my changes: every
>>>>>>>>>>>>>>> time you use cheese after using it one time, the video will freeze after 30-60
>>>>>>>>>>>>>>> seconds with this message printed to dmesg:
>>>>>>>>>>>>>>>          [ 2032.716559] amd_isp_capture amd_isp_capture: -><- fail respid unknown respid(0x30002)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And the video never unfreezes. I haven't found the cause for the regression yet;
>>>>>>>>>>>>>>> can you try to reproduce it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Really weird, we don't see this issue either in dev or QA test. Is it 100%
>>>>>>>>>>>>>> reproducible and any other fail or err in the log?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, it's 100% reproducible. There's no other message in dmesg, only that one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sometimes there is a stop stream error when I close cheese after it froze:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         [  656.540307] amd_isp_capture amd_isp_capture: fail to disable stream
>>>>>>>>>>>>>         [  657.046633] amd_isp_capture amd_isp_capture: fail to stop steam
>>>>>>>>>>>>>         [  657.047224] amd_isp_capture amd_isp_capture: disabling streaming failed (-110)
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I revert to v4 I cannot reproduce it at all. It seems to be something in
>>>>>>>>>>>>> v4 -> v5 that is not fixed by any of my changes.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Sultan, could you please try following modifications on top of v5 to see
>>>>>>>>>>>> if it helps?
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>>> b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>>> index 39c2265121f9..d571b3873edb 100644
>>>>>>>>>>>> --- a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>>> +++ b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>>> @@ -97,7 +97,7 @@
>>>>>>>>>>>>
>>>>>>>>>>>> #define ADDR_SPACE_TYPE_GPU_VA          4
>>>>>>>>>>>>
>>>>>>>>>>>> -#define FW_MEMORY_POOL_SIZE             (200 * 1024 * 1024)
>>>>>>>>>>>> +#define FW_MEMORY_POOL_SIZE             (100 * 1024 * 1024)
>>>>>>>>>>>>
>>>>>>>>>>>> /*
>>>>>>>>>>>>        * standard ISP mipicsi=>isp
>>>>>>>>>>>> diff --git a/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>>> b/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>>> index 248d10076ae8..acbc80aa709e 100644
>>>>>>>>>>>> --- a/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>>> +++ b/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>>> @@ -697,7 +697,7 @@ static int isp4sd_stop_resp_proc_threads(struct
>>>>>>>>>>>> isp4_subdev *isp_subdev)
>>>>>>>>>>>>              return 0;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> -static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
>>>>>>>>>>>> +static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev, bool
>>>>>>>>>>>> irq_enabled)
>>>>>>>>>>>> {
>>>>>>>>>>>>              struct isp4sd_sensor_info *sensor_info = &isp_subdev->sensor_info;
>>>>>>>>>>>>              unsigned int perf_state = ISP4SD_PERFORMANCE_STATE_LOW;
>>>>>>>>>>>> @@ -716,8 +716,9 @@ static int isp4sd_pwroff_and_deinit(struct isp4_subdev
>>>>>>>>>>>> *isp_subdev)
>>>>>>>>>>>>                      return 0;
>>>>>>>>>>>>              }
>>>>>>>>>>>>
>>>>>>>>>>>> -       for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
>>>>>>>>>>>> -               disable_irq(isp_subdev->irq[i]);
>>>>>>>>>>>> +       if (irq_enabled)
>>>>>>>>>>>> +               for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
>>>>>>>>>>>> +                       disable_irq(isp_subdev->irq[i]);
>>>>>>>>>>>>
>>>>>>>>>>>>              isp4sd_stop_resp_proc_threads(isp_subdev);
>>>>>>>>>>>>              dev_dbg(dev, "isp_subdev stop resp proc streads suc");
>>>>>>>>>>>> @@ -813,7 +814,7 @@ static int isp4sd_pwron_and_init(struct isp4_subdev
>>>>>>>>>>>> *isp_subdev)
>>>>>>>>>>>>
>>>>>>>>>>>>              return 0;
>>>>>>>>>>>> err_unlock_and_close:
>>>>>>>>>>>> -       isp4sd_pwroff_and_deinit(isp_subdev);
>>>>>>>>>>>> +       isp4sd_pwroff_and_deinit(isp_subdev, false);
>>>>>>>>>>>>              return -EINVAL;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -985,7 +986,7 @@ static int isp4sd_set_power(struct v4l2_subdev *sd, int
>>>>>>>>>>>> on)
>>>>>>>>>>>>              if (on)
>>>>>>>>>>>>                      return isp4sd_pwron_and_init(isp_subdev);
>>>>>>>>>>>>              else
>>>>>>>>>>>> -               return isp4sd_pwroff_and_deinit(isp_subdev);
>>>>>>>>>>>> +               return isp4sd_pwroff_and_deinit(isp_subdev, true);
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> static const struct v4l2_subdev_core_ops isp4sd_core_ops = {
>>>>>>>>>>>
>>>>>>>>>>> No difference sadly; same symptoms as before. FYI, your email client broke the
>>>>>>>>>>> patch formatting so I couldn't apply it; it hard wrapped some lines to 80 cols,
>>>>>>>>>>> replaced tabs with spaces, and removed leading spaces on each context line, so I
>>>>>>>>>>> had to apply it manually. To confirm I applied it correctly, here is my diff:
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>> index 39c2265121f9..d571b3873edb 100644
>>>>>>>>>>> --- a/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>> +++ b/drivers/media/platform/amd/isp4/isp4_fw_cmd_resp.h
>>>>>>>>>>> @@ -97,7 +97,7 @@
>>>>>>>>>>>       #define ADDR_SPACE_TYPE_GPU_VA          4
>>>>>>>>>>> -#define FW_MEMORY_POOL_SIZE             (200 * 1024 * 1024)
>>>>>>>>>>> +#define FW_MEMORY_POOL_SIZE             (100 * 1024 * 1024)
>>>>>>>>>>>       /*
>>>>>>>>>>>        * standard ISP mipicsi=>isp
>>>>>>>>>>> diff --git a/drivers/media/platform/amd/isp4/isp4_subdev.c b/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>> index 4bd2ebf0f694..500ef0af8a41 100644
>>>>>>>>>>> --- a/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>> +++ b/drivers/media/platform/amd/isp4/isp4_subdev.c
>>>>>>>>>>> @@ -669,7 +669,7 @@ static int isp4sd_stop_resp_proc_threads(struct isp4_subdev *isp_subdev)
>>>>>>>>>>>       	return 0;
>>>>>>>>>>>       }
>>>>>>>>>>> -static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
>>>>>>>>>>> +static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev, bool irq_enabled)
>>>>>>>>>>>       {
>>>>>>>>>>>       	struct isp4sd_sensor_info *sensor_info = &isp_subdev->sensor_info;
>>>>>>>>>>>       	unsigned int perf_state = ISP4SD_PERFORMANCE_STATE_LOW;
>>>>>>>>>>> @@ -688,8 +688,9 @@ static int isp4sd_pwroff_and_deinit(struct isp4_subdev *isp_subdev)
>>>>>>>>>>>       		return 0;
>>>>>>>>>>>       	}
>>>>>>>>>>> -	for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
>>>>>>>>>>> -		disable_irq(isp_subdev->irq[i]);
>>>>>>>>>>> +	if (irq_enabled)
>>>>>>>>>>> +		for (int i = 0; i < ISP4SD_MAX_FW_RESP_STREAM_NUM; i++)
>>>>>>>>>>> +			disable_irq(isp_subdev->irq[i]);
>>>>>>>>>>>       	isp4sd_stop_resp_proc_threads(isp_subdev);
>>>>>>>>>>>       	dev_dbg(dev, "isp_subdev stop resp proc streads suc");
>>>>>>>>>>> @@ -785,7 +786,7 @@ static int isp4sd_pwron_and_init(struct isp4_subdev *isp_subdev)
>>>>>>>>>>>       	return 0;
>>>>>>>>>>>       err_unlock_and_close:
>>>>>>>>>>> -	isp4sd_pwroff_and_deinit(isp_subdev);
>>>>>>>>>>> +	isp4sd_pwroff_and_deinit(isp_subdev, false);
>>>>>>>>>>>       	return -EINVAL;
>>>>>>>>>>>       }
>>>>>>>>>>> @@ -957,7 +958,7 @@ static int isp4sd_set_power(struct v4l2_subdev *sd, int on)
>>>>>>>>>>>       	if (on)
>>>>>>>>>>>       		return isp4sd_pwron_and_init(isp_subdev);
>>>>>>>>>>>       	else
>>>>>>>>>>> -		return isp4sd_pwroff_and_deinit(isp_subdev);
>>>>>>>>>>> +		return isp4sd_pwroff_and_deinit(isp_subdev, true);
>>>>>>>>>>>       }
>>>>>>>>>>>       static const struct v4l2_subdev_core_ops isp4sd_core_ops = {
>>>>>>>>>>>
>>>>>>>>>>>> On the other hand, please also add the patch in amdgpu which sets *bo to
>>>>>>>>>>>> NULL in isp_kernel_buffer_alloc() which you mentioned in another thread.
>>>>>>>>>>>
>>>>>>>>>>> Yep, I have been doing all v5 testing with that patch applied.
>>>>>>>>>>>
>>>>>>>>>>> BTW, I have verified the IRQ changes are not the cause for the regression; I
>>>>>>>>>>> tested with IRQs kept enabled all the time and the issue still occurs.
>>>>>>>>>>>
>>>>>>>>>>> I did observe that ISP stops sending interrupts when the video stream freezes.
>>>>>>>>>>> And, if I replicate the bug enough times, it seems to permanently break FW until
>>>>>>>>>>> a full machine reboot. Reloading amd_capture with the v4 driver doesn't recover
>>>>>>>>>>> the ISP when this happens.
>>>>>>>>>>>
>>>>>>>>>>> As an improvement to the driver, can we do a hard reset of ISP on driver probe?
>>>>>>>>>>> I am assuming hardware doesn't need too long to settle after hard reset, maybe
>>>>>>>>>>> a few hundred milliseconds? This would ensure ISP FW is always in a working
>>>>>>>>>>> state when the driver is loaded.
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Actually, each time the camera is activated, the ISP driver powers on the
>>>>>>>>> ISP and initiates its firmware from the beginning; when the camera is
>>>>>>>>> closed, the ISP is powered off..
>>>>>>>>
>>>>>>>> Hmm, well I am able to put the ISP in a completely unusable state that doesn't
>>>>>>>> recover when I unload and reload amd_capture. Or maybe it is the sensor that is
>>>>>>>> stuck in a broken state?
>>>>>>>
>>>>>>> Here is the log output when I try to start cheese after unloading and reloading
>>>>>>> amd_capture, where the ISP is in the broken state that requires rebooting the
>>>>>>> laptop (annotated with notes of what I saw/did at each point in time):
>>>>>>>
>>>>>>> ==> opened cheese
>>>>>>> ==> cheese froze after a few seconds
>>>>>>> ==> closed cheese
>>>>>>>       [   34.570823] amd_isp_capture amd_isp_capture: fail to disable stream
>>>>>>>       [   35.077503] amd_isp_capture amd_isp_capture: fail to stop steam
>>>>>>>       [   35.077525] amd_isp_capture amd_isp_capture: disabling streaming failed (-110)
>>>>>>> ==> unloaded amd_capture
>>>>>>> ==> loaded amd_capture
>>>>>>> ==> opened cheese
>>>>>>>       [  308.039721] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
>>>>>>>       [  308.043961] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
>>>>>>>       [  308.044188] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044194] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044196] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044197] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044198] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044198] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044199] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044200] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044200] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044201] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.044202] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.065226] amd_isp_capture amd_isp_capture: power up isp fail -22
>>>>>>>       [  308.174814] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
>>>>>>>       [  308.177807] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
>>>>>>>       [  308.178036] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178043] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178044] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178045] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178046] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178047] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178048] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178048] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178049] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178050] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.178050] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.198776] amd_isp_capture amd_isp_capture: power up isp fail -22
>>>>>>>       [  308.306835] amd_isp_capture amd_isp_capture: ISP CCPU FW boot failed
>>>>>>>       [  308.311533] amd_isp_capture amd_isp_capture: fail to start isp_subdev interface
>>>>>>>       [  308.311723] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311723] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311724] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311725] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311725] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311726] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.311727] amd_isp_capture amd_isp_capture: invalid mem_info
>>>>>>>       [  308.335281] amd_isp_capture amd_isp_capture: power up isp fail -22
>>>>>>>
>>>>>>
>>>>>> Thanks Sultan for the detailed info, I agree with you, tend to believe it
>>>>>> may related to the sensor, I will follow up with the FW team to investigate
>>>>>> further.
>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Sultan
>>>>>>>>>>
>>>>>>>>>> A small update: I reproduced the issue on v4, but it took several more cycles of
>>>>>>>>>> closing/opening cheese and waiting 30s compared to v5.
>>>>>>>>>>
>>>>>>>>>> Right now my best guess is that this is a timing issue with respect to FW that
>>>>>>>>>> was exposed by the v5 changes. v5 introduced slight changes in code timing, like
>>>>>>>>>> with the mutex locks getting replaced by spin locks.
>>>>>>>>>>
>>>>>>>>>> I'll try to insert mdelays to see if I can expose the issue that way on v4.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Could you kindly provide the FW used?
>>>>>>>>
>>>>>>>>       commit a89515d3ff79f12099f7a51b0f4932b881b7720e
>>>>>>>>       Author: Pratap Nirujogi <pratap.nirujogi@....com>
>>>>>>>>       Date:   Thu Aug 21 15:40:45 2025 -0400
>>>>>>>>
>>>>>>>>           amdgpu: Update ISP FW for isp v4.1.1
>>>>>>>>           From internal git commit:
>>>>>>>>           24557b7326e539183b3bc44cf8e1496c74d383d6
>>>>>>>>           Signed-off-by: Pratap Nirujogi <pratap.nirujogi@....com>
>>>>>>>>
>>>>>>>> Sultan
>>>>>>>
>>>>>>> Sultan
>>>>>>
>>>>>> -- 
>>>>>> Regards,
>>>>>> Bin
>>>>>>
>>>>>
>>>>> Thanks, Bin. I looked deeper at the code and didn't find any reason the issue
>>>>> could be due to the driver. Also, the problem happens outside of cheese for me
>>>>> (like in Chromium with Zoom meetings), so v5 of the driver is pretty much
>>>>> unusable for me until this issue is fixed. :(
>>>>>
>>>>
>>>> Oh, really sad to hear that :(, there must be some difference between our
>>>> platforms because we still can't reproduce the issue you mentioned, to help
>>>> on this, would you share more info like your Ubuntu version, Kernel
>>>> version/source, ISP driver version, BIOS version, .config used to build the
>>>> kernel, FW: commit a89515d3ff79f12099f7a51b0f4932b881b7720e.
>>>> Just wondering, if possible, could you provide the kernel image either so we
>>>> can directly test on it. Also, the HW is not broken, right?
>>>
>>> I figured out why you cannot reproduce the issue. You need to pass amd_iommu=off
>>> on the kernel command line to trigger the issue.
>>>
>>> The reason I am using amd_iommu=off is because this laptop otherwise doesn't
>>> ever wake from suspend under Linux once it reaches the S0i3 state. The keyboard,
>>> power button, and lid do not respond to wake up the laptop from suspend. This
>>> happens 100% of the time once S0i3 is reached, and occurs on the OEM Ubuntu
>>> image from HP as well. The only fix I have found is to pass amd_iommu=off, which
>>> other owners of this laptop also found fixes the issue.
>>
>> You're the first report I've heard of this.
>>
>> Are you using storage encryption or a storage password of any kind (software
>> or hardware) by chance?
>>
>> If you are can you please describe it?
>>
>> Also can you generate a report using amd-s2idle?  I don't think it's going
>> to flag anything but I would like to confirm.
> 
> The issue is mentioned on a Reddit post [1]. A specific comment mentions
> amd_iommu=off fixing the issue [2], which is where I got the idea to do that.
> 
> You will find conflicting reports about this issue online, with some people
> saying it doesn't happen anymore after some kernel update [3], and others saying
> it worked until something updated [4].
> 
> The reason for all the conflicting reports online is because this issue only
> occurs when S0i3 is reached, and I suspect that no one realized there's a delay
> before S0i3 is entered. Reaching S0i3 appears to take at least 60 seconds
> *after* suspending the laptop. If S0i3 isn't entered, then you *can* wake the
> laptop but there will always be this message indicating S0i3 wasn't hit:
>    [  107.428836] amd_pmc AMDI000B:00: Last suspend didn't reach deepest state
> 

It shouldn't take 60 seconds to enter s0i3.  It should be ~5 seconds. 
So are you saying that if you have IOMMU enabled and interrupt the 
suspend around 20 seconds later you get that you didn't reach deepest 
sleep state, and if you wait longer it hangs?

> I am using LUKS1 encryption on my storage (software encryption). However, I'm
> not sure any of my configuration info is relevant because I reproduced the issue
> from a live USB using HP's OEM Ubuntu image [5], with nothing else physically
> plugged into the laptop and not connected to anything over WiFi or Bluetooth.

Is the SSD a SED?  Anything for storage password set in BIOS?

> 
> I had the thought of generating a report using amd-s2idle a couple months ago...
> except I have no way to wake the machine from suspend at all. I have to hold the
> the power button to do a hard shutdown. I tried using no_console_suspend but of
> course userspace processes are frozen so systemd couldn't record anything for
> me. I tried UART over USB and connected the output to another laptop but it
> would only work for a few seconds right after booting up the laptop (could've
> just been because I was using PL2303 serial converters, which aren't so great).
> 
> I have also tried several different combinations of settings toggled on/off in
> the BIOS setup menu, as well as resetting to the factory default values, without
> any change in behavior.
> 
> I'm at a loss on how I can retrieve some debug info for this issue. :/

You are on the latest BIOS presumably, right?

If you schedule a suspend with amd-s2idle for ~10 seconds, does it 
reproduce too?

> 
> [1] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/hp_zbook_ultra_g1a_ubuntu_fully_working_now_or/
> [2] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/comment/nd4cldp/
> [3] https://forum.level1techs.com/t/the-ultimate-arch-secureboot-guide-for-ryzen-ai-max-ft-hp-g1a-128gb-8060s-monster-laptop/230652#hibernate-suspend-and-kernel-versions-16
> [4] https://www.reddit.com/r/AMDLaptops/comments/1mmrlgz/comment/nd1xbtd/
> [5] https://ftp.hp.com/pub/softpaq/sp158501-159000/stella-noble-oem-24.04b-20250422-107.iso
> 
> Sultan


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ