[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c19683a2-5a98-4bc8-8584-77ff32860792@amd.com>
Date: Mon, 10 Nov 2025 11:26:07 -0500
From: "Nirujogi, Pratap" <pnirujog@....com>
To: "Du, Bin" <bin.du@....com>, Sultan Alsawaf <sultan@...neltoast.com>
Cc: Mario Limonciello <mario.limonciello@....com>, pratap.nirujogi@....com,
amd-gfx@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
alexander.deucher@....com, benjamin.chan@....com, christian.koenig@....com,
dantony@....com, gjorgji.rosikopulos@....com, king.li@....com,
phil.jawich@....com
Subject: Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates
a new BO
Hi Bin,
Yes, with this change applied, its not necessary to explicitly
initialize the kernel bo handles in ISP driver. We can continue using
kmalloc() in isp4if_gpu_mem_alloc() of ISP driver.
Thanks,
Pratap
On 11/9/2025 10:12 PM, Du, Bin wrote:
> Thans Sultan & Pratap,
>
> So, based on the discussion, the ISP driver will retain its current
> implementation and won’t do unnecessary init to *bo before calling
> isp_kernel_buffer_alloc().
>
> On 11/7/2025 4:51 AM, Nirujogi, Pratap wrote:
>>
>>
>> On 11/6/2025 3:31 PM, Sultan Alsawaf wrote:
>>> Caution: This message originated from an External Source. Use proper
>>> caution when opening attachments, clicking links, or responding.
>>>
>>>
>>> On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote:
>>>>
>>>>
>>>> On 11/6/2025 1:58 PM, Sultan Alsawaf wrote:
>>>>> Caution: This message originated from an External Source. Use
>>>>> proper caution when opening attachments, clicking links, or
>>>>> responding.
>>>>>
>>>>>
>>>>> On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote:
>>>>>> On 11/5/25 5:21 PM, Sultan Alsawaf wrote:
>>>>>>> From: Sultan Alsawaf <sultan@...neltoast.com>
>>>>>>>
>>>>>>> When the BO pointer provided to amdgpu_bo_create_kernel() points to
>>>>>>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin
>>>>>>> that address
>>>>>>> rather than allocate a new BO.
>>>>>>>
>>>>>>> This functionality is never desired for allocating ISP buffers. A
>>>>>>> new BO
>>>>>>> should always be created when isp_kernel_buffer_alloc() is
>>>>>>> called, per the
>>>>>>> description for isp_kernel_buffer_alloc().
>>>>>>
>>>>>> Are you just going off descriptions, or is there a problem with
>>>>>> reuse?
>>>>>
>>>>> I am going based off the ISP4 driver's usage of
>>>>> isp_kernel_buffer_alloc().
>>>>>
>>>>> This fixes the following crash I experienced on v5 of the ISP4
>>>>> patchset:
>>>>>
>>>>> [ 175.485627] BUG: unable to handle page fault for address:
>>>>> 00007f6b1092e888
>>>>> [ 175.485799] #PF: supervisor read access in kernel mode
>>>>> [ 175.485840] #PF: error_code(0x0000) - not-present page
>>>>> [ 175.485869] PGD 0 P4D 0
>>>>> [ 175.485889] Oops: Oops: 0000 [#1] SMP
>>>>> [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese
>>>>> Tainted: G U W 6.17.7 #1 PREEMPT
>>>>> [ 175.485921] Tainted: [U]=USER, [W]=WARN
>>>>> [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch
>>>>> Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025
>>>>> [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0
>>>>> [amdgpu]
>>>>> [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44
>>>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6
>>>>> 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6
>>>>> 48 89 04 24 e8 d6
>>>>> [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202
>>>>> [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX:
>>>>> 0000000000000000
>>>>> [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI:
>>>>> ffff97b14e097b20
>>>>> [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09:
>>>>> ffff97b085af04c8
>>>>> [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12:
>>>>> 0000000000000002
>>>>> [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15:
>>>>> ffff97b0a0f00000
>>>>> [ 175.486037] FS: 00007faf60ffe6c0(0000)
>>>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000
>>>>> [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4:
>>>>> 0000000000f50ef0
>>>>> [ 175.486067] PKRU: 55555554
>>>>> [ 175.486072] Call Trace:
>>>>> [ 175.486081] <TASK>
>>>>> [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360
>>>>> [amdgpu]
>>>>> [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu]
>>>>> [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu]
>>>>> [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70
>>>>> [amd_capture]
>>>>> [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture]
>>>>> [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture]
>>>>> [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev]
>>>>> [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev]
>>>>> [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev]
>>>>> [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140
>>>>> [amd_capture]
>>>>> [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140
>>>>> [amd_capture]
>>>>> [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common]
>>>>> [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common]
>>>>> [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2]
>>>>> [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev]
>>>>> [ 175.486219] video_usercopy+0x1e5/0x760 [videodev]
>>>>> [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev]
>>>>> [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev]
>>>>> [ 175.486245] __x64_sys_ioctl+0x77/0xc0
>>>>> [ 175.486251] do_syscall_64+0x48/0xbf0
>>>>> [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58
>>>>> [ 175.486268] RIP: 0033:0x7fb03371674d
>>>>> [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d
>>>>> 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8
>>>>> 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b
>>>>> 04 25 28 00 00 00
>>>>> [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246
>>>>> ORIG_RAX: 0000000000000010
>>>>> [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX:
>>>>> 00007fb03371674d
>>>>> [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI:
>>>>> 000000000000002c
>>>>> [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09:
>>>>> 0000000000000210
>>>>> [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12:
>>>>> 00007faf9801c4b0
>>>>> [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15:
>>>>> 0000000000000001
>>>>> [ 175.486324] </TASK>
>>>>> [ 175.486330] Modules linked in: ccm hid_sensor_prox
>>>>> hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer
>>>>> kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm
>>>>> snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture
>>>>> videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc
>>>>> pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash
>>>>> algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat
>>>>> snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn
>>>>> snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm
>>>>> mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63
>>>>> snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led
>>>>> snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269
>>>>> snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp
>>>>> snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp
>>>>> snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps
>>>>> snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi
>>>>> snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation
>>>>> snd_hda_codec_hdmi
>>>>> [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core
>>>>> snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e
>>>>> drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x
>>>>> mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci
>>>>> cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach
>>>>> mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel
>>>>> snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg
>>>>> mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper
>>>>> btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm
>>>>> snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd
>>>>> mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi
>>>>> snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper
>>>>> snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211
>>>>> libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco
>>>>> hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video
>>>>> gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee
>>>>> i2c_hid_acpi serial_multi_instantiate
>>>>> [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey
>>>>> amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg
>>>>> crypto_user loop nfnetlink ip_tables x_tables dm_crypt
>>>>> encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni
>>>>> ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring
>>>>> serio_raw ccp nvme_auth
>>>>> [ 175.486394] CR2: 00007f6b1092e888
>>>>> [ 175.486402] ---[ end trace 0000000000000000 ]---
>>>>> [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0
>>>>> [amdgpu]
>>>>> [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44
>>>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6
>>>>> 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6
>>>>> 48 89 04 24 e8 d6
>>>>> [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202
>>>>> [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX:
>>>>> 0000000000000000
>>>>> [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI:
>>>>> ffff97b14e097b20
>>>>> [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09:
>>>>> ffff97b085af04c8
>>>>> [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12:
>>>>> 0000000000000002
>>>>> [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15:
>>>>> ffff97b0a0f00000
>>>>> [ 175.486455] FS: 00007faf60ffe6c0(0000)
>>>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000
>>>>> [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4:
>>>>> 0000000000f50ef0
>>>>> [ 175.486470] PKRU: 55555554
>>>>>
>>>>> Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer
>>>>> argument
>>>>> (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1].
>>>>> But since this
>>>>> issue slipped past Bin and presumably others who reviewed the code, it
>>>>> highlights that isp_kernel_buffer_alloc() is not working as
>>>>> expected in this
>>>>> respect, and the description for isp_kernel_buffer_alloc()
>>>>> reinforces this.
>>>>>
>>>> Thanks Sultan for suggesting this fix. Yes, its hard to believe that
>>>> this
>>>> slipped through until now.
>>>>
>>>> Instead of initializing *bo=NULL, I suggest ensuring *bo is actually
>>>> NULL
>>>> before calling amdgpu_bo_create_kernel().
>>>>
>>>> int isp_kernel_buffer_alloc(...)
>>>> {
>>>> ...
>>>> if (WARN_ON(*bo))
>>>> return -EINVAL;
>>>> ...
>>>> ret = amdgpu_bo_create_kernel(..)
>>>> ...
>>>> }
>>>>
>>>> This way the caller will get to know parameters passed are invalid
>>>> and can
>>>> take care of initializing the handles appropriately.
>>>
>>> Hi Pratap,
>>>
>>> I am opposed to this idea for multiple reasons:
>>>
>>> 1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the
>>> input value
>>> should not matter.
>>>
>>> 2. The only correct value for *bo is NULL when
>>> isp_kernel_buffer_alloc() passes
>>> it to amdgpu_bo_create_kernel(). Since there is only one correct
>>> value, there
>>> is no reason to burden callers of isp_kernel_buffer_alloc() with
>>> intializing
>>> *bo to NULL.
>>>
>>> 3. This adds more code, another WARN_ON(), and another error case to
>>> isp_kernel_buffer_alloc(). All of that can be eliminated entirely
>>> by just
>>> initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done.
>>>
>>> 4. The reason *bo needs to be NULL is an implementation detail that
>>> is internal
>>> to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel()
>>> needs it to
>>> be NULL in order to allocate a new buffer.
>>> isp_kernel_buffer_alloc() should
>>> handle its own internal implementation details instead of punting
>>> the
>>> responsibility onto callers.
>>>
>>> Sultan
>>>
>> Either approach works for me. My main intention is to ensure the
>> caller passes the BO handle initialized to NULL in this case. That
>> said, initializing *bo = NULL as you explained is perfectly fine too.
>>
>> Reviewed-by: Pratap Nirujogi <pratap.nirujogi@....com>
>>
>>>>
>>>> Thanks,
>>>> Pratap
>>>>
>>>>>>>
>>>>>>> Ensure this by zeroing *bo right before the
>>>>>>> amdgpu_bo_create_kernel() call.
>>>>>>>
>>>>>>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for
>>>>>>> isp buffers")
>>>>>>> Signed-off-by: Sultan Alsawaf <sultan@...neltoast.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++
>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/
>>>>>>> gpu/drm/amd/amdgpu/amdgpu_isp.c
>>>>>>> index 9cddbf50442a..37270c4dab8d 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
>>>>>>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device
>>>>>>> *dev, u64 size,
>>>>>>> if (ret)
>>>>>>> return ret;
>>>>>>> + /* Ensure *bo is NULL so a new BO will be created */
>>>>>>> + *bo = NULL;
>>>>>>> ret = amdgpu_bo_create_kernel(adev,
>>>>>>> size,
>>>>>>> ISP_MC_ADDR_ALIGN,
>>>>>>
>>>>>
>>>>> [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/
>>>>>
>>>>> Sultan
>>>>
>>
>
Powered by blists - more mailing lists