[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4b582be9-74e7-4175-8528-59f8e0bd120d@intel.com>
Date: Mon, 23 Jun 2025 08:37:31 -0700
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@...el.com>
To: "Nilawar, Badal" <badal.nilawar@...el.com>,
<intel-xe@...ts.freedesktop.org>, <dri-devel@...ts.freedesktop.org>,
<linux-kernel@...r.kernel.org>
CC: <anshuman.gupta@...el.com>, <rodrigo.vivi@...el.com>,
<alexander.usyskin@...el.com>, <gregkh@...uxfoundation.org>, <jgg@...dia.com>
Subject: Re: [PATCH v3 08/10] drm/xe/xe_late_bind_fw: Introduce debug fs node
to disable late binding
On 6/18/2025 11:51 PM, Nilawar, Badal wrote:
>
> On 19-06-2025 02:49, Daniele Ceraolo Spurio wrote:
>>
>>
>> On 6/18/2025 12:00 PM, Badal Nilawar wrote:
>>> Introduce a debug filesystem node to disable late binding fw reload
>>> during the system or runtime resume. This is intended for situations
>>> where the late binding fw needs to be loaded from user mode.
>>
>> You haven't replied to my question on the previous rev in regards to
>> the expected use-case here.
>> Is this for testing only, or something an actual user might want to
>> do? If we only need this for testing, please specify so.
>
> Apologies for the oversight. Yes, this is only necessary for testing
> the binary before releasing it for up-streaming. There is internal
> tool which uses IGSC lib to download the binary. To avoid clash
> between the binaries, this debug fs node is provided.
>
>>
>> Also, what happens if we suspend with a user-loaded binary? userspace
>> doesn't have visibility to know that they have to re-load their binary.
>
> If the device enters D3 cold state, the binary needs to be reloaded.
> However, the kernel mode driver (KMD) does not have control over
> binaries downloaded via the IGSC library.
> If needed D3 cold can be disabled from BIOS or by setting up
> vram_threshold = 0.
I'm confused. Whatever the tool writes is lost on d3cold entry, does it
make any difference to the tester if we revert to the default values or
to the kernel-loaded table? It's still not what they've written. It
sounds to me more like what you need is to block D3cold (and potentially
S2/S3) when the UMD tool is used so that the userspace-provided table is
not lost.
Otherwise, we could add a modparam to override the binary like we have
for the other firmwares and test anything new that way.
Daniele
>
> Regards,
> Badal
>
>> Daniele
>>
>>>
>>> v2:
>>> -s/(uval == 1) ? true : false/!!uval/ (Daniele)
>>>
>>> Signed-off-by: Badal Nilawar <badal.nilawar@...el.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_debugfs.c | 41
>>> ++++++++++++++++++++++
>>> drivers/gpu/drm/xe/xe_late_bind_fw.c | 3 ++
>>> drivers/gpu/drm/xe/xe_late_bind_fw_types.h | 3 ++
>>> 3 files changed, 47 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c
>>> b/drivers/gpu/drm/xe/xe_debugfs.c
>>> index d83cd6ed3fa8..d1f6f556efa2 100644
>>> --- a/drivers/gpu/drm/xe/xe_debugfs.c
>>> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
>>> @@ -226,6 +226,44 @@ static const struct file_operations
>>> atomic_svm_timeslice_ms_fops = {
>>> .write = atomic_svm_timeslice_ms_set,
>>> };
>>> +static ssize_t disable_late_binding_show(struct file *f, char
>>> __user *ubuf,
>>> + size_t size, loff_t *pos)
>>> +{
>>> + struct xe_device *xe = file_inode(f)->i_private;
>>> + struct xe_late_bind *late_bind = &xe->late_bind;
>>> + char buf[32];
>>> + int len;
>>> +
>>> + len = scnprintf(buf, sizeof(buf), "%d\n", late_bind->disable);
>>> +
>>> + return simple_read_from_buffer(ubuf, size, pos, buf, len);
>>> +}
>>> +
>>> +static ssize_t disable_late_binding_set(struct file *f, const char
>>> __user *ubuf,
>>> + size_t size, loff_t *pos)
>>> +{
>>> + struct xe_device *xe = file_inode(f)->i_private;
>>> + struct xe_late_bind *late_bind = &xe->late_bind;
>>> + u32 uval;
>>> + ssize_t ret;
>>> +
>>> + ret = kstrtouint_from_user(ubuf, size, sizeof(uval), &uval);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (uval > 1)
>>> + return -EINVAL;
>>> +
>>> + late_bind->disable = !!uval;
>>> + return size;
>>> +}
>>> +
>>> +static const struct file_operations disable_late_binding_fops = {
>>> + .owner = THIS_MODULE,
>>> + .read = disable_late_binding_show,
>>> + .write = disable_late_binding_set,
>>> +};
>>> +
>>> void xe_debugfs_register(struct xe_device *xe)
>>> {
>>> struct ttm_device *bdev = &xe->ttm;
>>> @@ -249,6 +287,9 @@ void xe_debugfs_register(struct xe_device *xe)
>>> debugfs_create_file("atomic_svm_timeslice_ms", 0600, root, xe,
>>> &atomic_svm_timeslice_ms_fops);
>>> + debugfs_create_file("disable_late_binding", 0600, root, xe,
>>> + &disable_late_binding_fops);
>>> +
>>> for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1;
>>> ++mem_type) {
>>> man = ttm_manager_type(bdev, mem_type);
>>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>> index c0be9611c73b..001e526e569a 100644
>>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>> @@ -129,6 +129,9 @@ int xe_late_bind_fw_load(struct xe_late_bind
>>> *late_bind)
>>> if (!late_bind->component_added)
>>> return -EINVAL;
>>> + if (late_bind->disable)
>>> + return 0;
>>> +
>>> for (fw_id = 0; fw_id < MAX_FW_ID; fw_id++) {
>>> lbfw = &late_bind->late_bind_fw[fw_id];
>>> if (lbfw->valid)
>>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>> b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>> index d256f53d59e6..f79f0c0b2c4a 100644
>>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>> @@ -71,6 +71,9 @@ struct xe_late_bind {
>>> struct xe_late_bind_fw late_bind_fw[MAX_FW_ID];
>>> /** @late_bind.wq: workqueue to submit request to download
>>> late bind blob */
>>> struct workqueue_struct *wq;
>>> +
>>> + /** @late_bind.disable to block late binding reload during pm
>>> resume flow*/
>>> + bool disable;
>>> };
>>> #endif
>>
Powered by blists - more mailing lists