[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <852276a6-def4-45ee-97ef-cce7cf565ca0@intel.com>
Date: Tue, 24 Jun 2025 15:42:50 +0530
From: "Nilawar, Badal" <badal.nilawar@...el.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@...el.com>,
<intel-xe@...ts.freedesktop.org>, <dri-devel@...ts.freedesktop.org>,
<linux-kernel@...r.kernel.org>
CC: <anshuman.gupta@...el.com>, <rodrigo.vivi@...el.com>,
<alexander.usyskin@...el.com>, <gregkh@...uxfoundation.org>, <jgg@...dia.com>
Subject: Re: [PATCH v3 08/10] drm/xe/xe_late_bind_fw: Introduce debug fs node
to disable late binding
On 23-06-2025 21:07, Daniele Ceraolo Spurio wrote:
>
>
> On 6/18/2025 11:51 PM, Nilawar, Badal wrote:
>>
>> On 19-06-2025 02:49, Daniele Ceraolo Spurio wrote:
>>>
>>>
>>> On 6/18/2025 12:00 PM, Badal Nilawar wrote:
>>>> Introduce a debug filesystem node to disable late binding fw reload
>>>> during the system or runtime resume. This is intended for situations
>>>> where the late binding fw needs to be loaded from user mode.
>>>
>>> You haven't replied to my question on the previous rev in regards to
>>> the expected use-case here.
>>> Is this for testing only, or something an actual user might want to
>>> do? If we only need this for testing, please specify so.
>>
>> Apologies for the oversight. Yes, this is only necessary for testing
>> the binary before releasing it for up-streaming. There is internal
>> tool which uses IGSC lib to download the binary. To avoid clash
>> between the binaries, this debug fs node is provided.
>>
>>>
>>> Also, what happens if we suspend with a user-loaded binary?
>>> userspace doesn't have visibility to know that they have to re-load
>>> their binary.
>>
>> If the device enters D3 cold state, the binary needs to be reloaded.
>> However, the kernel mode driver (KMD) does not have control over
>> binaries downloaded via the IGSC library.
>> If needed D3 cold can be disabled from BIOS or by setting up
>> vram_threshold = 0.
>
> I'm confused. Whatever the tool writes is lost on d3cold entry, does
> it make any difference to the tester if we revert to the default
> values or to the kernel-loaded table? It's still not what they've
> written. It sounds to me more like what you need is to block D3cold
> (and potentially S2/S3) when the UMD tool is used so that the
> userspace-provided table is not lost.
>
> Otherwise, we could add a modparam to override the binary like we have
> for the other firmwares and test anything new that way.
In user space late binding flow xe kmd doesn't need to participate. In
System suspend flow we will not know whether it will go to D3Cold or
not, so we will simply reload the binary during resume. Meanwhile as
discussed offline I will document that config will be lost on D3Cold entry.
Regards,
Badal
>
> Daniele
>
>>
>> Regards,
>> Badal
>>
>>> Daniele
>>>
>>>>
>>>> v2:
>>>> -s/(uval == 1) ? true : false/!!uval/ (Daniele)
>>>>
>>>> Signed-off-by: Badal Nilawar <badal.nilawar@...el.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_debugfs.c | 41
>>>> ++++++++++++++++++++++
>>>> drivers/gpu/drm/xe/xe_late_bind_fw.c | 3 ++
>>>> drivers/gpu/drm/xe/xe_late_bind_fw_types.h | 3 ++
>>>> 3 files changed, 47 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c
>>>> b/drivers/gpu/drm/xe/xe_debugfs.c
>>>> index d83cd6ed3fa8..d1f6f556efa2 100644
>>>> --- a/drivers/gpu/drm/xe/xe_debugfs.c
>>>> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
>>>> @@ -226,6 +226,44 @@ static const struct file_operations
>>>> atomic_svm_timeslice_ms_fops = {
>>>> .write = atomic_svm_timeslice_ms_set,
>>>> };
>>>> +static ssize_t disable_late_binding_show(struct file *f, char
>>>> __user *ubuf,
>>>> + size_t size, loff_t *pos)
>>>> +{
>>>> + struct xe_device *xe = file_inode(f)->i_private;
>>>> + struct xe_late_bind *late_bind = &xe->late_bind;
>>>> + char buf[32];
>>>> + int len;
>>>> +
>>>> + len = scnprintf(buf, sizeof(buf), "%d\n", late_bind->disable);
>>>> +
>>>> + return simple_read_from_buffer(ubuf, size, pos, buf, len);
>>>> +}
>>>> +
>>>> +static ssize_t disable_late_binding_set(struct file *f, const char
>>>> __user *ubuf,
>>>> + size_t size, loff_t *pos)
>>>> +{
>>>> + struct xe_device *xe = file_inode(f)->i_private;
>>>> + struct xe_late_bind *late_bind = &xe->late_bind;
>>>> + u32 uval;
>>>> + ssize_t ret;
>>>> +
>>>> + ret = kstrtouint_from_user(ubuf, size, sizeof(uval), &uval);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + if (uval > 1)
>>>> + return -EINVAL;
>>>> +
>>>> + late_bind->disable = !!uval;
>>>> + return size;
>>>> +}
>>>> +
>>>> +static const struct file_operations disable_late_binding_fops = {
>>>> + .owner = THIS_MODULE,
>>>> + .read = disable_late_binding_show,
>>>> + .write = disable_late_binding_set,
>>>> +};
>>>> +
>>>> void xe_debugfs_register(struct xe_device *xe)
>>>> {
>>>> struct ttm_device *bdev = &xe->ttm;
>>>> @@ -249,6 +287,9 @@ void xe_debugfs_register(struct xe_device *xe)
>>>> debugfs_create_file("atomic_svm_timeslice_ms", 0600, root, xe,
>>>> &atomic_svm_timeslice_ms_fops);
>>>> + debugfs_create_file("disable_late_binding", 0600, root, xe,
>>>> + &disable_late_binding_fops);
>>>> +
>>>> for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1;
>>>> ++mem_type) {
>>>> man = ttm_manager_type(bdev, mem_type);
>>>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>>> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>>> index c0be9611c73b..001e526e569a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>>>> @@ -129,6 +129,9 @@ int xe_late_bind_fw_load(struct xe_late_bind
>>>> *late_bind)
>>>> if (!late_bind->component_added)
>>>> return -EINVAL;
>>>> + if (late_bind->disable)
>>>> + return 0;
>>>> +
>>>> for (fw_id = 0; fw_id < MAX_FW_ID; fw_id++) {
>>>> lbfw = &late_bind->late_bind_fw[fw_id];
>>>> if (lbfw->valid)
>>>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>>> b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>>> index d256f53d59e6..f79f0c0b2c4a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>>> @@ -71,6 +71,9 @@ struct xe_late_bind {
>>>> struct xe_late_bind_fw late_bind_fw[MAX_FW_ID];
>>>> /** @late_bind.wq: workqueue to submit request to download
>>>> late bind blob */
>>>> struct workqueue_struct *wq;
>>>> +
>>>> + /** @late_bind.disable to block late binding reload during pm
>>>> resume flow*/
>>>> + bool disable;
>>>> };
>>>> #endif
>>>
>
Powered by blists - more mailing lists