Message-ID: <1849523e-fff1-40e3-9d96-ae389cca6078@arm.com>
Date: Fri, 17 Oct 2025 11:46:49 +0200
From: Ketil Johnsen <ketil.johnsen@....com>
To: Boris Brezillon <boris.brezillon@...labora.com>
Cc: Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/panthor: Fix UAF race between device unplug and FW
 event processing

On 09/10/2025 11:45, Boris Brezillon wrote:
> On Wed,  8 Oct 2025 12:53:20 +0200
> Ketil Johnsen <ketil.johnsen@....com> wrote:
> 
>> The function panthor_fw_unplug() will free the FW memory sections.
>> The problem is that there could still be pending FW events which have
>> not yet been handled at this point. In this case,
>> process_fw_events_work() can try to access the freed memory.
>>
>> The fix is to stop FW event processing after IRQs are disabled but before
>> the FW memory is freed.
>>
>> Signed-off-by: Ketil Johnsen <ketil.johnsen@....com>
>> ---
>>   drivers/gpu/drm/panthor/panthor_fw.c    |  3 +++
>>   drivers/gpu/drm/panthor/panthor_sched.c | 12 ++++++++++++
>>   drivers/gpu/drm/panthor/panthor_sched.h |  1 +
>>   3 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
>> index 9bf06e55eaee..4f393c5cd26f 100644
>> --- a/drivers/gpu/drm/panthor/panthor_fw.c
>> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
>> @@ -1172,6 +1172,9 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
>>   		panthor_fw_stop(ptdev);
>>   	}
>>   
>> +	/* Any pending FW event processing must stop before we free FW memory */
>> +	panthor_sched_stop_fw_events(ptdev);
>> +
>>   	list_for_each_entry(section, &ptdev->fw->sections, node)
>>   		panthor_kernel_bo_destroy(section->mem);
>>   
>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
>> index 0cc9055f4ee5..d150c8d99432 100644
>> --- a/drivers/gpu/drm/panthor/panthor_sched.c
>> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
>> @@ -1794,6 +1794,18 @@ void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
>>   	sched_queue_work(ptdev->scheduler, fw_events);
>>   }
>>   
>> +/**
>> + * panthor_sched_stop_fw_events() - Stop processing FW events.
>> + */
>> +void panthor_sched_stop_fw_events(struct panthor_device *ptdev)
>> +{
>> +	if (!ptdev->scheduler)
>> +		return;
>> +
>> +	atomic_set(&ptdev->scheduler->fw_events, 0);
>> +	cancel_work_sync(&ptdev->scheduler->fw_events_work);
>> +}
> 
> Hm, I'd rather have this called from sched_unplug() and then have an
> extra check in panthor_sched_report_fw_events() to bail out if the
> scheduler component is no longer functional. This way this helper stays
> private to panthor_sched.c.

A heads up on this from me:

I found a new race in the driver, somewhat similar to this one, while
trying your suggested approach. Simply put, panthor_device_unplug()
calls drm_dev_unplug() at a time when panthor_device_suspend() could
still be running. This causes the suspend routine to skip a lot of
work, for instance synchronizing with running IRQ handlers, and boom:
the IRQ handlers access a powered-off GPU.

I will (most likely) push a v2 with both races fixed, as they both
relate to interrupt processing while the GPU is off.
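
For illustration only, the shape of the approach suggested above (stop
event processing from the scheduler's own unplug path, and have the
report function bail out once the scheduler is no longer functional)
can be sketched as a toy userspace model. This is not panthor code: the
toy_* names and the "unplugged" flag are assumptions made up for the
example, the model is single-threaded, and it does not reproduce the
waiting semantics of cancel_work_sync().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler state; not the panthor structs. */
struct toy_sched {
	atomic_bool unplugged;   /* set once the scheduler is torn down */
	atomic_uint fw_events;   /* pending event bits */
	atomic_bool work_queued; /* stands in for a queued work item */
};

/* Returns true if the events were accepted and work was queued. */
static bool toy_report_fw_events(struct toy_sched *s, uint32_t events)
{
	/* Bail out if the scheduler component is no longer functional. */
	if (atomic_load(&s->unplugged))
		return false;

	atomic_fetch_or(&s->fw_events, events);
	atomic_store(&s->work_queued, true);
	return true;
}

/* Teardown: refuse new events, then drop whatever is still pending. */
static void toy_sched_unplug(struct toy_sched *s)
{
	atomic_store(&s->unplugged, true);
	atomic_store(&s->fw_events, 0);
	/* The kernel would wait for running work here; the toy only clears. */
	atomic_store(&s->work_queued, false);
}
```

In this model, events reported before toy_sched_unplug() are accepted,
and events reported afterwards are rejected, so nothing is left pending
by the time the FW memory would be freed. In the real driver the
ordering against the IRQ handlers still matters, which the toy does not
capture.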

> 
>> +
>>   static const char *fence_get_driver_name(struct dma_fence *fence)
>>   {
>>   	return "panthor";
>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.h b/drivers/gpu/drm/panthor/panthor_sched.h
>> index f4a475aa34c0..4393599ed330 100644
>> --- a/drivers/gpu/drm/panthor/panthor_sched.h
>> +++ b/drivers/gpu/drm/panthor/panthor_sched.h
>> @@ -51,6 +51,7 @@ void panthor_sched_resume(struct panthor_device *ptdev);
>>   
>>   void panthor_sched_report_mmu_fault(struct panthor_device *ptdev);
>>   void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events);
>> +void panthor_sched_stop_fw_events(struct panthor_device *ptdev);
>>   
>>   void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile);
>>   
> 

