lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8f4d543a-0b6d-65ca-59ef-70370fbfcb12@sangfor.com.cn>
Date:   Wed, 3 May 2023 21:31:42 +0800
From:   Ding Hui <dinghui@...gfor.com.cn>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     dinghui@...gfor.com.cn, bhelgaas@...gle.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, vidyas@...dia.com,
        david.e.box@...ux.intel.com, kai.heng.feng@...onical.com,
        michael.a.bottini@...ux.intel.com, rajatja@...gle.com,
        qinzongquan@...gfor.com.cn, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI/ASPM: fix UAF by removing cached downstream

On 2023/5/2 12:52 上午, Bjorn Helgaas wrote:
> On Mon, May 01, 2023 at 11:50:50AM +0800, Ding Hui wrote:
>> On 2023/5/1 10:10, Bjorn Helgaas wrote:
>>> On Sat, Apr 29, 2023 at 09:26:04PM +0800, Ding Hui wrote:
>>>> If the function 0 of a multifunction device is removed, an freed
>>>> downstream pointer will be left in struct pcie_link_state, and then
>>>> when pcie_config_aspm_link() be invoked from any path, we will get a
>>>> KASAN use-after-free report.
>>>
>>> Thanks for finding this problem, debugging it, and the patch!
>>>
>>> In this case we're doing a "software remove" and the other functions
>>> are still present, right?  It's kind of annoying that there's only one
>>> link, but all the functions of a multifunction device have a Link
>>> Control register, and the spec "recommends" that software program the
>>> same ASPM control value for all the functions.
>>
>> Yes, that is the case.
>>
>>> The hardware of course doesn't know anything about this software
>>> remove; all the functions are still physically present and powered up.
>>>
>>> That makes me think that if software ignores the "removed" function
>>> and continues to operate ASPM on the N-1 remaining functions, we're
>>> outside the spec recommendations because the ASPM configuration is no
>>> longer the same across all the functions.
>>>
>>> So my inclination would be disable ASPM completely when any function
>>> of a multi-function device is removed.  What are your thoughts on
>>> this?
>>
>> Agree with you.
>>
>> Previously, I thought another fix that was if the function 0 is removed,
>> we can free the link state to disable ASPM for this link.
>>
>> Now following you suggestion, it can be expanded to any child function.
>>
>> How about fixing like this?
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index 66d7514ca111..657e0647d19f 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -1011,12 +1011,11 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
>>   	down_read(&pci_bus_sem);
>>   	mutex_lock(&aspm_lock);
>>   	/*
>> -	 * All PCIe functions are in one slot, remove one function will remove
>> -	 * the whole slot, so just wait until we are the last function left.
>> +	 * All PCIe functions are in one slot.
>> +	 * The spec "recommends" that software program set the same ASPM control
>> +	 * value for all the functions.
>> +	 * Disable ASPM when any child function is removed.
> 
> Since we're updating the comment anyway, let's clean up the "slot"
> language here.  The PCIe spec doesn't use "slot" in the context of the
> bus/device/function PCIe topology; it only uses it when referring to a
> physical connector where a card might be installed.  What we want here
> is "Device," and then we have to consider whether ARI makes any
> difference here.
> 
> The spec says (referring to ASPM Control):
> 
>    For Multi-Function Devices (including ARI Devices), it is
>    recommended that software program the same value for this field in
>    all Functions. For non-ARI Multi-Function Devices, only capabilities
>    enabled in all Functions are enabled for the component as a whole.
> 
>    For ARI Devices, ASPM Control is determined solely by the setting in
>    Function 0, regardless of Function 0’s D-state. The settings in the
>    other Functions always return whatever value software programmed for
>    each, but otherwise are ignored by the component.
> 
> A spec reference, e.g., "PCIe r6.0, sec 7.5.3.7", would be good here.
> 
> Anyway, I think the idea of "software removing" a single function is
> kind of a niche situation that we don't need to worry about
> optimizing, and I think turning off ASPM completely will avoid a lot
> of weird corner cases.
> 

Thanks for details.

I'll redo the patch with different title.

>>   	 */
>> -	if (!list_empty(&parent->subordinate->devices))
>> -		goto out;
>> -
>>   	link = parent->link_state;
>>   	root = link->root;
>>   	parent_link = link->parent;
>>
>>
>> -- 
>> Thanks,
>> -dinghui
>>
> 

-- 
Thanks,
-dinghui

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ