lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5ab44ff4-2b69-2c7b-9974-d86919d79346@amd.com>
Date: Wed, 14 Aug 2024 14:59:38 -0700
From: Lizhi Hou <lizhi.hou@....com>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>, <ogabbay@...nel.org>,
	<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
	<sonal.santan@....com>, <king.tam@....com>, Narendra Gutta
	<VenkataNarendraKumar.Gutta@....com>, George Yang <George.Yang@....com>
Subject: Re: [PATCH V2 01/10] accel/amdxdna: Add a new driver for AMD AI
 Engine


On 8/14/24 14:53, Jeffrey Hugo wrote:
> On 8/14/2024 2:24 PM, Lizhi Hou wrote:
>>
>> On 8/14/24 11:46, Jeffrey Hugo wrote:
>>> On 8/14/2024 12:16 PM, Lizhi Hou wrote:
>>>>
>>>> On 8/9/24 09:11, Jeffrey Hugo wrote:
>>>>> On 8/5/2024 11:39 AM, Lizhi Hou wrote:
>>>>>> diff --git a/drivers/accel/amdxdna/aie2_pci.c 
>>>>>> b/drivers/accel/amdxdna/aie2_pci.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..3660967c00e6
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/accel/amdxdna/aie2_pci.c
>>>>>> @@ -0,0 +1,182 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>> +/*
>>>>>> + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc.
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/amd-iommu.h>
>>>>>> +#include <linux/errno.h>
>>>>>> +#include <linux/firmware.h>
>>>>>> +#include <linux/iommu.h>
>>>>>
>>>>> You are clearly missing linux/pci.h and I suspect many more.
>>>> Other headers are indirectly included by "aie2_pci.h" underneath.
>>>
>>> aie2_pci.h also does not directly include linux/pci.h
>>
>> it is aie2_pci.h --> amdxdna_pci_drv.h --> linux/pci.h.
>>
>> It compiles without any issue.
>
> I did not doubt that it compiled.  To be clear, I'm pointing out what 
> I believe is poor style - relying on implicit includes.
>
> There is no reason that amdxdna_pci_drv.h needs to include linux/pci.h 
> (at-least in this patch).  That header file doesn't use any of the 
> content from pci.h.  If this were merged, I suspect it would be valid 
> for someone to post a cleanup patch that removes pci.h from 
> amdxdna_pci_drv.h.
>
> The problem comes when someone does a tree wide refactor of some 
> header - perhaps moving a function out of pci.h into something else.  
> If they grep the source tree, they'll find amdxdna_pci_drv.h includes 
> pci.h but really doesn't use it.  They likely won't see aie2_pci.c 
> which may break because of the refactor.  Even more problematic is if 
> pci.h is including something that you need, and you are not including 
> it anywhere. The code will still compile, but maybe in the next kernel 
> cycle pci.h no longer includes that thing.  Your code will break.
>
> The 4 includes you have here seems entirely too little, and I'm not 
> clearly seeing the logic of what gets explicitly included vs what is 
> implicitly included.  firmware.h is explicitly included, but pci.h is 
> not, yet it seems like you use a lot more from pci.h.
>
> There is the include what you use project that attempts to automate 
> this, although I don't know how well it works with kernel code - 
> https://github.com/include-what-you-use/include-what-you-use

Ok. got your point and I will cleanup include.


Lizhi

>
>>
>>>
>>>>>> +
>>>>>> +    ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
>>>>>> +    if (ret) {
>>>>>> +        XDNA_ERR(xdna, "Failed to set DMA mask: %d", ret);
>>>>>> +        goto release_fw;
>>>>>> +    }
>>>>>> +
>>>>>> +    nvec = pci_msix_vec_count(pdev);
>>>>>
>>>>> This feels weird.  Can your device advertise variable number of 
>>>>> MSI-X vectors?  It only works if all of the vectors are used?
>>>> That is possible. the driver supports different hardware. And the 
>>>> fw assigns vector for hardware context dynamically. So the driver 
>>>> needs to allocate all vectors ahead.
>>>
>>> So, if the device requests N MSIs, but the host is only able to 
>>> satisfy 1 (or some number less than N), the fw is completely unable 
>>> to function?
>> The fw may return interrupt 2 is assigned to hardware context. Then 
>> the driver may not deal with it in this case. I think it is ok to 
>> fail if the system has very limited resource.
>
> Ok.  I suspect you'll want to change that behavior in the future with 
> a fw update, but if this is how things work today, then this is how 
> the driver must be.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ