Message-ID: <91d6756d-8a0d-d230-7deb-3c6d6090f746@gmail.com>
Date: Thu, 10 Sep 2020 14:23:44 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Oded Gabbay <oded.gabbay@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
netdev@...r.kernel.org, SW_Drivers <SW_Drivers@...ana.ai>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH 00/15] Adding GAUDI NIC code to habanalabs driver

On 9/10/2020 2:15 PM, Oded Gabbay wrote:
> On Fri, Sep 11, 2020 at 12:05 AM Florian Fainelli <f.fainelli@...il.com> wrote:
>>
>>
>>
>> On 9/10/2020 1:32 PM, Oded Gabbay wrote:
>>> On Thu, Sep 10, 2020 at 11:28 PM Jakub Kicinski <kuba@...nel.org> wrote:
>>>>
>>>> On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote:
>>>>> On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@...nel.org> wrote:
>>>>>> On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote:
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c
>>>>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h
>>>>>>> create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h
>>>>>>
>>>>>> The relevant code needs to live under drivers/net/(ethernet/).
>>>>>> For one thing our automation won't trigger for drivers in random
>>>>>> (/misc) part of the tree.
>>>>>
>>>>> Can you please elaborate on how to do this with a single driver that
>>>>> is already in misc ?
>>>>> As I mentioned in the cover letter, we are not developing a
>>>>> stand-alone NIC. We have a deep-learning accelerator with a NIC
>>>>> interface.
>>>>> Therefore, we don't have a separate PCI physical function for the NIC
>>>>> and I can't have a second driver registering to it.
>>>>
>>>> Is it not possible to move the files and still build them into a single
>>>> module?
>>> hmm...
>>> I actually didn't try that as I thought it will be very strange and
>>> I'm not familiar with other drivers that build as a single ko but have
>>> files spread out in different subsystems.
>>> I don't feel it is a better option than what we did here.
>>>
>>> Will I need to split pull requests to different subsystem maintainers
>>> ? For the same driver ?
>>> Sounds to me this is not going to fly.
>>
>> Not necessarily, you can post your patches to all relevant lists and
>> seek maintainer review/acked-by tags from the relevant maintainers. This
>> is not unheard of with mlx5 for instance.
> Yeah, I see what you are saying, the problem is that sometimes,
> because everything is tightly integrated in our SOC, the patches
> contain code from common code (common to ALL our ASICs, even those who
> don't have NIC at all), GAUDI specific code which is not NIC related
> and the NIC code itself.
> But I guess that as a last resort if this is a *must* I can do that.
> Though I would like to hear Greg's opinion on this as he is my current
> maintainer.
>
> Personally I do want to send relevant patches to netdev because I want
> to get your expert reviews on them, but still keep the code in a
> single location.

We do have network drivers sprinkled across the kernel tree already, but
I would agree that from a networking maintainer's perspective this makes
auditing code harder: you would naturally grep for net/ and drivers/net
and easily miss arch/um/ for instance. When you do treewide changes,
having all your ducklings in the same pond is a lot easier.

There is a possible "risk" with posting a patch series for the
habanalabs driver to netdev: people will wonder what this is about and
completely miss that it is about the networking bits. If there is a
NIC driver under drivers/net then people will start to filter or pay
attention based on the directory.

>
>>
>> Have you considered using notifiers to get your NIC driver registered
>> while the NIC code lives in a different module?
> Yes, and I preferred to keep it simple. I didn't want to start sending
> notifications to the NIC driver every time, for example, I needed to
> reset the SOC because a compute engine got stuck. Or vice-versa - when
> some error happened in the NIC to start sending notifications to the
> common driver.
>
> In addition, from my AMD days, we had a very tough time managing two
> drivers that "talk" to each other and manage the same H/W. I'm talking
> about amdgpu for graphics and amdkfd for compute (which I was the
> maintainer). AMD is working in the past years to unite those two
> drivers to get out of that mess. That's why I didn't want to go down
> that road.

With notifiers you are trading a direct call for an indirect one. That
does provide a nice interface, but it can be challenging to work with,
because the context in which the notifier is called can be problematic.
You could still have direct module references between the two modules,
and that would avoid the need for notifiers.
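
To make the trade-off concrete, here is a minimal userspace sketch in C,
not kernel code. It models a notifier chain as a linked list of callbacks
and contrasts it with a direct call into the NIC side. Every name in it
(soc_event, chain_register, nic_handle_reset, and so on) is made up for
illustration; the real in-kernel API is declared in <linux/notifier.h>
and differs in detail, notably in locking and return codes.

```c
#include <stddef.h>

/* Hypothetical event code the common driver would broadcast. */
enum soc_event { SOC_EVENT_RESET = 1 };

/* Userspace stand-in for a notifier block: a callback plus a link. */
struct notifier {
	int (*call)(struct notifier *self, unsigned long event, void *data);
	struct notifier *next;
};

static struct notifier *chain_head;

/* Add a subscriber at the head of the chain. */
static void chain_register(struct notifier *n)
{
	n->next = chain_head;
	chain_head = n;
}

/* Walk the chain; returns how many subscribers handled the event. */
static int chain_call(unsigned long event, void *data)
{
	int handled = 0;

	for (struct notifier *n = chain_head; n; n = n->next)
		handled += n->call(n, event, data);
	return handled;
}

/* "NIC module" side: what must happen when the SoC is reset. */
static int nic_resets_seen;

static int nic_handle_reset(void)
{
	nic_resets_seen++;
	return 1;
}

/* Callback the NIC side registers; it ignores events it does not know. */
static int nic_notifier_cb(struct notifier *self, unsigned long event,
			   void *data)
{
	(void)self;
	(void)data;
	return event == SOC_EVENT_RESET ? nic_handle_reset() : 0;
}

static struct notifier nic_notifier = { .call = nic_notifier_cb };

/* Option A: the common code broadcasts and never names the NIC code. */
static int soc_reset_via_notifier(void)
{
	return chain_call(SOC_EVENT_RESET, NULL);
}

/* Option B: the common code references the NIC symbol directly. */
static int soc_reset_direct(void)
{
	return nic_handle_reset();
}
```

In this sketch option A only reaches the NIC code after
chain_register(&nic_notifier) has run, and the callback executes in
whatever context the caller of chain_call() happens to be in. Option B
has no registration step, but it ties the common code to the NIC symbol
at link time, which is roughly what a direct module reference gives you.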

You are the driver maintainer, so you definitely have a bigger say in
the matter than most of us drive-by contributors.
--
Florian