[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240717123337.GI5630@unreal>
Date: Wed, 17 Jul 2024 15:33:37 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Omer Shpigelman <oshpigelman@...ana.ai>
Cc: Jason Gunthorpe <jgg@...pe.ca>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"ogabbay@...nel.org" <ogabbay@...nel.org>,
Zvika Yehudai <zyehudai@...ana.ai>
Subject: Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
On Wed, Jul 17, 2024 at 10:51:03AM +0000, Omer Shpigelman wrote:
> On 7/17/24 10:36, Leon Romanovsky wrote:
> > On Wed, Jul 17, 2024 at 07:08:59AM +0000, Omer Shpigelman wrote:
> >> On 7/16/24 16:40, Jason Gunthorpe wrote:
> >>> On Sun, Jul 14, 2024 at 10:18:12AM +0000, Omer Shpigelman wrote:
> >>>> On 7/12/24 16:08, Jason Gunthorpe wrote:
> >>>>> [You don't often get email from jgg@...pe.ca. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> >>>>>
> >>>>> On Fri, Jun 28, 2024 at 10:24:32AM +0000, Omer Shpigelman wrote:
> >>>>>
> >>>>>> We need the core driver to access the IB driver (and to the ETH driver as
> >>>>>> well). As you wrote, we can't use exported symbols from our IB driver nor
> >>>>>> rely on function pointers, but what about providing the core driver an ops
> >>>>>> structure? meaning exporting a register function from the core driver that
> >>>>>> should be called by the IB driver during auxiliary device probe.
> >>>>>> Something like:
> >>>>>>
> >>>>>> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev,
> >>>>>> struct hbl_ib_ops *ops)
> >>>>>> {
> >>>>>> ...
> >>>>>> }
> >>>>>> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev);
> >>>>>
> >>>>> Definately do not do some kind of double-register like this.
> >>>>>
> >>>>> The auxiliary_device scheme can already be extended to provide ops for
> >>>>> each sub device.
> >>>>>
> >>>>> Like
> >>>>>
> >>>>> struct habana_driver {
> >>>>> struct auxiliary_driver base;
> >>>>> const struct habana_ops *ops;
> >>>>> };
> >>>>>
> >>>>> If the ops are justified or not is a different question.
> >>>>>
> >>>>
> >>>> Well, I suggested this double-register option because I got a comment that
> >>>> the design pattern of embedded ops structure shouldn't be used.
> >>>> So I'm confused now...
> >>>
> >>> Yeah, don't stick ops in random places, but the device_driver is the
> >>> right place.
> >>>
> >>
> >> Sorry, let me explain again. My original code has an ops structure
> >> exactly like you are suggesting now (see struct hbl_aux_dev in the first
> >> patch of the series). But I was instructed not to use this ops structure
> >> and to rely on exported symbols for inter-driver communication.
> >> I'll be happy to use this ops structure like in your example rather than
> >> converting my code to use exported symbols.
> >> Leon - am I missing anything? what's the verdict here?
> >
> > You are missing the main sentence from Jason's response: "don't stick ops in random places".
> >
> > It is fine to have ops in device driver, so the core driver can call them. However, in your
> > original code, you added ops everywhere. It caused to the need to implement module reference
> > counting and crazy stuff like calls to lock and unlock functions from the aux driver to the core.
> >
> > Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so the aux driver can call
> > them directly and enjoy from proper module loading and unloading.
> >
> > The aux driver can have ops in the device driver, so the core driver can call them to perform something
> > specific for that aux driver.
> >
> > Calls between aux drivers should be done via the core driver.
> >
> > Thanks
>
> The only place we have an ops structure is in the device driver,
> similarly to Jason's example. In our code it is struct hbl_aux_dev. What
> other random places did you see?
This is exactly random place.
I suggest you to take time, learn how existing drivers in netdev and
RDMA uses auxbus infrastructure and follow the same pattern. There are
many examples already in the kernel.
And no, if you do everything right, you won't need custom module
reference counting and other hacks. There is nothing special in your
device/driver which requires special treatment.
Thanks
Powered by blists - more mailing lists