Message-ID:
<DM6PR12MB431332A6407547B225849F88BDAD2@DM6PR12MB4313.namprd12.prod.outlook.com>
Date: Mon, 31 Mar 2025 19:29:32 +0000
From: Sean Hefty <shefty@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Bernard Metzler <BMT@...ich.ibm.com>, Roland Dreier
<roland@...abrica.net>, Nikolay Aleksandrov <nikolay@...abrica.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "shrijeet@...abrica.net"
<shrijeet@...abrica.net>, "alex.badea@...sight.com"
<alex.badea@...sight.com>, "eric.davis@...adcom.com"
<eric.davis@...adcom.com>, "rip.sohan@....com" <rip.sohan@....com>,
"dsahern@...nel.org" <dsahern@...nel.org>, "winston.liu@...sight.com"
<winston.liu@...sight.com>, "dan.mihailescu@...sight.com"
<dan.mihailescu@...sight.com>, Kamal Heib <kheib@...hat.com>,
"parth.v.parikh@...sight.com" <parth.v.parikh@...sight.com>, Dave Miller
<davem@...hat.com>, "ian.ziemba@....com" <ian.ziemba@....com>,
"andrew.tauferner@...nelisnetworks.com"
<andrew.tauferner@...nelisnetworks.com>, "welch@....com" <welch@....com>,
"rakhahari.bhunia@...sight.com" <rakhahari.bhunia@...sight.com>,
"kingshuk.mandal@...sight.com" <kingshuk.mandal@...sight.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, "kuba@...nel.org"
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: RE: [RFC PATCH 00/13] Ultra Ethernet driver introduction
> > I have a proposal to rework/redefine PDs to support a more general
> > model,
>
> It would certainly be good to have some text explaining some of the mappings
> to different technologies.
>
> > which I think will work for NICs that
> > need a PD and ones that don't. It can support MR -> PD -> Job, but I
> > considered the PD -> job relationship as 1 to many.
>
> Yes, and the 1:1 is degenerate.
>
> > Sure, it's challenging in that a UET endpoint (QP) may communicate
> > with multiple jobs, and a MR may be accessible by a single job, all
> > jobs, or only a few.
>
> I would suggest that the PD is a superset of all jobs and the objects (endpoint,
> mr, etc) get to choose a subset of the PD's jobs during allocation?
>
> Or you keep job/pd as 1:1 and allow specifying multiple PDs during object
> allocation.
>
> But to be clear, this is largely verbs modeling stuff - however there is a certain
> practicality to trying to fit this multi-job ability into a PD because it allows
> reusing a lot of existing uAPI kernel code.
>
> Especially if people are going to take existing RDMA HW and tweak it to some
> level of UET (ie support only single job) and still require a HW level PD under
> the covers.
Yes. I'm trying to ensure that the existing RDMA model continues to work while also supporting NICs/transports that implement the equivalent security model at the QP (endpoint) level, reusing the PD for both.
Specifically, I want to *allow* separating the different functions that a single PD provides into separate PDs. Those functions are page mapping (registration), local (lkey) access, and remote (rkey) access. The RDMA model limits a QP to a single PD for all three. To support job-based transports, I propose allowing a QP to use one PD for local access (the PD specified at QP creation) and multiple PDs for remote access. Each PD used for remote access would correspond to a different job.
Note: a NIC may limit a QP to a single job and require that the local and remote PD be the same (i.e., one PD per QP), so the existing RDMA model still fits.
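
To make the local/remote split concrete, here is a minimal C sketch of a QP that holds one PD for local access and a bounded set of PDs for remote access, one per job. Every name in it (`sketch_pd`, `sketch_dev_caps`, `sketch_qp`, and the functions) is hypothetical and exists only for illustration; none of it is an existing verbs or kernel interface, and the capability flag is an assumed way to express the single-PD restriction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_REMOTE_PDS 8

/* Hypothetical stand-ins; not existing verbs or kernel types. */
struct sketch_pd {
    unsigned int job_id;    /* job this PD stands for when used for remote access */
};

struct sketch_dev_caps {
    bool single_pd_only;    /* classic RDMA: remote PD must equal the local PD */
    size_t max_remote_pds;  /* device limit on remote-access PDs per QP */
};

struct sketch_qp {
    const struct sketch_dev_caps *caps;
    const struct sketch_pd *local_pd;   /* given at QP creation, used for lkey checks */
    const struct sketch_pd *remote_pds[SKETCH_MAX_REMOTE_PDS];
    size_t num_remote_pds;
};

static void sketch_qp_init(struct sketch_qp *qp,
                           const struct sketch_dev_caps *caps,
                           const struct sketch_pd *local_pd)
{
    assert(qp != NULL && caps != NULL && local_pd != NULL);
    qp->caps = caps;
    qp->local_pd = local_pd;
    qp->num_remote_pds = 0;
}

/* Attach a PD for remote access. Returns 0 on success, -1 if the device rules reject it. */
static int sketch_qp_add_remote_pd(struct sketch_qp *qp, const struct sketch_pd *pd)
{
    size_t limit = qp->caps->max_remote_pds;

    if (limit > SKETCH_MAX_REMOTE_PDS)
        limit = SKETCH_MAX_REMOTE_PDS;
    /* Degenerate case: one PD per QP, shared by local and remote access. */
    if (qp->caps->single_pd_only &&
        (pd != qp->local_pd || qp->num_remote_pds >= 1))
        return -1;
    if (qp->num_remote_pds >= limit)
        return -1;
    qp->remote_pds[qp->num_remote_pds++] = pd;
    return 0;
}

/* Remote (rkey-style) check: is this job among the QP's remote PDs? */
static bool sketch_qp_allows_job(const struct sketch_qp *qp, unsigned int job_id)
{
    for (size_t i = 0; i < qp->num_remote_pds; i++)
        if (qp->remote_pds[i]->job_id == job_id)
            return true;
    return false;
}
```

With `single_pd_only` set, the only PD the QP accepts for remote access is its local PD, which is the existing RDMA behavior; with it cleared, each attached PD admits one more job.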
As an optimization, registration can be a separate function, so that the same page mapping can be reused across different jobs as they start and end. This requires some ability to import an MR from one PD into another. It is probably just an optimization and not required for a job model.
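
One way to picture the import idea is a page mapping with a user count, where each per-PD MR is a thin handle onto the shared mapping. The C sketch below is only an illustration under that assumption: `sketch_mapping`, `sketch_mr`, and the three functions are hypothetical names, and real pinning, key generation, and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: the pinned page mapping, shared by every MR that references it. */
struct sketch_mapping {
    void *addr;
    size_t len;
    unsigned int users;     /* number of MRs currently referencing the mapping */
};

/* Hypothetical: a per-PD view of a mapping, carrying its own key. */
struct sketch_mr {
    struct sketch_mapping *map;
    unsigned int pd_id;
    unsigned int rkey;
};

/* Register a mapping under a PD; the mapping gains one user. */
static struct sketch_mr sketch_mr_reg(struct sketch_mapping *map,
                                      unsigned int pd_id, unsigned int rkey)
{
    struct sketch_mr mr = { map, pd_id, rkey };

    assert(map != NULL);
    map->users++;
    return mr;
}

/* Import an existing MR into another PD without mapping the pages again. */
static struct sketch_mr sketch_mr_import(const struct sketch_mr *src,
                                         unsigned int dst_pd_id, unsigned int rkey)
{
    return sketch_mr_reg(src->map, dst_pd_id, rkey);
}

/* Drop one MR. Returns true when no users remain and the pages could be unmapped. */
static bool sketch_mr_dereg(struct sketch_mr *mr)
{
    struct sketch_mapping *map = mr->map;

    assert(map != NULL && map->users > 0);
    mr->map = NULL;
    return --map->users == 0;
}
```

In this model a job that ends deregisters only its own handle; the mapping survives as long as another PD still uses it, which is what lets the registration cost be paid once across jobs.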
I was still envisioning a job manager allocating device-specific resources for a job and sharing those with the local processes. That is, it shares a set of fds, each associated with a device, which restricts the job to those devices. A job may also have device-specific resource limits or allocations (a limit on the number of MRs, specific endpoint addresses, etc.). A global job object could work, but a subsequent user-to-device flow would need to access and translate the global object. Either way, there are uABI requirements.
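
The per-device variant could be modeled roughly as follows. This is a sketch under assumptions, not a proposed uABI: `sketch_job`, `sketch_job_dev`, and both functions are hypothetical, the fds are plain integers standing in for whatever the job manager would actually hand out, and the MR count is the only limit shown.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_JOB_MAX_DEVS 4

/* Hypothetical per-device job context, as handed out by a job manager. */
struct sketch_job_dev {
    int dev_fd;                 /* fd tying the job to one device */
    unsigned int max_mrs;       /* device-specific limit for this job */
    unsigned int mrs_in_use;
};

/* Hypothetical job: usable only on the devices it holds a context for. */
struct sketch_job {
    unsigned int job_id;
    struct sketch_job_dev devs[SKETCH_JOB_MAX_DEVS];
    size_t num_devs;
};

/* Look up the job's context for a device; NULL means the job may not use it. */
static struct sketch_job_dev *sketch_job_find_dev(struct sketch_job *job, int dev_fd)
{
    assert(job != NULL && job->num_devs <= SKETCH_JOB_MAX_DEVS);
    for (size_t i = 0; i < job->num_devs; i++)
        if (job->devs[i].dev_fd == dev_fd)
            return &job->devs[i];
    return NULL;
}

/*
 * Account for one MR on a device.
 * Returns 0 on success, -1 if the job has no context for the device,
 * -2 if the job's MR limit on that device is exhausted.
 */
static int sketch_job_charge_mr(struct sketch_job *job, int dev_fd)
{
    struct sketch_job_dev *dev = sketch_job_find_dev(job, dev_fd);

    if (dev == NULL)
        return -1;
    if (dev->mrs_in_use >= dev->max_mrs)
        return -2;
    dev->mrs_in_use++;
    return 0;
}
```

A global job object would replace the per-device array with a single handle, at the cost of a lookup and translation step on each device when the job is first used there.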
- Sean