Message-ID:
 <DM6PR12MB431332A6407547B225849F88BDAD2@DM6PR12MB4313.namprd12.prod.outlook.com>
Date: Mon, 31 Mar 2025 19:29:32 +0000
From: Sean Hefty <shefty@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Bernard Metzler <BMT@...ich.ibm.com>, Roland Dreier
	<roland@...abrica.net>, Nikolay Aleksandrov <nikolay@...abrica.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "shrijeet@...abrica.net"
	<shrijeet@...abrica.net>, "alex.badea@...sight.com"
	<alex.badea@...sight.com>, "eric.davis@...adcom.com"
	<eric.davis@...adcom.com>, "rip.sohan@....com" <rip.sohan@....com>,
	"dsahern@...nel.org" <dsahern@...nel.org>, "winston.liu@...sight.com"
	<winston.liu@...sight.com>, "dan.mihailescu@...sight.com"
	<dan.mihailescu@...sight.com>, Kamal Heib <kheib@...hat.com>,
	"parth.v.parikh@...sight.com" <parth.v.parikh@...sight.com>, Dave Miller
	<davem@...hat.com>, "ian.ziemba@....com" <ian.ziemba@....com>,
	"andrew.tauferner@...nelisnetworks.com"
	<andrew.tauferner@...nelisnetworks.com>, "welch@....com" <welch@....com>,
	"rakhahari.bhunia@...sight.com" <rakhahari.bhunia@...sight.com>,
	"kingshuk.mandal@...sight.com" <kingshuk.mandal@...sight.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, "kuba@...nel.org"
	<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: RE: [RFC PATCH 00/13] Ultra Ethernet driver introduction

> >  I have a proposal to rework/redefine PDs to support a more general
> > model,
> 
> It would certainly be good to have some text explaining some of the mappings
> to different technologies.
> 
> > which I think will work for NICs that
> > need a PD and ones that don't.  It can support MR -> PD -> Job, but I
> > considered the PD -> job relationship as 1 to many.
> 
> Yes, and the 1:1 is degenerate.
> 
> > Sure, it's challenging in that a UET endpoint (QP) may communicate
> > with multiple jobs, and a MR may be accessible by a single job, all
> > jobs, or only a few.
> 
> I would suggest that the PD is a superset of all jobs and the objects (endpoint,
> mr, etc) get to choose a subset of the PD's jobs during allocation?
> 
> Or you keep job/pd as 1:1 and allow specifying multiple PDs during object
> allocation.
> 
> But to be clear, this is largely verbs modeling stuff - however there is a certain
> practicality to trying to fit this multi-job ability into a PD because it allows
> reusing a lot of existing uAPI kernel code.
> 
> Especially if people are going to take existing RDMA HW and tweak it to some
> level of UET (ie support only single job) and still require a HW level PD under
> the covers.

Yes, I'm trying to ensure that the existing RDMA model continues to work while also supporting NICs/transports that implement the equivalent security model at the QP (endpoint) level, reusing the PD for both.

Specifically, I want to *allow* separating the different functions that a single PD provides into separate PDs.  The functions are page mapping (registration), local (lkey) access, and remote (rkey) access.  The RDMA model limits a QP to a single PD for all three.  To support job-based transports, I propose allowing a QP to use one PD for local access (specified at QP creation) and multiple PDs for remote access.  Each PD used for remote access would correspond to a different job.

Note: a NIC may limit a QP to a single job and require that the local and remote PDs be the same (i.e. 1 PD per QP).  So the RDMA model still fits.
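
To make the split concrete, here is a minimal sketch in C of a QP that carries one local PD plus a set of remote PDs, one per job. Every name in it (sketch_pd, sketch_qp, single_pd_only, and the functions) is hypothetical and exists only for illustration; none of it is existing verbs or uAPI, and the fixed-size array is an arbitrary simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not existing verbs API. */
#define SKETCH_MAX_REMOTE_PDS 8

struct sketch_pd {
	unsigned int job_id;	/* job this PD represents for remote access */
};

struct sketch_qp {
	const struct sketch_pd *local_pd;	/* fixed at QP creation, lkey checks */
	const struct sketch_pd *remote_pds[SKETCH_MAX_REMOTE_PDS]; /* rkey checks */
	size_t num_remote_pds;
	bool single_pd_only;	/* NIC restriction: remote PD must be the local PD */
};

void sketch_qp_init(struct sketch_qp *qp, const struct sketch_pd *local_pd,
		    bool single_pd_only)
{
	assert(qp && local_pd);
	qp->local_pd = local_pd;
	qp->num_remote_pds = 0;
	qp->single_pd_only = single_pd_only;
}

/* Returns 0 on success (or if already attached), -1 if the PD is rejected. */
int sketch_qp_attach_remote_pd(struct sketch_qp *qp, const struct sketch_pd *pd)
{
	size_t i;

	assert(qp && pd);
	for (i = 0; i < qp->num_remote_pds; i++)
		if (qp->remote_pds[i] == pd)
			return 0;
	/* Degenerate RDMA case: only the local PD may serve remote access. */
	if (qp->single_pd_only && pd != qp->local_pd)
		return -1;
	if (qp->num_remote_pds == SKETCH_MAX_REMOTE_PDS)
		return -1;
	qp->remote_pds[qp->num_remote_pds++] = pd;
	return 0;
}

/* True if an MR belonging to mr_pd may be targeted remotely through qp. */
bool sketch_qp_remote_access_ok(const struct sketch_qp *qp,
				const struct sketch_pd *mr_pd)
{
	size_t i;

	for (i = 0; i < qp->num_remote_pds; i++)
		if (qp->remote_pds[i] == mr_pd)
			return true;
	return false;
}
```

With single_pd_only set, the sketch collapses to the familiar one-PD-per-QP behavior: the only PD that can be attached for remote access is the one given at creation.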

Registration can be a separate function, so that the same page mapping can be reused across different jobs as they start and end.  This requires some ability to import an MR from one PD into another.  It is an optimization, though, and probably not required for a job model.
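
One way to picture the import idea is a reference-counted page mapping shared by several per-PD MR handles. The C sketch below assumes exactly that; the names (sketch_mapping, sketch_mr, sketch_mr_import, and so on) and the plain integer PD identifiers are made up for illustration and do not correspond to any existing kernel or verbs interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct sketch_mapping {
	void *addr;		/* start of the registered (pinned) region */
	size_t len;
	unsigned int refs;	/* number of MRs sharing this mapping */
};

struct sketch_mr {
	struct sketch_mapping *map;	/* shared page mapping */
	unsigned int pd_id;		/* PD (job) this MR handle belongs to */
};

/* Register once: creates the mapping and the first MR that refers to it. */
void sketch_mr_register(struct sketch_mr *mr, struct sketch_mapping *map,
			void *addr, size_t len, unsigned int pd_id)
{
	assert(mr && map);
	map->addr = addr;
	map->len = len;
	map->refs = 1;
	mr->map = map;
	mr->pd_id = pd_id;
}

/* Import: a new MR in another PD that reuses the existing mapping. */
void sketch_mr_import(struct sketch_mr *dst, const struct sketch_mr *src,
		      unsigned int pd_id)
{
	assert(dst && src && src->map && src->map->refs > 0);
	src->map->refs++;
	dst->map = src->map;
	dst->pd_id = pd_id;
}

/* Returns 1 when the last reference is dropped and pages could be unpinned. */
int sketch_mr_release(struct sketch_mr *mr)
{
	struct sketch_mapping *map = mr->map;

	assert(map && map->refs > 0);
	mr->map = NULL;
	return --map->refs == 0;
}
```

In this sketch a job ending only drops its own MR handle; the mapping survives as long as the original registration (or any other job's import) still holds a reference.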

I was still envisioning a job manager allocating device-specific resources for a job and sharing those with the local processes.  I.e., it shares a set of fds, each associated with a device, which restricts the job to those devices.  A job may also have device-specific resource limits or allocations (a limit on the number of MRs, specific endpoint addresses, etc.).  A global job object could work, but a subsequent user-to-device flow would need to access and translate the global object.  Either way, there are uABI requirements.
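
As a rough illustration of the per-device handle idea, the C sketch below models a job as a small table of per-device entries, each carrying an fd and an MR limit. All names (sketch_job, sketch_job_dev, sketch_job_alloc_mr) are hypothetical, the fds are plain integers standing in for real descriptors, and the MR count is just one example of a device-specific limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
#define SKETCH_MAX_JOB_DEVS 4

struct sketch_job_dev {
	int dev_fd;		/* stand-in for the fd shared by the job manager */
	unsigned int job_id;
	unsigned int max_mrs;	/* device-specific limit for this job */
	unsigned int cur_mrs;
};

struct sketch_job {
	struct sketch_job_dev devs[SKETCH_MAX_JOB_DEVS];
	size_t num_devs;
};

/* Returns the job's entry for dev_fd, or NULL if the device is not in the job. */
struct sketch_job_dev *sketch_job_find_dev(struct sketch_job *job, int dev_fd)
{
	size_t i;

	assert(job && job->num_devs <= SKETCH_MAX_JOB_DEVS);
	for (i = 0; i < job->num_devs; i++)
		if (job->devs[i].dev_fd == dev_fd)
			return &job->devs[i];
	return NULL;
}

/* Returns 0 if an MR may be allocated, -1 if the device or limit forbids it. */
int sketch_job_alloc_mr(struct sketch_job *job, int dev_fd)
{
	struct sketch_job_dev *dev = sketch_job_find_dev(job, dev_fd);

	if (!dev)			/* job is restricted to its own devices */
		return -1;
	if (dev->cur_mrs >= dev->max_mrs)
		return -1;
	dev->cur_mrs++;
	return 0;
}
```

A global job object would replace the per-device lookup with a translation step from the global identifier to each device's entry, which is the extra flow mentioned above.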

- Sean
