netdev - RE: [RFC PATCH 00/13] Ultra Ethernet driver introduction

Message-ID:
 <DM6PR12MB43130D3131B760AF2A0C569ABDAC2@DM6PR12MB4313.namprd12.prod.outlook.com>
Date: Tue, 1 Apr 2025 16:57:52 +0000
From: Sean Hefty <shefty@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Bernard Metzler <BMT@...ich.ibm.com>, Roland Dreier
	<roland@...abrica.net>, Nikolay Aleksandrov <nikolay@...abrica.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "shrijeet@...abrica.net"
	<shrijeet@...abrica.net>, "alex.badea@...sight.com"
	<alex.badea@...sight.com>, "eric.davis@...adcom.com"
	<eric.davis@...adcom.com>, "rip.sohan@....com" <rip.sohan@....com>,
	"dsahern@...nel.org" <dsahern@...nel.org>, "winston.liu@...sight.com"
	<winston.liu@...sight.com>, "dan.mihailescu@...sight.com"
	<dan.mihailescu@...sight.com>, Kamal Heib <kheib@...hat.com>,
	"parth.v.parikh@...sight.com" <parth.v.parikh@...sight.com>, Dave Miller
	<davem@...hat.com>, "ian.ziemba@....com" <ian.ziemba@....com>,
	"andrew.tauferner@...nelisnetworks.com"
	<andrew.tauferner@...nelisnetworks.com>, "welch@....com" <welch@....com>,
	"rakhahari.bhunia@...sight.com" <rakhahari.bhunia@...sight.com>,
	"kingshuk.mandal@...sight.com" <kingshuk.mandal@...sight.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, "kuba@...nel.org"
	<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: RE: [RFC PATCH 00/13] Ultra Ethernet driver introduction

> > Specifically, I want to *allow* separating the different functions
> > that a single PD provides into separate PDs.  The functions being page
> > mapping (registration), local (lkey) access, and remote (rkey) access.
> 
> That seems like quite a stretch for the PD.. Especially from a verbs perspective
> we do expect single PD and that is the entire security context.

From the viewpoint of a transport, the target QPN and incoming rkey must align on some backing security object (let's call that the PD).  As a model, I view this as requiring some {QPN, rkey, PD ID} tuple to exist with appropriate memory access permissions.

The change here is to expand that tuple to include a job id: {QPN, rkey, job ID, PD ID}.

Conceptually, one could view the rkey + job ID as a larger, virtual rkey.  (Or maybe job ID + PD ID is a bigger, virtual PD ID...  Or job ID + QPN ...)
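
To make the expanded tuple concrete, here is a minimal sketch of the check a target might perform.  All names (access_entry, access_allowed, the ACCESS_* flags) are made up for illustration and are not existing verbs or kernel APIs; a real device would use a hashed or indexed lookup rather than a linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not an existing verbs or kernel API. */
#define ACCESS_REMOTE_READ  0x1u
#define ACCESS_REMOTE_WRITE 0x2u

/* One permitted {QPN, rkey, job ID, PD ID} tuple and its access rights. */
struct access_entry {
	uint32_t qpn;
	uint32_t rkey;
	uint32_t job_id;
	uint32_t pd_id;
	uint32_t access;	/* bitmask of ACCESS_* flags */
};

/*
 * Return true only if an entry matches all four tuple fields and grants
 * every requested access bit.  Tuples are assumed unique in the table,
 * so the first match decides.
 */
static bool access_allowed(const struct access_entry *tbl, size_t n,
			   uint32_t qpn, uint32_t rkey, uint32_t job_id,
			   uint32_t pd_id, uint32_t wanted)
{
	for (size_t i = 0; i < n; i++) {
		const struct access_entry *e = &tbl[i];

		if (e->qpn == qpn && e->rkey == rkey &&
		    e->job_id == job_id && e->pd_id == pd_id)
			return (e->access & wanted) == wanted;
	}
	return false;	/* no matching tuple: reject */
}
```

With a single entry {7, 0x1234, 42, 1, ACCESS_REMOTE_READ}, a read carrying job ID 42 is accepted, while the same QPN and rkey with job ID 43 is rejected.  That is the "larger, virtual rkey" view: the rkey alone no longer suffices.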

> I think you face a philosophical choice of either a bigger PD that encompasses
> multiple jobs, or a PD that isn't a security context and then things like job
> handle lists in other APIs..
>
> > As an optimization, registration can be a separate function, so that
> > the same page mapping can be re-used across different jobs as they
> > start and end.  This requires some ability to import a MR from one PD
> > into another.  This is probably just an optimization and not required
> > for a job model.
> 
> Donno, it depends what the spec says about the labels. Is there an expectation
> that the rkey equivalent is identical across all jobs, or is there an expectation
> that every job has a unique rkey for the same memory?
> 
> I still wouldn't do something like import (which implies sharing the underlying
> page list), having a single MR object with multiple rkeys will make an easier
> implementation.

I don't know that I can talk about the UEC spec, but the libfabric memory registration APIs (UEC has openly mentioned adopting libfabric) are closer to a single MR object with multiple keys.  Different jobs could have different rkeys.

Libfabric defines a 'base MR' and allows 'sub-MRs' to be created from that base.  So, there are separate MR objects for tracking purposes.  A sub-MR has its own access rights, job association, and rkey.
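
As a rough illustration of that relationship (this is NOT the libfabric API; the struct and function names are invented for this sketch), a base MR could own the page mapping while each sub-MR carries only its own access rights, job association, and rkey:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the libfabric API. */

/* Base MR: owns the registration (page mapping) of a buffer. */
struct base_mr {
	void   *addr;
	size_t  len;
};

/* Sub-MR: refers to the base mapping, with its own rights, job, and rkey. */
struct sub_mr {
	const struct base_mr *base;
	uint32_t job_id;
	uint32_t access;
	uint32_t rkey;
};

/* Create a sub-MR on top of an existing base MR without re-registering. */
static struct sub_mr sub_mr_create(const struct base_mr *base,
				   uint32_t job_id, uint32_t access,
				   uint32_t rkey)
{
	struct sub_mr s = {
		.base = base,
		.job_id = job_id,
		.access = access,
		.rkey = rkey,
	};
	return s;
}
```

Two jobs would then each get a sub-MR pointing at the same base, with different rkeys, and tearing down one job's sub-MR would not disturb the underlying mapping.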

Libfabric doesn't have PDs, but this model is closer to the bigger PD that encompasses multiple jobs.  A job is assigned to the MR at MR creation.

A possible RDMA model could be:

PD <-- QP
   ^--- MR (not affiliated with a job)
   ^--- job thingy  <-- MR (restricted to job)

A device likely needs some capability to indicate whether it can limit MR access by {QPN, rkey, job ID, PD ID}.
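
A sketch of how the object model above and such a capability bit might fit together.  Everything here is hypothetical (DEV_CAP_JOB_MR, mr_bind_job, and mr_remote_access_ok do not exist anywhere), and it assumes an MR with no job keeps today's PD-wide behavior:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object model; none of these names exist in rdma-core or the kernel. */
#define DEV_CAP_JOB_MR 0x1u	/* device can restrict MR access by job ID */

struct pd  { uint32_t id; };
struct job { uint32_t id;   const struct pd *pd; };
struct qp  { uint32_t qpn;  const struct pd *pd; };
struct mr  {
	uint32_t rkey;
	const struct pd  *pd;
	const struct job *job;	/* NULL: not affiliated with a job */
};

/* Restrict an MR to a job; fails if the device lacks the capability
 * or the job belongs to a different PD. */
static int mr_bind_job(uint32_t dev_caps, struct mr *mr, const struct job *job)
{
	if (!(dev_caps & DEV_CAP_JOB_MR))
		return -1;
	if (job->pd != mr->pd)
		return -1;
	mr->job = job;
	return 0;
}

/* Target-side check: PD and rkey must match as today; a job-restricted
 * MR additionally requires the job ID carried by the request to match. */
static bool mr_remote_access_ok(const struct mr *mr, const struct qp *qp,
				uint32_t rkey, uint32_t req_job_id)
{
	if (mr->pd != qp->pd || mr->rkey != rkey)
		return false;
	if (mr->job == NULL)
		return true;	/* assumed: unaffiliated MR is PD-wide */
	return mr->job->id == req_job_id;
}
```

On a device without the capability, mr_bind_job() fails and the MR stays PD-wide, so the caller knows the job restriction is not being enforced.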

I can envision a job manager creating, sharing, and possibly controlling the PD-related resources.

- Sean