Message-ID:
 <DM6PR12MB431337B52F88E8E22323E066BDBC2@DM6PR12MB4313.namprd12.prod.outlook.com>
Date: Thu, 17 Apr 2025 02:59:58 +0000
From: Sean Hefty <shefty@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: "Ziemba, Ian" <ian.ziemba@....com>, Bernard Metzler <BMT@...ich.ibm.com>,
	Roland Dreier <roland@...abrica.net>, Nikolay Aleksandrov
	<nikolay@...abrica.net>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"shrijeet@...abrica.net" <shrijeet@...abrica.net>, "alex.badea@...sight.com"
	<alex.badea@...sight.com>, "eric.davis@...adcom.com"
	<eric.davis@...adcom.com>, "rip.sohan@....com" <rip.sohan@....com>,
	"dsahern@...nel.org" <dsahern@...nel.org>, "winston.liu@...sight.com"
	<winston.liu@...sight.com>, "dan.mihailescu@...sight.com"
	<dan.mihailescu@...sight.com>, Kamal Heib <kheib@...hat.com>,
	"parth.v.parikh@...sight.com" <parth.v.parikh@...sight.com>, Dave Miller
	<davem@...hat.com>, "andrew.tauferner@...nelisnetworks.com"
	<andrew.tauferner@...nelisnetworks.com>, "welch@....com" <welch@....com>,
	"rakhahari.bhunia@...sight.com" <rakhahari.bhunia@...sight.com>,
	"kingshuk.mandal@...sight.com" <kingshuk.mandal@...sight.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, "kuba@...nel.org"
	<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: RE: [RFC PATCH 00/13] Ultra Ethernet driver introduction

> On Wed, Apr 16, 2025 at 11:58:45PM +0000, Sean Hefty wrote:
> > > > There's discussion on defining this relationship:
> > > >
> > > > Job <- 0..n --- 1 -> PD
> > > >
> > > > I can't think of a technical reason why that's needed.
> > >
> > > From my UE perspective, I agree. UE needs to share job IDs across
> > > processes while still having inter-process isolation for things like
> > > local memory registrations.
> >
> > We seem stuck on this.  Here's a specific proposal that I'm considering:
> 
> I still think it is hard to have this discussion without information flowing from
> UET..
> 
> I think the "Relative Addressing" Ian described is just a PD pointing to a single
> job and all MRs within the PD linked to a single job. Is there more than that?

Relative vs. absolute addressing refers to the endpoint address, i.e. the equivalent of the QPN.

With relative addressing, the QPN is relative to the job ID.  So QPN=5 for job=2 and QPN=5 for job=3 may or may not be the same HW resource.  A HW QP may still belong to multiple jobs, if supported by the vendor.
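
As a rough illustration of relative addressing (a sketch only; the class, method, and queue names are hypothetical and not taken from the UET spec or any driver), the endpoint lookup can be modeled as a table keyed by (job ID, QPN):

```python
# Hypothetical model of relative addressing: the QPN is only meaningful
# together with a job ID. All names here are illustrative assumptions.
class RelativeEndpointTable:
    """Maps (job_id, qpn) to a hardware queue identifier."""

    def __init__(self):
        self._table = {}

    def bind(self, job_id, qpn, hw_queue):
        self._table[(job_id, qpn)] = hw_queue

    def lookup(self, job_id, qpn):
        # Returns None when no endpoint is bound for this (job, QPN) pair.
        return self._table.get((job_id, qpn))


table = RelativeEndpointTable()
table.bind(job_id=2, qpn=5, hw_queue="hwq-A")
table.bind(job_id=3, qpn=5, hw_queue="hwq-B")  # same QPN, different resource
table.bind(job_id=4, qpn=7, hw_queue="hwq-A")  # one HW queue serving two jobs
```

Whether QPN=5 in two jobs resolves to the same or to different hardware queues is entirely a property of the bindings, which matches the "may or may not be the same HW resource" behavior described above.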

> "Absolute Addressing" seems confusing from a OS perspective. You can
> receive packets on any Job ID but the OS prevents you from sending on
> unauthorized Job IDs. Implying authorization happens dynamically.  So if you
> Rx a packet, how does an unpriv process go about getting OS permission to
> use the Rx'd Job ID as a Tx? How does it NAK the Rx that it isn't permitted?
> Why would you want to create an entire special security mechanism just to
> partition MRs in this funny mode?

Absolute addressing means the QPN is basically relative to the IP address.  So, the HW resource can be located without using the job ID.  Job IDs are carried in the transport, so every send must indicate what that value should be.
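
A minimal sketch of absolute addressing, under the assumption that the lookup key is simply (IP address, QPN); the `Packet` fields and class names are hypothetical. The job ID travels with each packet but plays no part in locating the resource:

```python
from dataclasses import dataclass


# Hypothetical packet layout; field names are illustrative assumptions.
@dataclass(frozen=True)
class Packet:
    dst_ip: str
    qpn: int
    job_id: int  # carried in the transport on every send
    payload: bytes = b""


class AbsoluteEndpointTable:
    """Maps (ip, qpn) to a hardware queue; the job ID is not part of the key."""

    def __init__(self):
        self._table = {}

    def bind(self, ip, qpn, hw_queue):
        self._table[(ip, qpn)] = hw_queue

    def locate(self, pkt):
        return self._table.get((pkt.dst_ip, pkt.qpn))


endpoints = AbsoluteEndpointTable()
endpoints.bind("10.0.0.1", 5, "hwq-A")

pkt_job2 = Packet(dst_ip="10.0.0.1", qpn=5, job_id=2)
pkt_job3 = Packet(dst_ip="10.0.0.1", qpn=5, job_id=3)
```

Both packets land on the same queue even though they carry different job IDs, so any job-based restriction has to be applied after the resource has been located.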

As an example, assigning MRs to jobs allows the server to set up RMA buffers with access restricted to that job.
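
To make the MR restriction concrete, here is a sketch of the check a target might apply to an incoming RMA request. The `MemoryRegion` fields and the `rma_allowed` helper are hypothetical; real hardware would also check keys and access flags:

```python
from dataclasses import dataclass


# Hypothetical MR record: a buffer of `length` bytes tied to exactly one job.
@dataclass(frozen=True)
class MemoryRegion:
    rkey: int
    length: int
    job_id: int


def rma_allowed(mr, job_id, offset, nbytes):
    """Allow the access only if it is in bounds and comes from the MR's job."""
    in_bounds = offset >= 0 and nbytes >= 0 and offset + nbytes <= mr.length
    return in_bounds and job_id == mr.job_id


server_buf = MemoryRegion(rkey=0x1234, length=4096, job_id=2)
```

With this model, a request carrying job ID 3 is rejected even if it presents a valid offset and length for the buffer.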

I have no idea how the receiver plans to enable sending back a response.

> How does receive buffer job key partitioning work? UET will HW match receive
> buffers to specific packets?

Not directly.  Libfabric has 2 features useful to consider here.  The simpler is tag matching.  Different jobs could use different tag bits.  MR partitioning can enforce that one job doesn't try to jump into another job's tag space.  The second feature is called scalable endpoints.  A scalable endpoint has multiple receive queues, which are directly addressable by the peer.  Different jobs could target different receive queues.
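
A sketch of how jobs could be kept apart in tag space. The matching rule mirrors libfabric's tagged receive, where bits set in the ignore mask are not compared; the 16/48-bit split between job ID and application tag is purely an assumption for illustration:

```python
# Assumed layout of a 64-bit tag: upper 16 bits = job ID, lower 48 = app tag.
JOB_SHIFT = 48
JOB_LIMIT = 1 << 16
APP_MASK = (1 << JOB_SHIFT) - 1


def make_tag(job_id, app_tag):
    """Pack a job ID and an application tag into one 64-bit tag."""
    if not 0 <= job_id < JOB_LIMIT:
        raise ValueError("job_id out of range")
    if not 0 <= app_tag <= APP_MASK:
        raise ValueError("app_tag out of range")
    return (job_id << JOB_SHIFT) | app_tag


def tags_match(send_tag, recv_tag, ignore):
    """Bits set in `ignore` are excluded from the comparison."""
    return (send_tag | ignore) == (recv_tag | ignore)


# A receive posted for job 2 that accepts any application tag.
recv_tag = make_tag(2, 0)
ignore = APP_MASK
```

Because the job bits are never placed in the ignore mask here, a send tagged for job 3 cannot match a buffer posted for job 2, regardless of its application tag.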

> > 1. Define a device level 'security key'.  The skey encapsulates encryption
> >     attributes.
> >     The skey may be shared between processes.
> > 2. Define a device level 'job', or maybe more generic 'communication
> >     domain'*.
> >     A job object is associated with a transport protocol and these optional
> >     attributes:
> >     address, job id (required for UET), and security key.
> >     The job object may be shared between processes.
> > 3. Define a PD level 'job key'.  The job key references a single job object.
> >     Multiple job keys may be created under a single PD, if each references a
> >     separate job.
> > 4. Support creating MRs that reference job keys.
> 
> This seems reasonable as a starting framework to me. I have wondered if the
> 'security key' is really addressing information though. Sharing
> IP's/MAC's/Encryption/etc across all job users seems appealing for MPI type
> workloads.

I've gone back and forth between separating and combining the 'security key' and job objects.  Today I opted for separate, more focused objects.  Tomorrow, who knows?  Job is where addressing information goes.  Since security key is passed as an attribute to the job, an MPI/AI job can share encryption/IPs/etc. across processes.  (Btw, I prefer the term 'comm domain' over job for this top-level object, but I don't know if that makes things more or less confusing for others.  Job starts taking on different meanings.)

A separate security key made more sense to me when I considered applying it to an RC QP.  Additionally, an MPI/AI job may require multiple job objects, one for each IP address.  (Imagine a system connected to separate networks, such that the job ID value cannot be global).  A single security key can be used with all job instances.
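
The four-step proposal and the one-job-per-IP case can be sketched as a small object model. This is only an illustration of the relationships; every class, field, and string value below (including the "uet" protocol name and the cipher label) is hypothetical and not an existing verbs or kernel API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecurityKey:
    """Device-level object holding encryption attributes; shareable."""
    cipher: str


@dataclass(frozen=True)
class Job:
    """Device-level 'job' / 'comm domain'; shareable between processes."""
    protocol: str
    address: Optional[str] = None
    job_id: Optional[int] = None
    skey: Optional[SecurityKey] = None

    def __post_init__(self):
        # The job ID is optional in general but required for UET.
        if self.protocol == "uet" and self.job_id is None:
            raise ValueError("UET jobs require a job id")


@dataclass(eq=False)
class JobKey:
    """PD-level object referencing exactly one job."""
    job: Job


@dataclass(eq=False)
class MemoryRegion:
    length: int
    job_key: JobKey


class ProtectionDomain:
    def __init__(self):
        self.job_keys = []

    def create_job_key(self, job):
        # Multiple job keys per PD are fine, as long as each references
        # a separate job object.
        if any(jk.job is job for jk in self.job_keys):
            raise ValueError("PD already has a job key for this job")
        jk = JobKey(job)
        self.job_keys.append(jk)
        return jk

    def register_mr(self, length, job_key):
        if not any(jk is job_key for jk in self.job_keys):
            raise ValueError("job key does not belong to this PD")
        return MemoryRegion(length, job_key)


# One security key shared by two job objects, one per network/IP address.
skey = SecurityKey(cipher="example-cipher")
job_net_a = Job("uet", address="10.0.0.1", job_id=2, skey=skey)
job_net_b = Job("uet", address="192.168.0.1", job_id=2, skey=skey)

pd = ProtectionDomain()
key_a = pd.create_job_key(job_net_a)
key_b = pd.create_job_key(job_net_b)
mr = pd.register_mr(4096, key_a)
```

Keeping the security key as its own object is what lets both job instances reference the same encryption attributes while carrying different addresses.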

> But is one job key under a MR sufficient or does UET expect this to be a list of
> job keys?

One, I believe.  Libfabric allows an MR to attach to a single job.  However, it does support derivative MRs, which could have different properties, but share page mappings.
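
A sketch of the derivative-MR idea, assuming a derived MR simply holds a reference to the base MR's page mapping while carrying its own job and access properties. The names below are hypothetical and do not reflect libfabric's actual registration API:

```python
from dataclasses import dataclass


class PageMapping:
    """Stand-in for pinned pages / translation entries shared between MRs."""

    def __init__(self, pages):
        self.pages = tuple(pages)


@dataclass(eq=False)
class MemoryRegion:
    mapping: PageMapping
    job_id: int
    access: str


def derive_mr(base, job_id, access):
    """Create an MR with different properties that reuses the base mapping."""
    return MemoryRegion(mapping=base.mapping, job_id=job_id, access=access)


base_mr = MemoryRegion(PageMapping([0x1000, 0x2000]), job_id=2, access="rw")
derived_mr = derive_mr(base_mr, job_id=3, access="r")
```

Each MR still attaches to a single job; exposing the same buffer to a second job means deriving another MR rather than attaching a list of job keys to one MR.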

- Sean
