Message-ID: <20240216142713.GC13330@nvidia.com>
Date: Fri, 16 Feb 2024 10:27:13 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Andy Gospodarek <andrew.gospodarek@...adcom.com>,
Christoph Hellwig <hch@...radead.org>,
Saeed Mahameed <saeed@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Leon Romanovsky <leonro@...dia.com>, Jiri Pirko <jiri@...dia.com>,
Leonid Bloch <lbloch@...dia.com>, Itay Avraham <itayavr@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
David Ahern <dsahern@...nel.org>,
Aron Silverton <aron.silverton@...cle.com>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver
On Thu, Feb 15, 2024 at 05:40:34PM -0800, Jakub Kicinski wrote:
> On Wed, 14 Feb 2024 14:37:55 -0400 Jason Gunthorpe wrote:
> > On Wed, Feb 14, 2024 at 10:11:26AM -0800, Jakub Kicinski wrote:
> > > On Wed, 14 Feb 2024 13:57:35 -0400 Jason Gunthorpe wrote:
> > > > There is a clear split in my mind between:
> > > > - inspection debugging
> > > > - invasive mutating debugging
> > > > - configuration
> > >
> > > Yes there's a clear split, and how are you going to enforce it on
> > > an opaque interface? Put an "evil" bit in the common header?
> >
> > The interface is opaque through a subsystem, it doesn't mean it is
> > completely opaque to every driver layer in the kernel. There is still a
> > HW specific kernel driver that delivers the FW command to the actual
> > HW.
> >
> > In the mlx5 model the kernel driver stamps the command with "uid"
> > which is effectively a security scope label. This cannot be avoided by
> > userspace and is fundamental to why mlx5ctl is secure in a lockdown
> > kernel.
> >
> > For example mlx5's FW interface has the concept of security scopes. We
> > have several defined today:
> > - Kernel
> > - Userspace rdma
> > - Userspace rdma with CAP_NET_RAW
> > - Userspace rdma with CAP_SYS_RAWIO
> >
> > So we trivially add three more for the scopes I listed above. The
> > mlx5ctl driver as posted already introduced a new scope, for example.
> >
> > Userspace will ask the kernel for an appropriate security scope after
> > opening the char-device. If userspace asks for invasive then you get a
> > taint. Issuing an invasive command without a kernel applied invasive
> > security label will be rejected by the FW.
> >
> > We trust the kernel to apply the security label for the origin of the
> > command. We trust the device FW to implement security scopes,
> > because these are RDMA devices and all of RDMA and all of SRIOV
> > virtualization are totally broken if the device FW cannot be trusted
> > to maintain security separation between scopes.
>
> You have changed the argument.
I explained how one technical part of this works; you clipped out my
answer to Andy's concern.
> The problem Andy was raising is that users having access to low level
> configuration will make it impossible for distro's support to tell
> device configuration. There won't be any trace of activity at the OS
> level.
I responded to that by saying the answer is to have robust dumping of
the device configuration and suggested a taint bit if changes are made
to the device outside that support envelope and can't be captured in
the dumps.
This first part is what everyone already does. There is some
supported configuration in flash and there are tools to dump and
inspect this. The field teams understand they need to look at that,
and existing data collection tools already capture this stuff. I don't
think we have a real problem here.
The step beyond that, which Andy was talking about, is the hypothetical
"what if you touch random unsafe registers directly or something". That
probably shouldn't be allowed on a lockdown kernel, but assuming a
device could do so safely, my answer was to trigger a taint.
Then you asked how do you trigger a taint if the kernel doesn't parse
the commands.
> To which you replied that you can differentiate between debugging and
> configuration on an opaque interface, _in the kernel_.
I did not say "_in the kernel_" meaning the kernel would do it, I
meant the kernel would ensure it is done.
The kernel delegates the differentiation to the FW and it trusts the
FW to do that work for it.
> Which I disagree with, obviously.
I don't know why? How is it obvious?
> And now you're saying that you can maintain security if you trust
> the firmware to enforce some rules.
Right. The userspace sends a command. The kernel tags it with the
permission the userspace has. The FW parses the command, checks it against
the kernel supplied permission, and either permits or rejects it.
We trust the FW to do the restriction on behalf of the kernel.
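To make that flow concrete, here is a minimal sketch in C. It is not mlx5 or mlx5ctl code. Every name in it (the scope and command-class enums, ctl_set_scope, kernel_stamp, fw_permits, submit) is hypothetical, and the policy table inside fw_permits is an assumption chosen only to illustrate the idea. The cls field stands in for whatever the FW learns by parsing a payload the kernel treats as opaque.

```c
#include <stdbool.h>

/* Hypothetical scope labels, loosely following the three categories
 * discussed in this thread. Not the real mlx5 uid values. */
enum scope { SCOPE_INSPECT, SCOPE_INVASIVE_DEBUG, SCOPE_CONFIG };

/* Hypothetical command classes. In a real device only the FW would
 * derive this, by parsing the command payload. */
enum cmd_class { CMD_READ_STATE, CMD_MUTATE_DEBUG, CMD_WRITE_CONFIG };

struct fw_cmd {
    enum scope scope;   /* stamped by the kernel, never trusted from userspace */
    enum cmd_class cls; /* stand-in for the opaque payload */
};

/* Per-open-file state kept by the (hypothetical) kernel driver. */
struct ctl_fd {
    enum scope scope;
    bool tainted;
};

/* Userspace asks for a scope after opening the char-device.
 * Asking for the invasive scope triggers a taint. */
static void ctl_set_scope(struct ctl_fd *fd, enum scope s)
{
    fd->scope = s;
    if (s == SCOPE_INVASIVE_DEBUG)
        fd->tainted = true;
}

/* Kernel side: overwrite whatever label userspace supplied.
 * The kernel does not look at the payload at all. */
static void kernel_stamp(const struct ctl_fd *fd, struct fw_cmd *cmd)
{
    cmd->scope = fd->scope;
}

/* FW side: parse the command and check it against the stamped scope.
 * Assumed policy: reads are allowed in every scope, mutating commands
 * only in the matching scope. */
static bool fw_permits(const struct fw_cmd *cmd)
{
    switch (cmd->cls) {
    case CMD_READ_STATE:
        return true;
    case CMD_MUTATE_DEBUG:
        return cmd->scope == SCOPE_INVASIVE_DEBUG;
    case CMD_WRITE_CONFIG:
        return cmd->scope == SCOPE_CONFIG;
    }
    return false;
}

/* Full path: userspace command -> kernel stamp -> FW decision. */
static bool submit(const struct ctl_fd *fd, struct fw_cmd cmd)
{
    kernel_stamp(fd, &cmd);
    return fw_permits(&cmd);
}
```

In this sketch a file descriptor left in the inspect scope cannot get a mutating command through, even if userspace forges the scope field, because kernel_stamp overwrites it before the FW sees it. The only way to issue such a command is to request the invasive scope first, and that request is what sets the taint.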
The restriction we are talking about is containing the userspace to
only operate within the distro's support envelope. The kernel can ask
the FW to enforce that rule as a matter of security and trust the FW
to do so.
Jason