Message-ID: <0-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com>
Date: Thu, 27 Feb 2025 20:26:28 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To:
Cc: Andy Gospodarek <andrew.gospodarek@...adcom.com>,
Aron Silverton <aron.silverton@...cle.com>,
Dan Williams <dan.j.williams@...el.com>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Dave Jiang <dave.jiang@...el.com>,
David Ahern <dsahern@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Christoph Hellwig <hch@...radead.org>,
Itay Avraham <itayavr@...dia.com>,
Jiri Pirko <jiri@...dia.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Jakub Kicinski <kuba@...nel.org>,
Leonid Bloch <lbloch@...dia.com>,
Leon Romanovsky <leonro@...dia.com>,
linux-cxl@...r.kernel.org,
linux-rdma@...r.kernel.org,
netdev@...r.kernel.org,
Saeed Mahameed <saeedm@...dia.com>,
"Nelson, Shannon" <shannon.nelson@....com>
Subject: [PATCH v5 0/8] Introduce fwctl subsystem
[
We now have the required three drivers on the list so things are
looking probable for reaching this merge window. I will work out some
shared branches with CXL and get it into linux-next once all three drivers
can be assembled and reviews seem to be concluding.
There are a couple of open notes:
- Greg was interested in a new name, Dan offered auxctl
]
fwctl is a new subsystem intended to bring some common rules and order to
the growing pattern of exposing a secure FW interface directly to
userspace. Unlike existing places like RDMA/DRM/VFIO/uacce that are
exposing a device for datapath operations, fwctl is focused on debugging,
configuration and provisioning of the device. It will not have the
necessary features like interrupt delivery to support a datapath.
This concept is similar to the long standing practice in the "HW" RAID
space of having a device specific misc device to manage the RAID
controller FW. fwctl generalizes this notion of a companion debug and
management interface that goes along with a dataplane implemented in an
appropriate subsystem.
The need for this has reached a critical point as many users are moving to
run lockdown-enabled kernels. Several existing devices have long-standing
management tooling that relies on /sys/../resource0 or PCI config space
access, neither of which is permitted in lockdown. A major point of
fwctl is to define and document the rules that a device must follow to
expose a lockdown compatible RPC.
Based on some discussion, fwctl splits the RPCs into four categories:
FWCTL_RPC_CONFIGURATION
FWCTL_RPC_DEBUG_READ_ONLY
FWCTL_RPC_DEBUG_WRITE
FWCTL_RPC_DEBUG_WRITE_FULL
Where the latter two trigger a new TAINT_FWCTL, and the final one requires
CAP_SYS_RAWIO - excluding it from lockdown. The device driver and its FW
would be responsible to restrict RPCs to the requested security scope,
while the core code handles the tainting and CAP checks.
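The split of responsibility described above can be sketched as a small
userspace model. This is only an illustration, not the code from the series:
the four scope names come from this letter, but the enum values, struct
scope_decision, check_rpc_scope(), the has_rawio parameter (standing in for a
CAP_SYS_RAWIO check) and the choice of error codes are all assumptions made
for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Scope names are from the cover letter; the numeric values are assumed. */
enum fwctl_rpc_scope {
	FWCTL_RPC_CONFIGURATION,
	FWCTL_RPC_DEBUG_READ_ONLY,
	FWCTL_RPC_DEBUG_WRITE,
	FWCTL_RPC_DEBUG_WRITE_FULL,
};

/* Hypothetical record of what the core would do before calling the driver. */
struct scope_decision {
	bool taints_kernel;	/* the RPC would set TAINT_FWCTL */
	bool needs_rawio;	/* the RPC requires CAP_SYS_RAWIO */
};

/*
 * Model of the core-side policy only. The driver and its FW would still have
 * to restrict the RPC itself to the requested scope.
 *
 * Returns 0 when the RPC may proceed, -EPERM when the capability is missing,
 * and -EINVAL for an unknown scope (the error codes are assumptions).
 */
static int check_rpc_scope(int scope, bool has_rawio,
			   struct scope_decision *out)
{
	assert(out != NULL);
	out->taints_kernel = false;
	out->needs_rawio = false;

	switch (scope) {
	case FWCTL_RPC_CONFIGURATION:
	case FWCTL_RPC_DEBUG_READ_ONLY:
		/* No taint and no extra capability. */
		return 0;
	case FWCTL_RPC_DEBUG_WRITE:
		out->taints_kernel = true;
		return 0;
	case FWCTL_RPC_DEBUG_WRITE_FULL:
		out->needs_rawio = true;
		if (!has_rawio)
			return -EPERM;
		out->taints_kernel = true;
		return 0;
	default:
		return -EINVAL;
	}
}
```

In this model a FWCTL_RPC_DEBUG_WRITE_FULL request from a caller without the
capability is rejected before any taint is recorded, while the two read-only
style scopes pass through untouched.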
For details see the final patch which introduces the documentation.
The CXL FWCTL driver is now in its own series, at v7:
https://lore.kernel.org/r/20250220194438.2281088-1-dave.jiang@intel.com
And a driver for the Pensando DSC (A smart NIC):
https://lore.kernel.org/r/20250211234854.52277-1-shannon.nelson@amd.com
I've got soft commitments for about 7 drivers in total now.
There have been three LWN articles written discussing various aspects of
this proposal:
https://lwn.net/Articles/955001/
https://lwn.net/Articles/969383/
https://lwn.net/Articles/990802/
There was also a very large ksummit thread preceding a discussion at the
Maintainer Summit:
https://lore.kernel.org/ksummit/668c67a324609_ed99294c0@dwillia2-xfh.jf.intel.com.notmuch/
Several have expressed general support for this concept:
AMD/Pensando - https://lore.kernel.org/linux-rdma/20241205222818.44439-1-shannon.nelson@amd.com
Broadcom Networking - https://lore.kernel.org/r/Zf2n02q0GevGdS-Z@C02YVCJELVCG
Christoph Hellwig - https://lore.kernel.org/r/Zcx53N8lQjkpEu94@infradead.org
Daniel Vetter - https://lore.kernel.org/r/ZrHY2Bds7oF7KRGz@phenom.ffwll.local
Enfabrica - https://lore.kernel.org/r/9cc7127f-8674-43bc-b4d7-b1c4c2d96fed@kernel.org
NVIDIA Networking
Oded Gabbay/Habana - https://lore.kernel.org/r/ZrMl1bkPP-3G9B4N@T14sgabbay.
Oracle Linux - https://lore.kernel.org/r/6lakj6lxlxhdgrewodvj3xh6sxn3d36t5dab6najzyti2navx3@wrge7cyfk6nq
SuSE/Hannes - https://lore.kernel.org/r/2fd48f87-2521-4c34-8589-dbb7e91bb1c8@suse.com
Work is ongoing for userspace; currently the Mellanox tool suite has been
ported over:
https://github.com/Mellanox/mstflint
And a simpler example of how to use it:
https://github.com/jgunthorpe/mlx5ctl.git
This is on github: https://github.com/jgunthorpe/linux/commits/fwctl
v5:
- Move hunks between patches to make more sense
- Rename ucmd_buffer to fwctl_ucmd_buffer
- Update comments and commit messages
- Copyright to 2025
- Drop bnxt WIP patches
- Allow a NULL ops->info
- Decode more op codes for mlx5 and the sub-operation for
MLX5_CMD_OP_ACCESS_REG/_USER
v4: https://patch.msgid.link/r/0-v4-0cf4ec3b8143+4995-fwctl_jgg@nvidia.com
- Rebase to v6.14-rc1
- Fine tune comments and rst documentation
- Adjust cleanup.h usage - remove places that add more obfuscation than
value
- CXL is back to its own independent series
- Increase FWCTL_MAX_DEVICES to 4096, someone hit the limit
- Fix mlx5ctl_validate_rpc() logic around scope checking
- Disable mlx5ctl on SFs
v3: https://patch.msgid.link/r/0-v3-960f17f90f17+516-fwctl_jgg@nvidia.com
- Rebase to v6.11-rc4
- Add a squashed version of David's CXL series as the 2nd driver
- Add missing includes
- Improve comments based on feedback
- Use the kdoc format that puts the member docs inside the struct
- Rewrite fwctl_alloc_device() to be clearer
- Incorporate all remarks for the documentation
v2: https://lore.kernel.org/r/0-v2-940e479ceba9+3821-fwctl_jgg@nvidia.com
- Rebase to v6.10-rc5
- Minor style changes
- Follow the style consensus for the guard stuff
- Documentation grammar/spelling
- Add missed length output for mlx5 get_info
- Add two more missed MLX5 CMD's
- Collect tags
v1: https://lore.kernel.org/r/0-v1-9912f1a11620+2a-fwctl_jgg@nvidia.com
Cc: Andy Gospodarek <andrew.gospodarek@...adcom.com>
Cc: Aron Silverton <aron.silverton@...cle.com>
Cc: Christoph Hellwig <hch@...radead.org>
Cc: David Ahern <dsahern@...nel.org>
Cc: Itay Avraham <itayavr@...dia.com>
Cc: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org
Cc: Jiri Pirko <jiri@...dia.com>
Cc: Leon Romanovsky <leonro@...dia.com>
Cc: Leonid Bloch <lbloch@...dia.com>
Cc: Dan Williams <dan.j.williams@...el.com>
Cc: linux-cxl@...r.kernel.org
Cc: linux-rdma@...r.kernel.org
Cc: "Nelson, Shannon" <shannon.nelson@....com>
Cc: Dave Jiang <dave.jiang@...el.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Signed-off-by: Jason Gunthorpe <jgg@...dia.com>
Jason Gunthorpe (6):
fwctl: Add basic structure for a class subsystem with a cdev
fwctl: Basic ioctl dispatch for the character device
fwctl: FWCTL_INFO to return basic information about the device
taint: Add TAINT_FWCTL
fwctl: FWCTL_RPC to execute a Remote Procedure Call to device firmware
fwctl: Add documentation
Saeed Mahameed (2):
fwctl/mlx5: Support for communicating with mlx5 fw
mlx5: Create an auxiliary device for fwctl_mlx5
Documentation/admin-guide/tainted-kernels.rst | 5 +
Documentation/userspace-api/fwctl/fwctl.rst | 285 ++++++++++++
Documentation/userspace-api/fwctl/index.rst | 12 +
Documentation/userspace-api/index.rst | 1 +
.../userspace-api/ioctl/ioctl-number.rst | 1 +
MAINTAINERS | 18 +
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/fwctl/Kconfig | 23 +
drivers/fwctl/Makefile | 5 +
drivers/fwctl/main.c | 421 ++++++++++++++++++
drivers/fwctl/mlx5/Makefile | 4 +
drivers/fwctl/mlx5/main.c | 411 +++++++++++++++++
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 9 +
include/linux/fwctl.h | 135 ++++++
include/linux/panic.h | 3 +-
include/uapi/fwctl/fwctl.h | 139 ++++++
include/uapi/fwctl/mlx5.h | 36 ++
kernel/panic.c | 1 +
tools/debugging/kernel-chktaint | 8 +
20 files changed, 1519 insertions(+), 1 deletion(-)
create mode 100644 Documentation/userspace-api/fwctl/fwctl.rst
create mode 100644 Documentation/userspace-api/fwctl/index.rst
create mode 100644 drivers/fwctl/Kconfig
create mode 100644 drivers/fwctl/Makefile
create mode 100644 drivers/fwctl/main.c
create mode 100644 drivers/fwctl/mlx5/Makefile
create mode 100644 drivers/fwctl/mlx5/main.c
create mode 100644 include/linux/fwctl.h
create mode 100644 include/uapi/fwctl/fwctl.h
create mode 100644 include/uapi/fwctl/mlx5.h
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
--
2.43.0