Message-ID: <346ad61e-9cba-4915-8748-0b8119358d7a@amd.com>
Date: Wed, 12 Feb 2025 18:30:38 -0800
From: "Nelson, Shannon" <shannon.nelson@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Andy Gospodarek <andrew.gospodarek@...adcom.com>,
Aron Silverton <aron.silverton@...cle.com>,
Dan Williams <dan.j.williams@...el.com>,
Daniel Vetter <daniel.vetter@...ll.ch>, Dave Jiang <dave.jiang@...el.com>,
David Ahern <dsahern@...nel.org>, Andy Gospodarek <gospo@...adcom.com>,
Christoph Hellwig <hch@...radead.org>, Itay Avraham <itayavr@...dia.com>,
Jiri Pirko <jiri@...dia.com>, Jonathan Cameron
<Jonathan.Cameron@...wei.com>, Jakub Kicinski <kuba@...nel.org>,
Leonid Bloch <lbloch@...dia.com>, Leon Romanovsky <leonro@...dia.com>,
linux-cxl@...r.kernel.org, linux-rdma@...r.kernel.org,
netdev@...r.kernel.org, Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [PATCH v4 00/10] Introduce fwctl subystem
On 2/6/2025 4:13 PM, Jason Gunthorpe wrote:
>
> [
> Many people were away around the holiday period, but work is back in full
> swing now with Dave already at v3 on his CXL work over the past couple
> weeks. We are looking at a good chance of reaching this merge window. I
> will work out some shared branches with CXL and get it into linux-next
> once all three drivers can be assembled and reviews seem to be concluding.
>
> There are a couple of open notes:
> - Greg was interested in a new name, but nobody offered any bikesheds
> - I would like a co-maintainer
> ]
>
> fwctl is a new subsystem intended to bring some common rules and order to
> the growing pattern of exposing a secure FW interface directly to
> userspace. Unlike existing places like RDMA/DRM/VFIO/uacce that
> expose a device for datapath operations, fwctl is focused on debugging,
> configuration and provisioning of the device. It will not have the
> necessary features like interrupt delivery to support a datapath.
>
> This concept is similar to the long standing practice in the "HW" RAID
> space of having a device specific misc device to manage the RAID
> controller FW. fwctl generalizes this notion of a companion debug and
> management interface that goes along with a dataplane implemented in an
> appropriate subsystem.
>
> The need for this has reached a critical point as many users are moving to
> run lockdown enabled kernels. Several existing devices have had long
> standing tooling for management that relied on /sys/../resource0 or PCI
> config space access which is not permitted in lockdown. A major point of
> fwctl is to define and document the rules that a device must follow to
> expose a lockdown compatible RPC.
>
> Based on some discussion fwctl splits the RPCs into four categories
>
> FWCTL_RPC_CONFIGURATION
> FWCTL_RPC_DEBUG_READ_ONLY
> FWCTL_RPC_DEBUG_WRITE
> FWCTL_RPC_DEBUG_WRITE_FULL
>
> Where the latter two trigger a new TAINT_FWCTL, and the final one requires
> CAP_SYS_RAWIO - excluding it from lockdown. The device driver and its FW
> would be responsible to restrict RPCs to the requested security scope,
> while the core code handles the tainting and CAP checks.
>
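As an illustration only (not code from the series), the scope policy described above can be sketched as a small userspace model. The enum names come from the cover letter; the numeric values, the `scope_result` struct, the `check_rpc_scope()` helper, and the choice of error codes are assumptions made for this sketch, and the `has_rawio` flag stands in for a CAP_SYS_RAWIO check.

```c
#include <errno.h>
#include <stdbool.h>

/* Scope names as listed in the cover letter; numeric values are assumed. */
enum fwctl_rpc_scope {
	FWCTL_RPC_CONFIGURATION,
	FWCTL_RPC_DEBUG_READ_ONLY,
	FWCTL_RPC_DEBUG_WRITE,
	FWCTL_RPC_DEBUG_WRITE_FULL,
};

/* Hypothetical result type: an errno-style code plus a "taint" decision. */
struct scope_result {
	int err;     /* 0 on success, negative errno otherwise */
	bool taint;  /* true if the call would set the new taint flag */
};

/*
 * Hypothetical model of the core-side check: the last two scopes taint,
 * and the final scope additionally needs the raw I/O capability.
 */
static struct scope_result check_rpc_scope(int scope, bool has_rawio)
{
	struct scope_result res = { 0, false };

	switch (scope) {
	case FWCTL_RPC_CONFIGURATION:
	case FWCTL_RPC_DEBUG_READ_ONLY:
		break;
	case FWCTL_RPC_DEBUG_WRITE_FULL:
		if (!has_rawio) {
			res.err = -EPERM;  /* error code is an assumption */
			break;
		}
		res.taint = true;
		break;
	case FWCTL_RPC_DEBUG_WRITE:
		res.taint = true;
		break;
	default:
		res.err = -EINVAL;         /* unknown scope; code assumed */
		break;
	}
	return res;
}
```

In this model the driver and its firmware would still have to restrict each RPC to the requested scope; the helper only covers the tainting and capability decision attributed to the core code.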
> For details see the final patch which introduces the documentation.
>
> The CXL FWCTL driver is now in its own series on v3:
> https://lore.kernel.org/r/20250204220430.4146187-1-dave.jiang@intel.com
>
> I'm expecting a 3rd driver (from Shannon @ Pensando) to be posted right
> away, the github version I saw looked good. I've got soft commitments for
> about 6 drivers in total now.

Hi Jason,

I've looked through the core code and didn't see anything that others
haven't already commented on. I didn't go through the mlx5 or bnxt code
very carefully, but you can put my Reviewed-by on your first 6 patches.

We've been running successfully with an earlier version of the code, but
haven't set up our full test environment with this version yet. Since
there doesn't seem to be much change here, you are welcome to my
Tested-by as well.

For the first 6 patches:
Reviewed-by: Shannon Nelson <shannon.nelson@....com>
Tested-by: Shannon Nelson <shannon.nelson@....com>

Cheers,
sln

>
> There have been three LWN articles written discussing various aspects of
> this proposal:
>
> https://lwn.net/Articles/955001/
> https://lwn.net/Articles/969383/
> https://lwn.net/Articles/990802/
>
> A really giant ksummit thread preceding a discussion at the Maintainer
> Summit:
>
> https://lore.kernel.org/ksummit/668c67a324609_ed99294c0@dwillia2-xfh.jf.intel.com.notmuch/
>
> Several have expressed general support for this concept:
>
> AMD/Pensando - https://lore.kernel.org/linux-rdma/20241205222818.44439-1-shannon.nelson@amd.com
> Broadcom Networking - https://lore.kernel.org/r/Zf2n02q0GevGdS-Z@C02YVCJELVCG
> Christoph Hellwig - https://lore.kernel.org/r/Zcx53N8lQjkpEu94@infradead.org
> Daniel Vetter - https://lore.kernel.org/r/ZrHY2Bds7oF7KRGz@phenom.ffwll.local
> Enfabrica - https://lore.kernel.org/r/9cc7127f-8674-43bc-b4d7-b1c4c2d96fed@kernel.org
> NVIDIA Networking
> Oded Gabbay/Habana - https://lore.kernel.org/r/ZrMl1bkPP-3G9B4N@T14sgabbay.
> Oracle Linux - https://lore.kernel.org/r/6lakj6lxlxhdgrewodvj3xh6sxn3d36t5dab6najzyti2navx3@wrge7cyfk6nq
> SuSE/Hannes - https://lore.kernel.org/r/2fd48f87-2521-4c34-8589-dbb7e91bb1c8@suse.com
>
> Work is ongoing for userspace, currently the mellanox tool suite has been
> ported over:
> https://github.com/Mellanox/mstflint
>
> And a simpler example of how to use it:
> https://github.com/jgunthorpe/mlx5ctl.git
>
> This is on github: https://github.com/jgunthorpe/linux/commits/fwctl
>
> v4:
> - Rebase to v6.14-rc1
> - Fine tune comments and rst documentation
> - Adjust cleanup.h usage - remove places that add more obfuscation than
> value
> - CXL is back to its own independent series
> - Increase FWCTL_MAX_DEVICES to 4096, someone hit the limit
> - Fix mlx5ctl_validate_rpc() logic around scope checking
> - Disable mlx5ctl on SFs
> v3: https://patch.msgid.link/r/0-v3-960f17f90f17+516-fwctl_jgg@nvidia.com
> - Rebase to v6.11-rc4
> - Add a squashed version of David's CXL series as the 2nd driver
> - Add missing includes
> - Improve comments based on feedback
> - Use the kdoc format that puts the member docs inside the struct
> - Rewrite fwctl_alloc_device() to be clearer
> - Incorporate all remarks for the documentation
> v2: https://lore.kernel.org/r/0-v2-940e479ceba9+3821-fwctl_jgg@nvidia.com
> - Rebase to v6.10-rc5
> - Minor style changes
> - Follow the style consensus for the guard stuff
> - Documentation grammar/spelling
> - Add missed length output for mlx5 get_info
> - Add two more missed MLX5 CMD's
> - Collect tags
> v1: https://lore.kernel.org/r/0-v1-9912f1a11620+2a-fwctl_jgg@nvidia.com
>
> Andy Gospodarek (2):
> fwctl/bnxt: Support communicating with bnxt fw
> bnxt: Create an auxiliary device for fwctl_bnxt
>
> Jason Gunthorpe (6):
> fwctl: Add basic structure for a class subsystem with a cdev
> fwctl: Basic ioctl dispatch for the character device
> fwctl: FWCTL_INFO to return basic information about the device
> taint: Add TAINT_FWCTL
> fwctl: FWCTL_RPC to execute a Remote Procedure Call to device firmware
> fwctl: Add documentation
>
> Saeed Mahameed (2):
> fwctl/mlx5: Support for communicating with mlx5 fw
> mlx5: Create an auxiliary device for fwctl_mlx5
>
> Documentation/admin-guide/tainted-kernels.rst | 5 +
> Documentation/userspace-api/fwctl/fwctl.rst | 285 ++++++++++++
> Documentation/userspace-api/fwctl/index.rst | 12 +
> Documentation/userspace-api/index.rst | 1 +
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> MAINTAINERS | 16 +
> drivers/Kconfig | 2 +
> drivers/Makefile | 1 +
> drivers/fwctl/Kconfig | 32 ++
> drivers/fwctl/Makefile | 6 +
> drivers/fwctl/bnxt/Makefile | 4 +
> drivers/fwctl/bnxt/bnxt.c | 167 +++++++
> drivers/fwctl/main.c | 416 ++++++++++++++++++
> drivers/fwctl/mlx5/Makefile | 4 +
> drivers/fwctl/mlx5/main.c | 340 ++++++++++++++
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 3 +
> drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 126 +++++-
> drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 4 +
> drivers/net/ethernet/mellanox/mlx5/core/dev.c | 9 +
> include/linux/fwctl.h | 135 ++++++
> include/linux/panic.h | 3 +-
> include/uapi/fwctl/bnxt.h | 27 ++
> include/uapi/fwctl/fwctl.h | 140 ++++++
> include/uapi/fwctl/mlx5.h | 36 ++
> kernel/panic.c | 1 +
> tools/debugging/kernel-chktaint | 8 +
> 27 files changed, 1782 insertions(+), 5 deletions(-)
> create mode 100644 Documentation/userspace-api/fwctl/fwctl.rst
> create mode 100644 Documentation/userspace-api/fwctl/index.rst
> create mode 100644 drivers/fwctl/Kconfig
> create mode 100644 drivers/fwctl/Makefile
> create mode 100644 drivers/fwctl/bnxt/Makefile
> create mode 100644 drivers/fwctl/bnxt/bnxt.c
> create mode 100644 drivers/fwctl/main.c
> create mode 100644 drivers/fwctl/mlx5/Makefile
> create mode 100644 drivers/fwctl/mlx5/main.c
> create mode 100644 include/linux/fwctl.h
> create mode 100644 include/uapi/fwctl/bnxt.h
> create mode 100644 include/uapi/fwctl/fwctl.h
> create mode 100644 include/uapi/fwctl/mlx5.h
>
>
> base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
> --
> 2.43.0
>