Message-ID: <25237d53-a93d-4c1f-a7a4-4b6ed03e10e4@oracle.com>
Date: Sat, 30 Aug 2025 01:02:21 +0530
From: ALOK TIWARI <alok.a.tiwari@...cle.com>
To: admiyo@...amperecomputing.com, Jeremy Kerr <jk@...econstruct.com.au>,
Matt Johnston <matt@...econstruct.com.au>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller"
<davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Sudeep Holla <sudeep.holla@....com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Huisong Li <lihuisong@...wei.com>
Subject: Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
On 8/27/2025 10:18 AM, admiyo@...amperecomputing.com wrote:
> From: Adam Young <admiyo@...amperecomputing.com>
>
> Implementation of network driver for
> Management Control Transport Protocol(MCTP)
This should be "Management Component Transport Protocol" (see the DMTF spec).
> over Platform Communication Channel(PCC)
>
> DMTF DSP:0292
> https://urldefense.com/v3/__https://www.dmtf.org/sites/default/files/standards/documents/*__;Lw!!ACWV5N9M2RV99hQ!JkW80M1xGJjvaXd72o192mqV0uzOu511ibGTz-JCtmbsM_IrpuZ0jJeeQFOug5UHp8fNY1dX2Lk0_hPUUY__KokwgtE$
> DSP0292_1.0.0WIP50.pdf
>
> MCTP devices are specified via ACPI by entries
> in DSDT/SDST and reference channels specified
SDST -> SSDT
> in the PCCT. Messages are sent on a type 3 and
> received on a type 4 channel. Communication with
> other devices use the PCC based doorbell mechanism;
> a shared memory segment with a corresponding
> interrupt and a memory register used to trigger
> remote interrupts.
>
> This driver takes advantage of PCC mailbox buffer
> management. The data section of the struct sk_buff
> that contains the outgoing packet is sent to the mailbox,
> already properly formatted as a PCC message. The driver
> is also responsible for allocating a struct sk_buff that
> is then passed to the mailbox and used to record the
> data in the shared buffer. It maintains a list of both
> outging and incoming sk_buffs to match the data buffers
typo: outging -> outgoing
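To make the bookkeeping described in this paragraph concrete, here is a rough userspace sketch of matching a data buffer back to its queued packet. It is not driver code: the names pkt, pkt_queue, pkt_queue_push and pkt_queue_take are invented for illustration, and the locking that struct sk_buff_head provides in the kernel is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queued packet: only the data pointer matters. */
struct pkt {
	struct pkt *next;
	void *data;
};

/* Hypothetical stand-in for the per-mailbox packet list. */
struct pkt_queue {
	struct pkt *head;
};

/* Add a packet at the head of the queue. */
void pkt_queue_push(struct pkt_queue *q, struct pkt *p)
{
	p->next = q->head;
	q->head = p;
}

/*
 * Find the packet whose data pointer equals buf, unlink it and return it.
 * Returns NULL when no queued packet owns that buffer, so the caller never
 * acts on a packet that is still on the list.
 */
struct pkt *pkt_queue_take(struct pkt_queue *q, const void *buf)
{
	struct pkt **link;

	for (link = &q->head; *link; link = &(*link)->next) {
		struct pkt *p = *link;

		if (p->data == buf) {
			*link = p->next;
			p->next = NULL;
			return p;
		}
	}
	return NULL;
}
```

The point of the sketch is only that the result stays NULL unless a match was found and unlinked.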
>
> When the Type 3 channel outbox receives a txdone response
> interrupt, it consumes the outgoing sk_buff, allowing
> it to be freed.
>
> Bringing the interface up and down creates and frees
> the channel between the network driver and the mailbox
> driver. Freeing the channel also frees any packets that
> are cached in the mailbox ringbuffer.
>
> Signed-off-by: Adam Young <admiyo@...amperecomputing.com>
> ---
> MAINTAINERS | 5 +
> drivers/net/mctp/Kconfig | 13 ++
> drivers/net/mctp/Makefile | 1 +
> drivers/net/mctp/mctp-pcc.c | 367 ++++++++++++++++++++++++++++++++++++
> 4 files changed, 386 insertions(+)
> create mode 100644 drivers/net/mctp/mctp-pcc.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bce96dd254b8..de359bddcb2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14660,6 +14660,11 @@ F: include/net/mctpdevice.h
> F: include/net/netns/mctp.h
> F: net/mctp/
>
> +MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
> +M: Adam Young <admiyo@...amperecomputing.com>
> +S: Maintained
> +F: drivers/net/mctp/mctp-pcc.c
> +
> MAPLE TREE
> M: Liam R. Howlett <Liam.Howlett@...cle.com>
> L: maple-tree@...ts.infradead.org
> diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
> index cf325ab0b1ef..f69d0237f058 100644
> --- a/drivers/net/mctp/Kconfig
> +++ b/drivers/net/mctp/Kconfig
> @@ -57,6 +57,19 @@ config MCTP_TRANSPORT_USB
> MCTP-over-USB interfaces are peer-to-peer, so each interface
> represents a physical connection to one remote MCTP endpoint.
>
> +config MCTP_TRANSPORT_PCC
> + tristate "MCTP PCC transport"
> + depends on ACPI
> + help
> + Provides a driver to access MCTP devices over PCC transport,
> + A MCTP protocol network device is created via ACPI for each
> + entry in the DST/SDST that matches the identifier. The Platform
Should this be DSDT/SSDT?
> + communication channels are selected from the corresponding
> + entries in the PCCT.
> +
> + Say y here if you need to connect to MCTP endpoints over PCC. To
> + compile as a module, use m; the module will be called mctp-pcc.
> +
> endmenu
>
> endif
> diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
> index c36006849a1e..2276f148df7c 100644
> --- a/drivers/net/mctp/Makefile
> +++ b/drivers/net/mctp/Makefile
> @@ -1,3 +1,4 @@
> +obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
> obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
> obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
> obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
> diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
> new file mode 100644
> index 000000000000..c6578b27c00c
> --- /dev/null
> +++ b/drivers/net/mctp/mctp-pcc.c
> @@ -0,0 +1,367 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * mctp-pcc.c - Driver for MCTP over PCC.
> + * Copyright (c) 2024-2025, Ampere Computing LLC
> + *
> + */
> +
> +/* Implementation of MCTP over PCC DMTF Specification DSP0256
Mismatch: the comment says DSP0256, but the URL below and the commit message refer to DSP0292.
> + * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/if_arp.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/mailbox_client.h>
> +#include <linux/module.h>
> +#include <linux/netdevice.h>
> +#include <linux/platform_device.h>
> +#include <linux/string.h>
> +#include <linux/skbuff.h>
> +#include <linux/hrtimer.h>
> +
> +#include <acpi/acpi_bus.h>
> +#include <acpi/acpi_drivers.h>
> +#include <acpi/acrestyp.h>
> +#include <acpi/actbl.h>
> +#include <net/mctp.h>
> +#include <net/mctpdevice.h>
> +#include <acpi/pcc.h>
> +
> +#define MCTP_SIGNATURE "MCTP"
> +#define MCTP_SIGNATURE_LENGTH (sizeof(MCTP_SIGNATURE) - 1)
> +#define MCTP_MIN_MTU 68
> +#define PCC_DWORD_TYPE 0x0c
> +
> +struct mctp_pcc_mailbox {
> + u32 index;
> + struct pcc_mbox_chan *chan;
> + struct mbox_client client;
> + struct sk_buff_head packets;
> +};
> +
> +/* The netdev structure. One of these per PCC adapter. */
> +struct mctp_pcc_ndev {
> + struct net_device *ndev;
> + struct acpi_device *acpi_device;
> + struct mctp_pcc_mailbox inbox;
> + struct mctp_pcc_mailbox outbox;
> +};
> +
> +static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *box;
> + struct sk_buff *skb;
> +
> + mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> + box = &mctp_pcc_ndev->inbox;
> +
> + if (size > mctp_pcc_ndev->ndev->mtu)
> + return NULL;
> + skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> + if (!skb)
> + return NULL;
> + skb_put(skb, size);
> + skb->protocol = htons(ETH_P_MCTP);
> + skb_queue_head(&box->packets, skb);
> +
> + return skb->data;
> +}
> +
> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct sk_buff *curr_skb = NULL;
> + struct pcc_header pcc_header;
> + struct sk_buff *skb = NULL;
> + struct mctp_skb_cb *cb;
> +
> + mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> + if (!buffer) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> +
> + spin_lock(&mctp_pcc_ndev->inbox.packets.lock);
> + skb_queue_walk(&mctp_pcc_ndev->inbox.packets, curr_skb) {
> + skb = curr_skb;
> + if (skb->data != buffer)
> + continue;
> + __skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
> + break;
> + }
> + spin_unlock(&mctp_pcc_ndev->inbox.packets.lock);
> +
> + if (skb) {
> + dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
> + skb_reset_mac_header(skb);
> + skb_pull(skb, sizeof(pcc_header));
> + skb_reset_network_header(skb);
> + cb = __mctp_cb(skb);
> + cb->halen = 0;
> + netif_rx(skb);
> + }
> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *box;
> + struct sk_buff *skb = NULL;
> + struct sk_buff *curr_skb;
> +
> + mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
> + box = container_of(c, struct mctp_pcc_mailbox, client);
> + spin_lock(&box->packets.lock);
> + skb_queue_walk(&box->packets, curr_skb) {
> + skb = curr_skb;
> + if (skb->data == mssg) {
> + __skb_unlink(skb, &box->packets);
> + break;
> + }
> + }
> + spin_unlock(&box->packets.lock);
> +
> + if (skb)
> + dev_consume_skb_any(skb);
> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> + struct pcc_header *pcc_header;
> + int len = skb->len;
> + int rc;
> +
> + rc = skb_cow_head(skb, sizeof(*pcc_header));
> + if (rc) {
> + dev_dstats_tx_dropped(ndev);
> + kfree_skb(skb);
> + return NETDEV_TX_OK;
> + }
> +
> + pcc_header = skb_push(skb, sizeof(*pcc_header));
> + pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
> + pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
> + memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> + pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
> +
> + skb_queue_head(&mpnd->outbox.packets, skb);
> +
> + rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
> +
> + if (rc < 0) {
> + skb_unlink(skb, &mpnd->outbox.packets);
> + return NETDEV_TX_BUSY;
> + }
> +
> + dev_dstats_tx_add(ndev, len);
> + return NETDEV_TX_OK;
> +}
> +
> +static void drain_packets(struct sk_buff_head *list)
> +{
> + struct sk_buff *skb;
> +
> + while (!skb_queue_empty(list)) {
> + skb = skb_dequeue(list);
> + dev_consume_skb_any(skb);
> + }
> +}
> +
> +static int mctp_pcc_ndo_open(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev =
> + netdev_priv(ndev);
> + struct mctp_pcc_mailbox *outbox =
> + &mctp_pcc_ndev->outbox;
> + struct mctp_pcc_mailbox *inbox =
> + &mctp_pcc_ndev->inbox;
> + int mctp_pcc_mtu;
> +
> + outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> + if (IS_ERR(outbox->chan))
> + return PTR_ERR(outbox->chan);
> +
> + inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
> + if (IS_ERR(inbox->chan)) {
> + pcc_mbox_free_channel(outbox->chan);
> + return PTR_ERR(inbox->chan);
> + }
> +
> + mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
> + mctp_pcc_ndev->outbox.chan->manage_writes = true;
> +
> + mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
> + sizeof(struct pcc_header);
> + ndev->mtu = MCTP_MIN_MTU;
> + ndev->max_mtu = mctp_pcc_mtu;
> + ndev->min_mtu = MCTP_MIN_MTU;
> +
> + return 0;
> +}
> +
> +static int mctp_pcc_ndo_stop(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev =
> + netdev_priv(ndev);
> + struct mctp_pcc_mailbox *outbox =
> + &mctp_pcc_ndev->outbox;
> + struct mctp_pcc_mailbox *inbox =
> + &mctp_pcc_ndev->inbox;
> +
> + pcc_mbox_free_channel(outbox->chan);
> + pcc_mbox_free_channel(inbox->chan);
> +
> + drain_packets(&mctp_pcc_ndev->outbox.packets);
> + drain_packets(&mctp_pcc_ndev->inbox.packets);
> + return 0;
> +}
> +
> +static const struct net_device_ops mctp_pcc_netdev_ops = {
> + .ndo_open = mctp_pcc_ndo_open,
> + .ndo_stop = mctp_pcc_ndo_stop,
> + .ndo_start_xmit = mctp_pcc_tx,
> +
> +};
> +
> +static void mctp_pcc_setup(struct net_device *ndev)
> +{
> + ndev->type = ARPHRD_MCTP;
> + ndev->hard_header_len = 0;
> + ndev->tx_queue_len = 0;
> + ndev->flags = IFF_NOARP;
> + ndev->netdev_ops = &mctp_pcc_netdev_ops;
> + ndev->needs_free_netdev = true;
> + ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
> +}
> +
> +struct mctp_pcc_lookup_context {
> + int index;
> + u32 inbox_index;
> + u32 outbox_index;
> +};
> +
> +static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
> + void *context)
> +{
> + struct mctp_pcc_lookup_context *luc = context;
> + struct acpi_resource_address32 *addr;
> +
> + if (ares->type != PCC_DWORD_TYPE)
> + return AE_OK;
> +
> + addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
> + switch (luc->index) {
> + case 0:
> + luc->outbox_index = addr[0].address.minimum;
> + break;
> + case 1:
> + luc->inbox_index = addr[0].address.minimum;
> + break;
> + }
> + luc->index++;
> + return AE_OK;
> +}
> +
> +static void mctp_cleanup_netdev(void *data)
> +{
> + struct net_device *ndev = data;
> +
> + mctp_unregister_netdev(ndev);
> +}
> +
> +static int mctp_pcc_initialize_mailbox(struct device *dev,
> + struct mctp_pcc_mailbox *box, u32 index)
> +{
> + box->index = index;
> + skb_queue_head_init(&box->packets);
> + box->client.dev = dev;
> + return 0;
> +}
> +
> +static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
> +{
> + struct mctp_pcc_lookup_context context = {0};
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct device *dev = &acpi_dev->dev;
> + struct net_device *ndev;
> + acpi_handle dev_handle;
> + acpi_status status;
> + char name[32];
> + int rc;
> +
> + dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
> + acpi_device_hid(acpi_dev));
> + dev_handle = acpi_device_handle(acpi_dev);
> + status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
> + &context);
> + if (!ACPI_SUCCESS(status)) {
> + dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");
"FAILURE to lookup" -> "failed to look up"
> + return -EINVAL;
> + }
> +
> + snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
Is "mctpipcc%d" intentional, or should it be "mctp_pcc%d"?
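For reference, a small userspace sketch (not driver code) of what either prefix would produce. The helper name make_ifname and the SKETCH_IFNAMSIZ macro are made up here; the 16-byte limit mirrors IFNAMSIZ, and the index is formatted as unsigned since inbox_index is a u32.

```c
#include <stdio.h>

/* Mirrors IFNAMSIZ from <linux/if.h>: 16 bytes including the trailing NUL. */
#define SKETCH_IFNAMSIZ 16

/*
 * Hypothetical helper: build an interface name from a prefix and a channel
 * index. out must have room for SKETCH_IFNAMSIZ bytes. Returns 0 on
 * success, -1 if the name would be truncated.
 */
int make_ifname(char *out, const char *prefix, unsigned int index)
{
	int n = snprintf(out, SKETCH_IFNAMSIZ, "%s%u", prefix, index);

	return (n < 0 || n >= SKETCH_IFNAMSIZ) ? -1 : 0;
}
```

With a single-digit index, either prefix yields a nine-character name, so both fit in an interface name; the question is only which spelling is intended.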
> + ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
> + mctp_pcc_setup);
> + if (!ndev)
> + return -ENOMEM;
> +
> + mctp_pcc_ndev = netdev_priv(ndev);
> +
> + /* inbox initialization */
> + rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
> + context.inbox_index);
> + if (rc)
> + goto free_netdev;
> +
> + mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
> +
> + /* outbox initialization */
> + rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
> + context.outbox_index);
> + if (rc)
> + goto free_netdev;
> +
> + mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
> + mctp_pcc_ndev->acpi_device = acpi_dev;
> + mctp_pcc_ndev->ndev = ndev;
> + acpi_dev->driver_data = mctp_pcc_ndev;
> +
> + /* ndev needs to be freed before the iomemory (mapped above) gets
> + * unmapped, devm resources get freed in reverse to the order they
> + * are added.
> + */
> + rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
> + if (rc)
> + goto free_netdev;
> +
> + return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
> +free_netdev:
> + free_netdev(ndev);
> + return rc;
> +}
> +
> +static const struct acpi_device_id mctp_pcc_device_ids[] = {
> + { "DMT0001" },
> + {}
> +};
> +
> +static struct acpi_driver mctp_pcc_driver = {
> + .name = "mctp_pcc",
> + .class = "Unknown",
> + .ids = mctp_pcc_device_ids,
> + .ops = {
> + .add = mctp_pcc_driver_add,
> + },
> +};
> +
> +module_acpi_driver(mctp_pcc_driver);
> +
> +MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
> +
> +MODULE_DESCRIPTION("MCTP PCC ACPI device");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Adam Young <admiyo@...amperecomputing.com>");
Thanks,
Alok