Message-ID: <a01f2ed55c69fc22dac9c8e5c2e84b557346aa4d.camel@codeconstruct.com.au>
Date: Mon, 14 Jul 2025 16:54:27 +0800
From: Jeremy Kerr <jk@...econstruct.com.au>
To: aspeedyh <yh_chung@...eedtech.com>, matt@...econstruct.com.au,
andrew+netdev@...n.ch, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, bmc-sw@...eedtech.com
Cc: Khang D Nguyen <khangng@...eremail.onmicrosoft.com>
Subject: Re: [PATCH] net: mctp: Add MCTP PCIe VDM transport driver

Hi YH,

[+CC Khang]

I have some overall questions before we get into a full review of the
series, inline below.
> Add an implementation for DMTF DSP0238 MCTP PCIe VDM transport spec.
>
> Introduce struct mctp_pcie_vdm_dev to represent each PCIe VDM
> interface and its send/receive operations. Register a
> net_device with the MCTP core so packets traverse the standard
> networking socket API.
>
> Because there is no generic PCIe VDM bus framework in-tree, this
> driver provides a transport interface for lower layers to
> implement vendor-specific read/write callbacks.
Do we really need an abstraction for MCTP VDM drivers? How many are you
expecting? Can you point us to a client of the VDM abstraction?
There is some value in keeping consistency for the MCTP lladdr formats
across PCIe transports, but I'm not convinced we need a whole
abstraction layer for this.
> TX path uses a dedicated kernel thread and ptr_ring: skbs queued by the
> MCTP stack are enqueued on the ring and processed in-thread context.
Is this somehow more suitable than the existing netdev queues?
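For comparison, here is a rough sketch of what the plain netdev TX path
could look like - this assumes mctp_pcie_vdm_dev is your netdev private
data, and that the lower-layer write callback (write_vdm below, a
placeholder name) does not need to sleep:

/* Rough sketch only: assumes the lower-layer write cannot sleep, so the
 * frame can be pushed out directly from ndo_start_xmit and the stack's
 * own TX queue provides the buffering.
 */
static netdev_tx_t mctp_pcie_vdm_start_xmit(struct sk_buff *skb,
					    struct net_device *ndev)
{
	struct mctp_pcie_vdm_dev *mpv = netdev_priv(ndev);
	int rc;

	/* write_vdm is a placeholder for your vendor write callback */
	rc = mpv->write_vdm(mpv, skb->data, skb->len);
	if (rc) {
		ndev->stats.tx_dropped++;
	} else {
		ndev->stats.tx_packets++;
		ndev->stats.tx_bytes += skb->len;
	}

	dev_kfree_skb_any(skb);
	return NETDEV_TX_OK;
}

If the write callback does have to sleep, that would be worth stating in
the commit message, since it's the main justification for the thread +
ring approach.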
> +struct mctp_pcie_vdm_route_info {
> + u8 eid;
> + u8 dirty;
> + u16 bdf_addr;
> + struct hlist_node hnode;
> +};
Why are you keeping your own routing table in the transport driver? We
already have the route and neighbour tables in the MCTP core code.
Your assumption that you can intercept MCTP control messages to keep a
separate routing table will not work.
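For reference, the core hands the resolved lladdr (from its own neighbour
table) to the driver through the header_ops->create callback, much as the
existing transports do. A sketch, with placeholder names and your current
two-byte bdf lladdr assumed:

static int mctp_pcie_vdm_header_create(struct sk_buff *skb,
				       struct net_device *dev,
				       unsigned short type,
				       const void *daddr, const void *saddr,
				       unsigned int len)
{
	/* daddr is the peer lladdr resolved by the MCTP core neighbour
	 * table; no driver-local EID-to-bdf lookup is needed. Assumes
	 * hard_header_len/needed_headroom reserve room for the push.
	 */
	memcpy(skb_push(skb, dev->addr_len), daddr, dev->addr_len);
	return 0;
}

static const struct header_ops mctp_pcie_vdm_header_ops = {
	.create	= mctp_pcie_vdm_header_create,
};

The EID-to-lladdr mapping then lives in the core neighbour table, managed
over netlink from userspace, rather than being rebuilt from intercepted
control messages.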
> +static void mctp_pcie_vdm_net_setup(struct net_device *ndev)
> +{
> + ndev->type = ARPHRD_MCTP;
> +
> + ndev->mtu = MCTP_PCIE_VDM_MIN_MTU;
> + ndev->min_mtu = MCTP_PCIE_VDM_MIN_MTU;
> + ndev->max_mtu = MCTP_PCIE_VDM_MAX_MTU;
> + ndev->tx_queue_len = MCTP_PCIE_VDM_NET_DEV_TX_QUEUE_LEN;
> + ndev->addr_len = 2; //PCIe bdf is 2 bytes
One of the critical things to get right is the lladdr format for PCIe
VDM devices, as it will be visible to userspace. While the PCIe b/d/fn
data is indeed two bytes, the MCTP addressing format for VDM data is not
entirely representable in two bytes: we also have the routing type to
encode. DSP0238 requires us to use Route to Root Complex and Broadcast
from Root Complex for certain messages, so we need some way to represent
that in your proposed lladdr format.
For this reason, you may want to encode this as [type, bdfn] data.
In flit mode there is also the segment byte to encode.
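As a strawman only - exact layout, naming and endianness are all open -
something like:

struct mctp_pcie_vdm_lladdr {
	u8	routing;	/* route-to-RC / route-by-ID / broadcast-from-RC */
	u8	segment;	/* flit mode segment; zero otherwise */
	__be16	bdfn;		/* bus/device/function */
} __packed;

with ndev->addr_len then being sizeof() that struct, rather than a bare 2.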
Cheers,
Jeremy