lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <59262ef3dd094777175dedae9f864f747947f50d.camel@codeconstruct.com.au>
Date: Tue, 14 May 2024 13:24:26 +0800
From: Jeremy Kerr <jk@...econstruct.com.au>
To: admiyo@...amperecomputing.com, Matt Johnston
 <matt@...econstruct.com.au>,  "David S. Miller" <davem@...emloft.net>, Eric
 Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
 Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] mctp pcc: Implement MCTP over PCC Transport

Hi Adam,

Nice work on sending this out.

I have some comments inline, but in general, +1 for others' feedback on
formatting and warning checks there. The checkpatch, NIPA and W=1 checks
should help out with addressing those, so I'll focus more on the MCTP /
net side.

Some of these are just my unfamiliarity with the hardware mechanism,
some are more fixes.

> Implementation of DMTF DSP:0292
> Management Control Transport Protocol(MCTP)  over
> Platform Communication Channel(PCC)

Could you add a bit on the platform support / requirements here (or
maybe in the Kconfig help)? I figure this is entirely ACPI-based, is
that correct?

> --- a/drivers/net/mctp/Kconfig
> +++ b/drivers/net/mctp/Kconfig
> @@ -42,6 +42,18 @@ config MCTP_TRANSPORT_I3C
>           A MCTP protocol network device is created for each I3C bus
>           having a "mctp-controller" devicetree property.
>  
> +config MCTP_TRANSPORT_PCC
> +       tristate "MCTP  PCC transport"
> +       select MCTP_FLOWS

Do you need MCTP_FLOWS here? I can't see any reference to using the flow
data.

Also, you probably need a dependency on ACPI.

> +       help
> +         Provides a driver to access MCTP devices over PCC transport,
> +         A MCTP protocol network device is created for each entry int the DST/SDST

s/int/in/

> +         that matches the identifier.  The PCC channels are selected from the
> +         corresponding entries in the PCCT.
> +
> +         Say y here if you need to connect to MCTP endpoints over PCC. To
> +         compile as a module, use m; the module will be called mctp-pcc.
> +
>  endmenu
>  
>  endif

[...]

> diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
> new file mode 100644
> index 000000000000..7242eedd2759
> --- /dev/null
> +++ b/drivers/net/mctp/mctp-pcc.c
> @@ -0,0 +1,368 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * mctp_pcc.c - Driver for MCTP over PCC.

Minor: mctp-pcc.c is the filename.

> +struct mctp_pcc_hdr {
> +       u32 signature;
> +       u32  flags;
> +       u32 length;
> +       char mctp_signature[4];
> +};
> +
> +struct mctp_pcc_packet {
> +       struct mctp_pcc_hdr pcc_header;
> +       union {
> +               struct mctp_hdr     mctp_header;
> +               unsigned char header_data[sizeof(struct mctp_hdr)];

Is this worth the union? You're really referencing the header + body
from your memcpy_toio(x->header_data), in which case you'd want the
payload covered by that pointer too.

Might be easier to just take the addresses as required, or you could
drop this struct altogether (see below...)

> +       };
> +       unsigned char payload[MCTP_PAYLOAD_LENGTH];
> +};
> +
> +struct mctp_pcc_hw_addr {
> +       int inbox_index;
> +       int outbox_index;
> +};
> +
> +/* The netdev structure. One of these per PCC adapter. */
> +struct mctp_pcc_ndev {
> +       struct list_head head;

Super minor, but this isn't the head of the list; better to just call
this something like 'list' or 'entry' or 'node'.

> +       /* spinlock to serialize access from netdev to pcc buffer*/
> +       spinlock_t lock;

I'd suggest listing which members should be protected by that lock.
>
> +       struct mctp_dev mdev;
> +       struct acpi_device *acpi_device;
> +       struct pcc_mbox_chan *in_chan;
> +       struct pcc_mbox_chan *out_chan;
> +       struct mbox_client outbox_client;
> +       struct mbox_client inbox_client;
> +       void __iomem *pcc_comm_inbox_addr;
> +       void __iomem *pcc_comm_outbox_addr;
> +       struct mctp_pcc_hw_addr hw_addr;
> +       void (*cleanup_channel)(struct pcc_mbox_chan *in_chan);
> +};
> +
> +static struct list_head mctp_pcc_ndevs;
> +
> +static struct mctp_pcc_packet *mctp_pcc_extract_data(struct sk_buff *old_skb,
> +                                                    void *buffer, int outbox_index)

You're mixing address spaces a little here. Looks like buffer should be
void __iomem *, given you're using the io accessors in the function?

[maybe do a sparse build too, to catch these]

This seems to be sending the packet to hardware, so extract_data seems
an odd name for the function.

> +{
> +       struct mctp_pcc_packet *mpp;
> +
> +       mpp = buffer;
> +       writel(PCC_MAGIC | outbox_index, &mpp->pcc_header.signature);
> +       writel(0x1, &mpp->pcc_header.flags);
> +       memcpy_toio(mpp->pcc_header.mctp_signature, MCTP_SIGNATURE, SIGNATURE_LENGTH);
> +       writel(old_skb->len + SIGNATURE_LENGTH,  &mpp->pcc_header.length);
> +       memcpy_toio(mpp->header_data,    old_skb->data, old_skb->len);

You're writing the pcc_header fields individually, then writing the rest
of the mctp header + payload at once. Maybe it would make more sense to
drop struct mctp_pcc_packet, write a mctp_pcc_hdr, then write the packet
data?

> +       return mpp;

So this is just returning the input argument, is that necessary?

> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *)
> +{
> +       struct sk_buff *skb;
> +       struct mctp_pcc_packet *mpp;
> +       struct mctp_skb_cb *cb;
> +       int data_len;
> +       unsigned long buf_ptr_val;
> +       struct mctp_pcc_ndev *mctp_pcc_dev = container_of(c, struct mctp_pcc_ndev, inbox_client);
> +       void *skb_buf;
> +
> +       mpp = (struct mctp_pcc_packet *)mctp_pcc_dev->pcc_comm_inbox_addr;
> +       buf_ptr_val = (unsigned long)mpp;
> +       data_len = readl(&mpp->pcc_header.length) + MCTP_HEADER_LENGTH;
> +       skb = netdev_alloc_skb(mctp_pcc_dev->mdev.dev, data_len);
> +       if (!skb) {
> +               mctp_pcc_dev->mdev.dev->stats.rx_dropped++;
> +               return;
> +       }
> +       skb->protocol = htons(ETH_P_MCTP);
> +       skb_buf = skb_put(skb, data_len);
> +       memcpy_fromio(skb_buf, mpp, data_len);
> +       skb_reset_mac_header(skb);
> +       skb_pull(skb, sizeof(struct mctp_pcc_hdr));

Any reason for including the mpp header in the skb data in the first
place? Not necessarily an issue, but you may want to consider how that
can be symmetrical with outgoing packets, and whether it's represented
in dev->hard_header_len.

My guess is that the mpp header doesn't need to be in the skb at all, as
it seems more like hardware buffer layout. But you may have better
designs!

> +       skb_reset_network_header(skb);
> +       cb = __mctp_cb(skb);
> +       cb->halen = 0;
> +       skb->dev =  mctp_pcc_dev->mdev.dev;
> +       netif_rx(skb);
> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> +       unsigned char *buffer;
> +       struct mctp_pcc_ndev *mpnd;
> +       struct mctp_pcc_packet  *mpp;
> +       unsigned long flags;
> +       int rc;
> +
> +       netif_stop_queue(ndev);
> +       ndev->stats.tx_bytes += skb->len;

stats.tx_packets too?

and what about the failure case?

> +       mpnd = (struct mctp_pcc_ndev *)netdev_priv(ndev);
> +       spin_lock_irqsave(&mpnd->lock, flags);
> +       buffer =  mpnd->pcc_comm_outbox_addr;
> +       mpp = mctp_pcc_extract_data(skb, mpnd->pcc_comm_outbox_addr, mpnd->hw_addr.outbox_index);
> +       rc = mpnd->out_chan->mchan->mbox->ops->send_data(mpnd->out_chan->mchan, mpp);
> +       spin_unlock_irqrestore(&mpnd->lock, flags);
> +
> +       dev_consume_skb_any(skb);
> +       netif_start_queue(ndev);

This all looks fairly atomic, do we need to stop / start the queues here?

Or: do we know that the IO region is safe to re-use immediately after
send_data has returned? I'm unfamiliar with the mailbox interface.

> +       if (!rc)
> +               return NETDEV_TX_OK;
> +       return NETDEV_TX_BUSY;
> +}
> +
> +static const struct net_device_ops mctp_pcc_netdev_ops = {
> +       .ndo_start_xmit = mctp_pcc_tx,
> +       .ndo_uninit = NULL
> +};
> +
> +static void  mctp_pcc_setup(struct net_device *ndev)
> +{
> +       ndev->type = ARPHRD_MCTP;
> +       ndev->hard_header_len = 0;
> +       ndev->addr_len = sizeof(struct mctp_pcc_hw_addr);

Do you have any physical addressing on the PCC channels? Or is it
point-to-point?

If it's point to point, you probably don't need the device hardware
addresses at all. Those seem to be synthetic in this driver, rather than
based on a specified addressing scheme. In which case, it may be better
to *not* include the hw addr on the device.

> +       ndev->tx_queue_len = DEFAULT_TX_QUEUE_LEN;
> +       ndev->flags = IFF_NOARP;
> +       ndev->netdev_ops = &mctp_pcc_netdev_ops;
> +       ndev->needs_free_netdev = true;
> +}
> +
> +static int create_mctp_pcc_netdev(struct acpi_device *acpi_dev,
> +                                 struct device *dev, int inbox_index,
> +                                 int outbox_index)
> +{
> +       int rc;
> +       int mctp_pcc_mtu;
> +       char name[32];
> +       struct net_device *ndev;
> +       struct mctp_pcc_ndev *mctp_pcc_dev;
> +       struct mctp_pcc_hw_addr physical_link_addr;
> +
> +       snprintf(name, sizeof(name), "mctpipcc%x", inbox_index);

What does the 'i' stand for? ('ipcc'?)

Granted, this is probably more readable than mctppcc :D

.. and it's convention to use %d there rather than %x, lest you get
another 'c' from the twelfth index.

> +       ndev = alloc_netdev(sizeof(struct mctp_pcc_ndev), name, NET_NAME_ENUM, mctp_pcc_setup);
> +       if (!ndev)
> +               return -ENOMEM;
> +       mctp_pcc_dev = (struct mctp_pcc_ndev *)netdev_priv(ndev);
> +       INIT_LIST_HEAD(&mctp_pcc_dev->head);
> +       spin_lock_init(&mctp_pcc_dev->lock);
> +
> +       mctp_pcc_dev->outbox_client.tx_prepare = NULL;
> +       mctp_pcc_dev->outbox_client.tx_done = NULL;

These will already be zeroed.

> +       mctp_pcc_dev->hw_addr.inbox_index = inbox_index;
> +       mctp_pcc_dev->hw_addr.outbox_index = outbox_index;
> +       mctp_pcc_dev->inbox_client.rx_callback = mctp_pcc_client_rx_callback;
> +       mctp_pcc_dev->cleanup_channel = pcc_mbox_free_channel;
> +       mctp_pcc_dev->out_chan =
> +               pcc_mbox_request_channel(&mctp_pcc_dev->outbox_client,
> +                                        outbox_index);
> +       if (IS_ERR(mctp_pcc_dev->out_chan)) {
> +               rc = PTR_ERR(mctp_pcc_dev->out_chan);
> +               goto free_netdev;
> +       }
> +       mctp_pcc_dev->in_chan =
> +               pcc_mbox_request_channel(&mctp_pcc_dev->inbox_client,
> +                                        inbox_index);
> +       if (IS_ERR(mctp_pcc_dev->in_chan)) {
> +               rc = PTR_ERR(mctp_pcc_dev->in_chan);
> +               goto cleanup_out_channel;
> +       }
> +       mctp_pcc_dev->pcc_comm_inbox_addr =
> +               devm_ioremap(dev, mctp_pcc_dev->in_chan->shmem_base_addr,
> +                            mctp_pcc_dev->in_chan->shmem_size);
> +       if (!mctp_pcc_dev->pcc_comm_inbox_addr) {
> +               rc = -EINVAL;
> +               goto cleanup_in_channel;
> +       }
> +       mctp_pcc_dev->pcc_comm_outbox_addr =
> +               devm_ioremap(dev, mctp_pcc_dev->out_chan->shmem_base_addr,
> +                            mctp_pcc_dev->out_chan->shmem_size);
> +       if (!mctp_pcc_dev->pcc_comm_outbox_addr) {
> +               rc = -EINVAL;
> +               goto cleanup_in_channel;
> +       }
> +       mctp_pcc_dev->acpi_device = acpi_dev;
> +       mctp_pcc_dev->inbox_client.dev = dev;
> +       mctp_pcc_dev->outbox_client.dev = dev;
> +       mctp_pcc_dev->mdev.dev = ndev;
> +
> +/* There is no clean way to pass the MTU to the callback function
> + * used for registration, so set the values ahead of time.
> + */
> +       mctp_pcc_mtu = mctp_pcc_dev->out_chan->shmem_size -
> +               sizeof(struct mctp_pcc_hdr);
> +       ndev->mtu = mctp_pcc_mtu;

So there's a bit of compatibility consideration here. Is it guaranteed
that the other end of this link supports the MTU you're initially
setting here? If so, defaulting to that size should be fine.

However, if *not*: I'd suggest defaulting to the MCTP baseline MTU (68),
and allowing upper layers to increase once we have established the
capabilities of the remote endpoint.

> +       ndev->max_mtu = mctp_pcc_mtu;
> +       ndev->min_mtu = MCTP_MIN_MTU;
> +
> +       physical_link_addr.inbox_index =
> +               htonl(mctp_pcc_dev->hw_addr.inbox_index);
> +       physical_link_addr.outbox_index =
> +               htonl(mctp_pcc_dev->hw_addr.outbox_index);
> +       dev_addr_set(ndev, (const u8 *)&physical_link_addr);

Related to the physical addressing query above; it might be that you
don't need any addresses here.

> +       rc = register_netdev(ndev);
> +       if (rc)
> +               goto cleanup_in_channel;
> +       list_add_tail(&mctp_pcc_dev->head, &mctp_pcc_ndevs);
> +       return 0;
> +cleanup_in_channel:
> +       mctp_pcc_dev->cleanup_channel(mctp_pcc_dev->in_chan);
> +cleanup_out_channel:
> +       mctp_pcc_dev->cleanup_channel(mctp_pcc_dev->out_chan);
> +free_netdev:
> +       unregister_netdev(ndev);
> +       free_netdev(ndev);
> +       return rc;
> +}
> +
> +struct lookup_context {
> +       int index;
> +       int inbox_index;
> +       int outbox_index;
> +};
> +
> +static acpi_status lookup_pcct_indices(struct acpi_resource *ares, void *context)
> +{
> +       struct acpi_resource_address32 *addr;
> +       struct lookup_context *luc = context;
> +
> +       switch (ares->type) {
> +       case 0x0c:
> +       case 0x0a:
> +               break;
> +       default:
> +               return AE_OK;
> +       }
> +
> +       addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
> +       switch (luc->index) {
> +       case 0:
> +               luc->outbox_index = addr[0].address.minimum;
> +               break;
> +       case 1:
> +               luc->inbox_index = addr[0].address.minimum;
> +               break;
> +       }
> +       luc->index++;
> +       return AE_OK;
> +}
> +
> +static int mctp_pcc_driver_add(struct acpi_device *adev)
> +{
> +       int inbox_index;
> +       int outbox_index;
> +       acpi_handle dev_handle;
> +       acpi_status status;
> +       struct lookup_context context = {0, 0, 0};
> +
> +       dev_info(&adev->dev, "Adding mctp_pcc device for HID  %s\n", acpi_device_hid(adev));
> +       dev_handle = acpi_device_handle(adev);
> +       status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices, &context);
> +       if (ACPI_SUCCESS(status)) {
> +               inbox_index = context.inbox_index;
> +               outbox_index = context.outbox_index;
> +               return create_mctp_pcc_netdev(adev, &adev->dev, inbox_index, outbox_index);
> +       }
> +       dev_err(&adev->dev, "FAILURE to lookup PCC indexes from CRS");
> +       return -EINVAL;
> +};

I'd suggest flipping the conditional there, making it the error path:

          if (!ACPI_SUCCESS(status)) {
                  dev_err(&adev->dev, "FAILURE to lookup PCC indexes from CRS");
                  return -EINVAL;
          }

          inbox_index = context.inbox_index;
          outbox_index = context.outbox_index;
          return create_mctp_pcc_netdev(adev, &adev->dev, inbox_index, outbox_index);

> +
> +/* pass in adev=NULL to remove all devices
> + */

Will that ever be needed? You should have all of the devices unbound
before module exit, no?

[not that I have much experience with ACPI at all...]

> +static void mctp_pcc_driver_remove(struct acpi_device *adev)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_dev = NULL;
> +       struct list_head *ptr;
> +       struct list_head *tmp;
> +
> +       list_for_each_safe(ptr, tmp, &mctp_pcc_ndevs) {
> +               mctp_pcc_dev = list_entry(ptr, struct mctp_pcc_ndev, head);
> +               if (!adev || mctp_pcc_dev->acpi_device == adev) {


Possibly tidier to flip the condition here too, leaving fewer indents:

                  if (!adev || mctp_pxx_dev->acpi_device != adev)
                          continue;

> +                       struct net_device *ndev;
> +
> +                       mctp_pcc_dev->cleanup_channel(mctp_pcc_dev->out_chan);
> +                       mctp_pcc_dev->cleanup_channel(mctp_pcc_dev->in_chan);
> +                       ndev = mctp_pcc_dev->mdev.dev;
> +                       if (ndev)
> +                               mctp_unregister_netdev(ndev);
> +                       list_del(ptr);
> +                       if (adev)
> +                               break;
> +               }
> +       }
> +};
> +
> +static const struct acpi_device_id mctp_pcc_device_ids[] = {
> +       { "DMT0001", 0},
> +       { "", 0},
> +};
> +
> +static struct acpi_driver mctp_pcc_driver = {
> +       .name = "mctp_pcc",
> +       .class = "Unknown",
> +       .ids = mctp_pcc_device_ids,
> +       .ops = {
> +               .add = mctp_pcc_driver_add,
> +               .remove = mctp_pcc_driver_remove,
> +               .notify = NULL,
> +       },
> +       .owner = THIS_MODULE,
> +
> +};
> +
> +static int __init mctp_pcc_mod_init(void)
> +{
> +       int rc;
> +
> +       pr_info("initializing MCTP over PCC\n");
> +       INIT_LIST_HEAD(&mctp_pcc_ndevs);
> +       rc = acpi_bus_register_driver(&mctp_pcc_driver);
> +       if (rc < 0)
> +               ACPI_DEBUG_PRINT((ACPI_DB_ERROR, "Error registering driver\n"));
> +       return rc;
> +}
> +
> +static __exit void mctp_pcc_mod_exit(void)
> +{
> +       pr_info("Removing MCTP over PCC transport driver\n");
> +       mctp_pcc_driver_remove(NULL);
> +       acpi_bus_unregister_driver(&mctp_pcc_driver);
> +}
> +
> +module_init(mctp_pcc_mod_init);
> +module_exit(mctp_pcc_mod_exit);

This all looks fairly boilerplate, can you use module_acpi_driver()?
(plus a static initialiser for mctp_pcc_ndevs)

Overall, thanks for the contributions, with a few changes I think the
transport should be good to go.

Cheers,


Jeremy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ