lists.openwall.net - Open Source and information security mailing list archives
Message-ID: <3554c773862e48c60fc07c4c0a09ef6ace422840.camel@gmail.com>
Date:   Tue, 27 Nov 2018 16:26:53 -0800
From:   Dalon Westergreen <dwesterg@...il.com>
To:     thor.thayer@...ux.intel.com, netdev@...r.kernel.org,
        dinguyen@...nel.org, dalon.westergreen@...ux.intel.com
Subject: Re: [PATCH net-next 7/8] net: eth: altera: tse: add msgdma
 prefetcher

On Fri, 2018-11-16 at 09:20 -0600, Thor Thayer wrote:
> Hi Dalon,
> 
> Just a few comments/questions.
> 
> On 11/14/18 6:50 PM, Dalon Westergreen wrote:
> > From: Dalon Westergreen <dalon.westergreen@...el.com>
> > 
> > Add support for the mSGDMA prefetcher.  The prefetcher adds support
> > for a linked list of descriptors in system memory.  The prefetcher
> > feeds these to the mSGDMA dispatcher.
> > 
> > The prefetcher is configured to poll for the next descriptor in the
> > list to be owned by hardware, then pass the descriptor to the
> > dispatcher.  It will then poll the next descriptor until it is
> > owned by hardware.
> > 
> > The dispatcher responses are written back to the appropriate
> > descriptor, and the owned by hardware bit is cleared.
> > 
> > The driver sets up a linked list twice the tx and rx ring sizes,
> > with the last descriptor pointing back to the first.  This ensures
> > that the ring of descriptors will always have inactive descriptors
> > preventing the prefetcher from looping over and reusing descriptors
> > inappropriately.  The prefetcher will continuously loop over these
> > descriptors.  The driver modifies descriptors as required to update
> > the skb address and length as well as the owned by hardware bit.
> > 
> > In addition to the above, the mSGDMA prefetcher can be used to
> > handle rx and tx timestamps coming from the ethernet ip.  These
> > can be included in the prefetcher response in the descriptor.
> > 
> > Signed-off-by: Dalon Westergreen <dalon.westergreen@...el.com>
> > ---
> >   drivers/net/ethernet/altera/Makefile          |   2 +-
> >   .../altera/altera_msgdma_prefetcher.c         | 433 ++++++++++++++++++
> >   .../altera/altera_msgdma_prefetcher.h         |  30 ++
> >   .../altera/altera_msgdmahw_prefetcher.h       |  87 ++++
> >   drivers/net/ethernet/altera/altera_tse.h      |  14 +
> >   drivers/net/ethernet/altera/altera_tse_main.c |  51 +++
> >   6 files changed, 616 insertions(+), 1 deletion(-)
> >   create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> >   create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.h
> >   create mode 100644
> > drivers/net/ethernet/altera/altera_msgdmahw_prefetcher.h
> > 
> > diff --git a/drivers/net/ethernet/altera/Makefile
> > b/drivers/net/ethernet/altera/Makefile
> > index ad80be42fa26..73b32876f126 100644
> > --- a/drivers/net/ethernet/altera/Makefile
> > +++ b/drivers/net/ethernet/altera/Makefile
> > @@ -5,4 +5,4 @@
> >   obj-$(CONFIG_ALTERA_TSE) += altera_tse.o
> >   altera_tse-objs := altera_tse_main.o altera_tse_ethtool.o \
> >   		   altera_msgdma.o altera_sgdma.o altera_utils.o \
> > -		   altera_ptp.o
> > +		   altera_ptp.o altera_msgdma_prefetcher.o
> > diff --git a/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > new file mode 100644
> > index 000000000000..55b475e9e15b
> > --- /dev/null
> > +++ b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > @@ -0,0 +1,433 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* MSGDMA Prefetcher driver for Altera ethernet devices
> > + *
> > + * Copyright (C) 2018 Intel Corporation. All rights reserved.
> > + * Author(s):
> > + *   Dalon Westergreen <dalon.westergreen@...el.com>
> > + */
> > +
> > +#include <linux/list.h>
> > +#include <linux/netdevice.h>
> > +#include <linux/net_tstamp.h>
> > +#include "altera_utils.h"
> > +#include "altera_tse.h"
> > +#include "altera_msgdma.h"
> > +#include "altera_msgdmahw.h"
> > +#include "altera_msgdma_prefetcher.h"
> > +#include "altera_msgdmahw_prefetcher.h"
> 
> These could be alphabetized - tse and utils at the end.

Sure thing.
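For reference, alphabetized with the tse and utils headers moved to the end, the include block would become (kernel headers, obviously not compilable standalone):

```c
#include <linux/list.h>
#include <linux/net_tstamp.h>
#include <linux/netdevice.h>
#include "altera_msgdma.h"
#include "altera_msgdma_prefetcher.h"
#include "altera_msgdmahw.h"
#include "altera_msgdmahw_prefetcher.h"
#include "altera_tse.h"
#include "altera_utils.h"
```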

> > +
> > +int msgdma_pref_initialize(struct altera_tse_private *priv)
> > +{
> > +	int i;
> > +	struct msgdma_pref_extended_desc *rx_descs;
> > +	struct msgdma_pref_extended_desc *tx_descs;
> > +	dma_addr_t rx_descsphys;
> > +	dma_addr_t tx_descsphys;
> > +	u32 rx_ring_size;
> > +	u32 tx_ring_size;
> > +
> > +	priv->pref_rxdescphys = (dma_addr_t)0;
> > +	priv->pref_txdescphys = (dma_addr_t)0;
> > +
> > +	/* we need to allocate more pref descriptors than ringsize, for now
> > +	 * just double ringsize
> > +	 */
> > +	rx_ring_size = priv->rx_ring_size * 2;
> > +	tx_ring_size = priv->tx_ring_size * 2;
> > +
> > +	/* The prefetcher requires the descriptors to be aligned to the
> > +	 * descriptor read/write master's data width which worst case is
> > +	 * 512 bits.  Currently we DO NOT CHECK THIS and only support 32-bit
> > +	 * prefetcher masters.
> > +	 */
> > +
> > +	/* allocate memory for rx descriptors */
> > +	priv->pref_rxdesc =
> > +		dma_zalloc_coherent(priv->device,
> > +				    sizeof(struct msgdma_pref_extended_desc)
> > +				    * rx_ring_size,
> > +				    &priv->pref_rxdescphys, GFP_KERNEL);
> > +
> > +	if (!priv->pref_rxdesc)
> > +		goto err_rx;
> > +
> > +	/* allocate memory for tx descriptors */
> > +	priv->pref_txdesc =
> > +		dma_zalloc_coherent(priv->device,
> > +				    sizeof(struct msgdma_pref_extended_desc)
> > +				    * tx_ring_size,
> > +				    &priv->pref_txdescphys, GFP_KERNEL);
> > +
> > +	if (!priv->pref_txdesc)
> > +		goto err_tx;
> > +
> > +	/* setup base descriptor ring for tx & rx */
> > +	rx_descs = (struct msgdma_pref_extended_desc *)priv->pref_rxdesc;
> > +	tx_descs = (struct msgdma_pref_extended_desc *)priv->pref_txdesc;
> > +	tx_descsphys = priv->pref_txdescphys;
> > +	rx_descsphys = priv->pref_rxdescphys;
> > +
> > +	/* setup RX descriptors */
> > +	priv->pref_rx_prod = 0;
> > +	for (i = 0; i < rx_ring_size; i++) {
> > +		rx_descsphys = priv->pref_rxdescphys +
> > +			(((i + 1) % rx_ring_size) *
> > +			sizeof(struct msgdma_pref_extended_desc));
> > +		rx_descs[i].next_desc_lo = lower_32_bits(rx_descsphys);
> > +		rx_descs[i].next_desc_hi = upper_32_bits(rx_descsphys);
> > +		rx_descs[i].stride = MSGDMA_DESC_RX_STRIDE;
> > +		/* burst set to 0 so it defaults to max configured */
> > +		/* set seq number to desc number */
> > +		rx_descs[i].burst_seq_num = i;
> > +	}
> > +
> > +	/* setup TX descriptors */
> > +	for (i = 0; i < tx_ring_size; i++) {
> > +		tx_descsphys = priv->pref_txdescphys +
> > +			(((i + 1) % tx_ring_size) *
> > +			sizeof(struct msgdma_pref_extended_desc));
> > +		tx_descs[i].next_desc_lo = lower_32_bits(tx_descsphys);
> > +		tx_descs[i].next_desc_hi = upper_32_bits(tx_descsphys);
> > +		tx_descs[i].stride = MSGDMA_DESC_TX_STRIDE;
> > +		/* burst set to 0 so it defaults to max configured */
> > +		/* set seq number to desc number */
> > +		tx_descs[i].burst_seq_num = i;
> > +	}
> > +
> > +	if (netif_msg_ifup(priv))
> > +		netdev_info(priv->dev, "%s: RX Desc mem at 0x%x\n", __func__,
> > +			    priv->pref_rxdescphys);
> > +
> > +	if (netif_msg_ifup(priv))
> > +		netdev_info(priv->dev, "%s: TX Desc mem at 0x%x\n", __func__,
> > +			    priv->pref_txdescphys);
> > +
> > +	return 0;
> > +
> > +err_tx:
> > +	dma_free_coherent(priv->device,
> > +			  sizeof(struct msgdma_pref_extended_desc)
> > +			  * rx_ring_size,
> > +			  priv->pref_rxdesc, priv->pref_rxdescphys);
> > +err_rx:
> > +	return -ENOMEM;
> > +}
> > +
> > +void msgdma_pref_uninitialize(struct altera_tse_private *priv)
> > +{
> > +	if (priv->pref_rxdesc)
> > +		dma_free_coherent(priv->device,
> > +				  sizeof(struct msgdma_pref_extended_desc)
> > +				  * priv->rx_ring_size * 2,
> > +				  priv->pref_rxdesc, priv->pref_rxdescphys);
> > +
> > +	if (priv->pref_txdesc)
> > +		dma_free_coherent(priv->device,
> > +				  sizeof(struct msgdma_pref_extended_desc)
> > +				  * priv->rx_ring_size * 2,
> > +				  priv->pref_txdesc, priv->pref_txdescphys);
> 
> Why does this have the ring_size*2 but the error path in 
> msgdma_pref_initialize() above (err_tx path) doesn't?

msgdma_pref_initialize() uses a local rx_ring_size, which is already 2 *
priv->rx_ring_size.  I'll make that clearer, and also fix the other error
above, where priv->rx_ring_size is being used for the tx ring size.

> 
> > +}
> > +
> > +void msgdma_pref_enable_txirq(struct altera_tse_private *priv)
> > +{
> > +	tse_set_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> > +		    MSGDMA_PREF_CTL_GLOBAL_INTR);
> > +}
> > +
> 
> <snip>
> 
> > +
> > +void msgdma_pref_reset(struct altera_tse_private *priv)
> > +{
> > +	int counter;
> > +
> > +	/* turn off polling */
> > +	tse_clear_bit(priv->rx_pref_csr, msgdma_pref_csroffs(control),
> > +		      MSGDMA_PREF_CTL_DESC_POLL_EN);
> > +	tse_clear_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> > +		      MSGDMA_PREF_CTL_DESC_POLL_EN);
> > +
> > +	/* Reset the RX Prefetcher */
> > +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->rx_pref_csr,
> > +		msgdma_pref_csroffs(status));
> > +	csrwr32(MSGDMA_PREF_CTL_RESET, priv->rx_pref_csr,
> > +		msgdma_pref_csroffs(control));
> > +
> > +	counter = 0;
> > +	while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> > +		if (tse_bit_is_clear(priv->rx_pref_csr,
> > +				     msgdma_pref_csroffs(control),
> > +				     MSGDMA_PREF_CTL_RESET))
> > +			break;
> > +		udelay(1);
> > +	}
> > +
> > +	if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> > +		netif_warn(priv, drv, priv->dev,
> > +			   "TSE Rx Prefetcher reset bit never cleared!\n");
> > +
> I take it there are no negative consequences for the reset bit not 
> clearing? Would it be useful to return an error?

I will look into what I can do with this; it's no different from the
currently implemented msgdma code, not that that's an excuse.

> 
> > +	/* clear all status bits */
> > +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +		msgdma_pref_csroffs(status));
> > +
> 
> This looks the same as below. Are they the same or did I miss something?

Nope, that's an error. Thanks.

> 
> > +	/* Reset the TX Prefetcher */
> > +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +		msgdma_pref_csroffs(status));
> > +	csrwr32(MSGDMA_PREF_CTL_RESET, priv->tx_pref_csr,
> > +		msgdma_pref_csroffs(control));
> > +
> > +	counter = 0;
> > +	while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> > +		if (tse_bit_is_clear(priv->tx_pref_csr,
> > +				     msgdma_pref_csroffs(control),
> > +				     MSGDMA_PREF_CTL_RESET))
> > +			break;
> > +		udelay(1);
> > +	}
> > +
> > +	if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> > +		netif_warn(priv, drv, priv->dev,
> > +			   "TSE Tx Prefetcher reset bit never cleared!\n");
> > +
> Same as above.
> 
> > +	/* clear all status bits */
> > +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +		msgdma_pref_csroffs(status));
> > +
> > +	/* Reset mSGDMA dispatchers*/
> > +	msgdma_reset(priv);
> > +}
> > +
> 
> <snip>
> 
> > +
> > +/* Add MSGDMA Prefetcher Descriptor to descriptor list
> > + *   -> This should never be called when a descriptor isn't available
> > + */
> > +void msgdma_pref_add_rx_desc(struct altera_tse_private *priv,
> > +			     struct tse_buffer *rxbuffer)
> > +{
> > +	struct msgdma_pref_extended_desc *rx_descs = priv->pref_rxdesc;
> > +	u32 desc_entry = priv->pref_rx_prod % (priv->rx_ring_size * 2);
> > +
> > +	/* write descriptor entries */
> > +	rx_descs[desc_entry].len = priv->rx_dma_buf_sz;
> > +	rx_descs[desc_entry].write_addr_lo = lower_32_bits(rxbuffer->dma_addr);
> > +	rx_descs[desc_entry].write_addr_hi = upper_32_bits(rxbuffer->dma_addr);
> > +
> > +	/* set the control bits and set owned by hw */
> > +	rx_descs[desc_entry].desc_control = (MSGDMA_DESC_CTL_END_ON_EOP
> > +			| MSGDMA_DESC_CTL_END_ON_LEN
> > +			| MSGDMA_DESC_CTL_TR_COMP_IRQ
> > +			| MSGDMA_DESC_CTL_EARLY_IRQ
> > +			| MSGDMA_DESC_CTL_TR_ERR_IRQ
> > +			| MSGDMA_DESC_CTL_GO
> > +			| MSGDMA_PREF_DESC_CTL_OWNED_BY_HW);
> > +
> > +	/* we need to keep a separate one for rx as RX_DESCRIPTORS are
> > +	 * pre-configured at startup
> > +	 */
> > +	priv->pref_rx_prod++;
> 
> Can you explain more in the comment? What is "one"?

I will clean up the comment.  The rx path keeps its own copy
of pref_rx_prod since the pref ring size is 2 * the tse ring size.  I'll
also see how tricky it is to get rid of this copy.

> 
> > +
> > +	if (netif_msg_rx_status(priv)) {
> > +		netdev_info(priv->dev, "%s: desc: %d buf: %d control 0x%x\n",
> > +			    __func__, desc_entry,
> > +			    priv->rx_prod % priv->rx_ring_size,
> > +			    priv->pref_rxdesc[desc_entry].desc_control);
> > +	}
> > +}
> > +
> 
> <snip>
> 
> Thanks for the patches!

Nope, thank you for looking them over.
