lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 14 Oct 2020 16:08:00 +0000
From:   <Claudiu.Beznea@...rochip.com>
To:     <w@....eu>, <Nicolas.Ferre@...rochip.com>
CC:     <netdev@...r.kernel.org>, <davem@...emloft.net>, <kuba@...nel.org>,
        <daniel@...f.com>
Subject: Re: [PATCH net-next 3/3] macb: support the two tx descriptors on
 at91rm9200

Hi Willy,

On 11.10.2020 12:09, Willy Tarreau wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> The at91rm9200 variant used by a few chips including the MSC313 supports
> two Tx descriptors (one frame being serialized and another one queued).
> However the driver only implemented a single one, which adds a dead time
> after each transfer to receive and process the interrupt and wake the
> queue up, preventing from reaching line rate.
> 
> This patch implements a very basic 2-deep queue to address this limitation.
> The tests run on a Breadbee board equipped with an MSC313E show that at
> 1 GHz, HTTP traffic on medium-sized objects (45kB) was limited to exactly
> 50 Mbps before this patch, and jumped to 76 Mbps with this patch. And tests
> on a single TCP stream with an MTU of 576 jump from 10kpps to 15kpps. With
> 1500 byte packets it's now possible to reach line rate versus 75 Mbps
> before.
> 
> Cc: Nicolas Ferre <nicolas.ferre@...rochip.com>
> Cc: Claudiu Beznea <claudiu.beznea@...rochip.com>
> Cc: Daniel Palmer <daniel@...f.com>
> Signed-off-by: Willy Tarreau <w@....eu>
> ---
>  drivers/net/ethernet/cadence/macb.h      |  2 ++
>  drivers/net/ethernet/cadence/macb_main.c | 46 +++++++++++++++++++-----
>  2 files changed, 40 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index e87db95fb0f6..f8133003981f 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -1208,6 +1208,8 @@ struct macb {
> 
>         /* AT91RM9200 transmit queue (1 on wire + 1 queued) */
>         struct macb_tx_skb      rm9200_txq[2];
> +       unsigned int            rm9200_tx_tail;
> +       unsigned int            rm9200_tx_len;
>         unsigned int            max_tx_length;
> 
>         u64                     ethtool_stats[GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES];
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index ca6e5456906a..6ff8e4b0b95d 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -3909,6 +3909,7 @@ static int at91ether_start(struct macb *lp)
>                              MACB_BIT(ISR_TUND) |
>                              MACB_BIT(ISR_RLE)  |
>                              MACB_BIT(TCOMP)    |
> +                            MACB_BIT(RM9200_TBRE)      |
>                              MACB_BIT(ISR_ROVR) |
>                              MACB_BIT(HRESP));
> 
> @@ -3925,6 +3926,7 @@ static void at91ether_stop(struct macb *lp)
>                              MACB_BIT(ISR_TUND) |
>                              MACB_BIT(ISR_RLE)  |
>                              MACB_BIT(TCOMP)    |
> +                            MACB_BIT(RM9200_TBRE)      |
>                              MACB_BIT(ISR_ROVR) |
>                              MACB_BIT(HRESP));
> 
> @@ -3994,11 +3996,10 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff *skb,
>                                         struct net_device *dev)
>  {
>         struct macb *lp = netdev_priv(dev);
> +       unsigned long flags;
> 
> -       if (macb_readl(lp, TSR) & MACB_BIT(RM9200_BNQ)) {
> -               int desc = 0;
> -
> -               netif_stop_queue(dev);
> +       if (lp->rm9200_tx_len < 2) {
> +               int desc = lp->rm9200_tx_tail;

I think you also want to protect these reads with spin_lock() to avoid
concurrency with the interrupt handler.

> 
>                 /* Store packet information (to free when Tx completed) */
>                 lp->rm9200_txq[desc].skb = skb;
> @@ -4012,6 +4013,15 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff *skb,
>                         return NETDEV_TX_OK;
>                 }
> 
> +               spin_lock_irqsave(&lp->lock, flags);
> +
> +               lp->rm9200_tx_tail = (desc + 1) & 1;
> +               lp->rm9200_tx_len++;
> +               if (lp->rm9200_tx_len > 1)
> +                       netif_stop_queue(dev);
> +
> +               spin_unlock_irqrestore(&lp->lock, flags);
> +
>                 /* Set address of the data in the Transmit Address register */
>                 macb_writel(lp, TAR, lp->rm9200_txq[desc].mapping);
>                 /* Set length of the packet in the Transmit Control register */
> @@ -4077,6 +4087,8 @@ static irqreturn_t at91ether_interrupt(int irq, void *dev_id)
>         struct macb *lp = netdev_priv(dev);
>         u32 intstatus, ctl;
>         unsigned int desc;
> +       unsigned int qlen;
> +       u32 tsr;
> 
>         /* MAC Interrupt Status register indicates what interrupts are pending.
>          * It is automatically cleared once read.
> @@ -4088,21 +4100,39 @@ static irqreturn_t at91ether_interrupt(int irq, void *dev_id)
>                 at91ether_rx(dev);
> 
>         /* Transmit complete */
> -       if (intstatus & MACB_BIT(TCOMP)) {
> +       if (intstatus & (MACB_BIT(TCOMP) | MACB_BIT(RM9200_TBRE))) {
>                 /* The TCOM bit is set even if the transmission failed */
>                 if (intstatus & (MACB_BIT(ISR_TUND) | MACB_BIT(ISR_RLE)))
>                         dev->stats.tx_errors++;
> 
> -               desc = 0;
> -               if (lp->rm9200_txq[desc].skb) {
> +               spin_lock(&lp->lock);

Also, this lock could be moved before while, below, as you want to protect
the rm9200_tx_len and rm9200_tx_tails members of lp as I understand.

Thank you,
Claudiu Beznea

> +
> +               tsr = macb_readl(lp, TSR);
> +
> +               /* we have three possibilities here:
> +                *   - all pending packets transmitted (TGO, implies BNQ)
> +                *   - only first packet transmitted (!TGO && BNQ)
> +                *   - two frames pending (!TGO && !BNQ)
> +                * Note that TGO ("transmit go") is called "IDLE" on RM9200.
> +                */
> +               qlen = (tsr & MACB_BIT(TGO)) ? 0 :
> +                       (tsr & MACB_BIT(RM9200_BNQ)) ? 1 : 2;
> +> +               while (lp->rm9200_tx_len > qlen) {
> +                       desc = (lp->rm9200_tx_tail - lp->rm9200_tx_len) & 1;
>                         dev_consume_skb_irq(lp->rm9200_txq[desc].skb);
>                         lp->rm9200_txq[desc].skb = NULL;
>                         dma_unmap_single(&lp->pdev->dev, lp->rm9200_txq[desc].mapping,
>                                          lp->rm9200_txq[desc].size, DMA_TO_DEVICE);
>                         dev->stats.tx_packets++;
>                         dev->stats.tx_bytes += lp->rm9200_txq[desc].size;
> +                       lp->rm9200_tx_len--;
>                 }
> -               netif_wake_queue(dev);
> +
> +               if (lp->rm9200_tx_len < 2 && netif_queue_stopped(dev))
> +                       netif_wake_queue(dev);
> +
> +               spin_unlock(&lp->lock);
>         }
> 
>         /* Work-around for EMAC Errata section 41.3.1 */
> --
> 2.28.0
> 

Powered by blists - more mailing lists