Message-ID: <50F7C49B.5040304@ti.com>
Date: Thu, 17 Jan 2013 15:00:03 +0530
From: Mugunthan V N <mugunthanvnm@...com>
To: Christian Riesch <christian.riesch@...cron.at>
CC: <netdev@...r.kernel.org>, <davem@...emloft.net>,
<s.hauer@...gutronix.de>
Subject: Re: [PATCH v4 1/1] net: ethernet: davinci_cpdma: Add boundary for
rx and tx descriptors
Christian
On 1/17/2013 2:07 PM, Christian Riesch wrote:
> Hi,
>
> On 2013-01-17 01:10, Mugunthan V N wrote:
>> When there is heavy transmission traffic in the CPDMA, the rx
>> descriptor memory is also utilized as tx descriptor memory, so all
>> rx descriptors are lost and the driver stops working.
>>
>> This patch adds a boundary between the tx and rx descriptors in the
>> BD RAM, dividing the descriptor memory to ensure that during heavy
>> transmission tx doesn't use rx descriptors.
>>
>> This fix is already present in the davinci_emac driver; since CPSW
>> and davinci_emac share the same CPDMA, this patch moves the boundary
>> separation from the davinci_emac driver into the CPDMA driver. The
>> original fix was made in the following commit:
>> commit 86d8c07ff2448eb4e860e50f34ef6ee78e45c40c
>> Author: Sascha Hauer <s.hauer@...gutronix.de>
>> Date: Tue Jan 3 05:27:47 2012 +0000
>>
>> net/davinci: do not use all descriptors for tx packets
>>
>> The driver uses a shared pool for both rx and tx descriptors.
>> During open it queues a fixed number of 128 descriptors for
>> receive packets. For each received packet it tries to queue
>> another descriptor. If this fails, the descriptor is lost for rx.
>> The driver has no limit on the number of tx descriptors it may
>> use, so during an nmap / ping -f attack it can allocate all
>> descriptors for tx and lose all rx descriptors. The driver then
>> stops working. To fix this, limit the number of tx descriptors
>> used to half of the available descriptors; the rx path uses the
>> other half.
>>
>> Tested on a custom board using nmap / ping -f to the board from
>> two different hosts.
>>
>> Signed-off-by: Mugunthan V N <mugunthanvnm@...com>
>> ---
>> Changes from v3
>> * Changed multiline commenting as per networking style
>>
>> drivers/net/ethernet/ti/cpsw.c | 9 ++++++
>> drivers/net/ethernet/ti/davinci_cpdma.c | 51 ++++++++++++++++++++++++++-----
>> drivers/net/ethernet/ti/davinci_cpdma.h | 1 +
>> drivers/net/ethernet/ti/davinci_emac.c | 13 ++++----
>> 4 files changed, 61 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
>> index 3772804..b35e6a7 100644
>> --- a/drivers/net/ethernet/ti/cpsw.c
>> +++ b/drivers/net/ethernet/ti/cpsw.c
>> @@ -374,6 +374,9 @@ void cpsw_tx_handler(void *token, int len, int status)
>> struct net_device *ndev = skb->dev;
>> struct cpsw_priv *priv = netdev_priv(ndev);
>>
>> + /* Check whether the queue is stopped due to stalled tx dma, if the
>> + * queue is stopped then start the queue as we have free desc for tx
>> + */
>> if (unlikely(netif_queue_stopped(ndev)))
>> netif_start_queue(ndev);
>> cpts_tx_timestamp(&priv->cpts, skb);
>> @@ -736,6 +739,12 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
>> goto fail;
>> }
>>
>> + /* If there is no more tx desc left free then we need to
>> + * tell the kernel to stop sending us tx frames.
>> + */
>> + if (unlikely(cpdma_check_free_tx_desc(priv->txch)))
>> + netif_stop_queue(ndev);
>> +
>> return NETDEV_TX_OK;
>> fail:
>> priv->stats.tx_dropped++;
>> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
>> index 4995673..4f6d82f 100644
>> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
>> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
>> @@ -105,13 +105,13 @@ struct cpdma_ctlr {
>> };
>>
>> struct cpdma_chan {
>> + struct cpdma_desc __iomem *head, *tail;
>> + void __iomem *hdp, *cp, *rxfree;
>> enum cpdma_state state;
>> struct cpdma_ctlr *ctlr;
>> int chan_num;
>> spinlock_t lock;
>> - struct cpdma_desc __iomem *head, *tail;
>> int count;
>> - void __iomem *hdp, *cp, *rxfree;
>
> I still do not see any benefit from this rearranging here.
This is just code cleanup; it doesn't provide any functional benefit.
>
>> u32 mask;
>> cpdma_handler_fn handler;
>> enum dma_data_direction dir;
>> @@ -217,17 +217,29 @@ desc_from_phys(struct cpdma_desc_pool *pool, dma_addr_t dma)
>> }
>>
>> static struct cpdma_desc __iomem *
>> -cpdma_desc_alloc(struct cpdma_desc_pool *pool, int num_desc)
>> +cpdma_desc_alloc(struct cpdma_desc_pool *pool, int num_desc, bool is_rx)
>> {
>> unsigned long flags;
>> int index;
>> + int desc_start;
>> + int desc_end;
>> struct cpdma_desc __iomem *desc = NULL;
>>
>> spin_lock_irqsave(&pool->lock, flags);
>>
>> - index = bitmap_find_next_zero_area(pool->bitmap, pool->num_desc, 0,
>> - num_desc, 0);
>> - if (index < pool->num_desc) {
>> + if (is_rx) {
>> + desc_start = 0;
>> + desc_end = pool->num_desc/2;
>> + index = bitmap_find_next_zero_area(pool->bitmap,
>> + pool->num_desc/2, 0, num_desc, 0);
>
> Duplicate calls of bitmap_find_next_zero_area, see a few lines below.
Will fix this in the next patch version.
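Something like this (untested sketch; it is just the existing second
call above, with the if/else reduced to selecting the search window):

	if (is_rx) {
		desc_start = 0;
		desc_end = pool->num_desc/2;
	} else {
		desc_start = pool->num_desc/2;
		desc_end = pool->num_desc;
	}

	/* single search over the chosen half of the bitmap */
	index = bitmap_find_next_zero_area(pool->bitmap,
			desc_end, desc_start, num_desc, 0);
	if (index < desc_end) {
		bitmap_set(pool->bitmap, index, num_desc);
		desc = pool->iomap + pool->desc_size * index;
		pool->used_desc++;
	}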
>
>> + } else {
>> + desc_start = pool->num_desc/2;
>> + desc_end = pool->num_desc;
>> + }
>> +
>> + index = bitmap_find_next_zero_area(pool->bitmap,
>> + desc_end, desc_start, num_desc, 0);
>> + if (index < desc_end) {
>> bitmap_set(pool->bitmap, index, num_desc);
>> desc = pool->iomap + pool->desc_size * index;
>> pool->used_desc++;
>> @@ -660,6 +672,7 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
>> unsigned long flags;
>> u32 mode;
>> int ret = 0;
>> + bool is_rx;
>>
>> spin_lock_irqsave(&chan->lock, flags);
>>
>> @@ -668,7 +681,8 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
>> goto unlock_ret;
>> }
>>
>> - desc = cpdma_desc_alloc(ctlr->pool, 1);
>> + is_rx = (chan->rxfree != 0);
>
> I guess you should use is_rx_chan() here.
Will change this to use the is_rx_chan() macro.
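i.e. something like (sketch):

	/* derive the pool half from the channel direction instead of
	 * peeking at the rxfree register pointer
	 */
	desc = cpdma_desc_alloc(ctlr->pool, 1, is_rx_chan(chan));

which also makes the local is_rx variable unnecessary.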
>
>> + desc = cpdma_desc_alloc(ctlr->pool, 1, is_rx);
>> if (!desc) {
>> chan->stats.desc_alloc_fail++;
>> ret = -ENOMEM;
>> @@ -704,6 +718,29 @@ unlock_ret:
>> }
>> EXPORT_SYMBOL_GPL(cpdma_chan_submit);
>>
>> +bool cpdma_check_free_tx_desc(struct cpdma_chan *chan)
>> +{
>> + unsigned long flags;
>> + int index;
>> + bool ret;
>> + struct cpdma_ctlr *ctlr = chan->ctlr;
>> + struct cpdma_desc_pool *pool = ctlr->pool;
>> +
>> + spin_lock_irqsave(&pool->lock, flags);
>> +
>> + index = bitmap_find_next_zero_area(pool->bitmap,
>> + pool->num_desc, pool->num_desc/2, 1, 0);
>
> I still wonder if using two separate descriptor pools for rx and tx
> would be a nicer implementation. In this case you could just use
> bitmap_full here, right? And the two pools would have separate
> spinlocks, so rx could not lock tx and vice versa (but as I wrote in
> an earlier email, I am a newbie here, so I don't know if this would be
> beneficial).
>
I don't think having two separate pools with separate locks will improve
performance; I have not seen any issues using one pool with one lock.
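For reference, with a dedicated tx pool the check would indeed collapse
to a single bitmap_full() test, roughly like this (untested sketch; the
separate tx_pool field is hypothetical, and I'm assuming the function
keeps returning true when the pool is exhausted, as the call site in
cpsw_ndo_start_xmit() above expects):

	bool cpdma_check_free_tx_desc(struct cpdma_chan *chan)
	{
		struct cpdma_desc_pool *pool = chan->ctlr->tx_pool;
		unsigned long flags;
		bool full;

		spin_lock_irqsave(&pool->lock, flags);
		/* pool is exhausted exactly when every bit is set */
		full = bitmap_full(pool->bitmap, pool->num_desc);
		spin_unlock_irqrestore(&pool->lock, flags);

		return full;
	}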
Regards
Mugunthan V N