Message-ID: <4ADE2A24.6080300@gmail.com>
Date: Tue, 20 Oct 2009 23:22:44 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ben Greear <greearb@...delatech.com>
CC: NetDev <netdev@...r.kernel.org>, robert@...julf.net,
"David S. Miller" <davem@...emloft.net>
Subject: Re: pktgen and spin_lock_bh in xmit path
Ben Greear wrote:
> On 10/20/2009 11:54 AM, Eric Dumazet wrote:
>
>> - queue_map = skb_get_queue_mapping(pkt_dev->skb);
>> + queue_map = pkt_dev->cur_queue_map;
>> + /*
>> + * tells skb_tx_hash() to use this tx queue.
>> + * We should reset skb->mapping before each xmit() because
>> + * xmit() might change it.
>> + */
>> + skb_record_rx_queue(pkt_dev->skb, queue_map);
>> txq = netdev_get_tx_queue(odev, queue_map);
>
> I think that must be wrong. The record_rx_queue sets it to queue_map + 1,
> but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes the
> skb->queue_map and uses it as an index with no subtraction.
Yes, but check the net/core/dev.c code I quoted in my previous mail:
skb->queue_mapping is changed if the skb goes through dev_queue_xmit()
(as macvlan does).
u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
{
	u32 hash;

	if (skb_rx_queue_recorded(skb)) {
		hash = skb_get_rx_queue(skb);
		while (unlikely(hash >= dev->real_num_tx_queues))
			hash -= dev->real_num_tx_queues;
		return hash;
	}

	if (skb->sk && skb->sk->sk_hash)
		hash = skb->sk->sk_hash;
	else
		hash = skb->protocol;

	hash = jhash_1word(hash, skb_tx_hashrnd);

	return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
}
EXPORT_SYMBOL(skb_tx_hash);

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
					struct sk_buff *skb)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	u16 queue_index = 0;

	if (ops->ndo_select_queue)
		queue_index = ops->ndo_select_queue(dev, skb);
	else if (dev->real_num_tx_queues > 1)
		queue_index = skb_tx_hash(dev, skb);

	skb_set_queue_mapping(skb, queue_index);
	return netdev_get_tx_queue(dev, queue_index);
}
So if skb->queue_mapping was X+1 before entering dev_pick_tx() (with
X < real_num_tx_queues), it is X when leaving dev_pick_tx().
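
To make the X+1 -> X step concrete, here is a small userspace sketch of the recorded-queue path. It is only a model, not kernel code: the model_* names and struct model_skb are made up for illustration, and the one-off bias is what the skb_record_rx_queue() / skb_get_rx_queue() helpers are understood to apply. The hashing fallback and ndo_select_queue are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the single sk_buff field discussed here. */
struct model_skb {
	uint16_t queue_mapping;
};

/* Models skb_record_rx_queue(): the queue is stored biased by one. */
static inline void model_record_rx_queue(struct model_skb *skb, uint16_t rx_queue)
{
	skb->queue_mapping = rx_queue + 1;
}

/* Models skb_rx_queue_recorded(): zero means "nothing recorded". */
static inline int model_rx_queue_recorded(const struct model_skb *skb)
{
	return skb->queue_mapping != 0;
}

/* Models skb_get_rx_queue(): the bias is removed again. */
static inline uint16_t model_get_rx_queue(const struct model_skb *skb)
{
	return skb->queue_mapping - 1;
}

/*
 * Models the recorded-queue branch of skb_tx_hash() followed by the
 * skb_set_queue_mapping() call in dev_pick_tx(). Assumption: when no
 * queue was recorded this sketch simply picks queue 0.
 */
static inline uint16_t model_pick_tx(struct model_skb *skb, uint16_t real_num_tx_queues)
{
	uint32_t index = 0;

	assert(real_num_tx_queues > 0);	/* the wrap loop needs a non-zero count */

	if (model_rx_queue_recorded(skb)) {
		index = model_get_rx_queue(skb);
		while (index >= real_num_tx_queues)
			index -= real_num_tx_queues;
	}

	skb->queue_mapping = (uint16_t)index;	/* now unbiased */
	return (uint16_t)index;
}
```

Calling model_record_rx_queue(&skb, 1) leaves queue_mapping at 2, and a following model_pick_tx(&skb, 4) returns 1 and leaves queue_mapping at 1.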
>
> This causes watchdog timeouts because we are calling txq_trans_update in
> pktgen on
> queue 0, for instance, but sending pkts on queue 1. If queue 1 is ever
> busy when the WD fires, link is reset.
>
The problem is not pktgen, IMHO.
The problem is that skb->queue_mapping has a different meaning when the skb is given directly to a real device's start_xmit().
(In this case skb->queue_mapping should be in [0 .. real_num_tx_queues-1].)
But if it goes through dev_queue_xmit(), it should be set in [1 .. real_num_tx_queues], because
dev_pick_tx() will decrement skb->queue_mapping.
In fact skb->queue_mapping only works for forwarded packets, not locally generated ones.
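
A rough userspace sketch of the direct start_xmit() case, to show where the mismatch comes from. Everything named toy_* is hypothetical, and the "driver" here just uses the raw field as its ring index, as Ben describes for ixgbe; it is not actual driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the single sk_buff field discussed here. */
struct toy_skb {
	uint16_t queue_mapping;
};

/* Same one-off bias as skb_record_rx_queue(). */
static inline void toy_record_rx_queue(struct toy_skb *skb, uint16_t queue)
{
	skb->queue_mapping = queue + 1;
}

/* Assumed driver behaviour: the raw field is used as the TX ring index. */
static inline uint16_t toy_driver_tx_ring(const struct toy_skb *skb)
{
	return skb->queue_mapping;
}

/*
 * The caller wants queue `wanted`, records it with the biased helper and
 * hands the skb straight to the driver, skipping dev_pick_tx(). Returns
 * the ring the driver would really use.
 */
static inline uint16_t toy_direct_xmit_ring(uint16_t wanted)
{
	struct toy_skb skb = { 0 };

	assert(wanted < UINT16_MAX);	/* keep the +1 bias from wrapping */

	toy_record_rx_queue(&skb, wanted);
	return toy_driver_tx_ring(&skb);
}
```

With wanted = 0 the caller does its txq bookkeeping on queue 0 while the toy driver transmits on ring 1, which is the situation behind the watchdog report.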
I am too tired to cook a fix at this moment, sorry :(