Message-ID: <1357845142.2712.11.camel@bwh-desktop.uk.solarflarecom.com>
Date: Thu, 10 Jan 2013 19:12:22 +0000
From: Ben Hutchings <bhutchings@...arflare.com>
To: Rusty Russell <rusty@...tcorp.com.au>
CC: Wanlong Gao <gaowanlong@...fujitsu.com>,
<linux-kernel@...r.kernel.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Eric Dumazet <erdnetdev@...il.com>,
<virtualization@...ts.linux-foundation.org>,
<netdev@...r.kernel.org>
Subject: Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

On Thu, 2013-01-10 at 11:19 +1030, Rusty Russell wrote:
> Wanlong Gao <gaowanlong@...fujitsu.com> writes:
> > On 01/09/2013 07:31 AM, Rusty Russell wrote:
> >> Wanlong Gao <gaowanlong@...fujitsu.com> writes:
> >>>  */
> >>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
> >>>  {
> >>> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> >>> -		  smp_processor_id();
> >>> +	int txq = 0;
> >>> +
> >>> +	if (skb_rx_queue_recorded(skb))
> >>> +		txq = skb_get_rx_queue(skb);
> >>> +	else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
> >>> +		txq = 0;
> >>
> >> You should use __get_cpu_var() instead of smp_processor_id() here, ie:
> >>
> >> 	else if ((txq = __get_cpu_var(vq_index)) == -1)
> >>
> >> And AFAICT, no reason to initialize txq to 0 to start with.
> >>
> >> So:
> >>
> >> 	int txq;
> >>
> >> 	if (skb_rx_queue_recorded(skb))
> >> 		txq = skb_get_rx_queue(skb);
> >> 	else {
> >> 		txq = __get_cpu_var(vq_index);
> >> 		if (txq == -1)
> >> 			txq = 0;
> >> 	}
> >
> > Got it, thank you.
> >
> >>
> >> Now, just to confirm, I assume this can happen even if we use vq_index,
> >> right, because of races with virtnet_set_channels?
> >
> > I still can't understand this race, could you explain more? thank you.
>
> I assume that someone can call virtnet_set_channels() while we are
> inside virtnet_select_queue(), so they reduce dev->real_num_tx_queues,
> causing virtnet_select_queue() to do:
>
> 	while (unlikely(txq >= dev->real_num_tx_queues))
> 		txq -= dev->real_num_tx_queues;
>
> Otherwise, when is this loop called?

In fact, this race can result in the TX scheduler using a queue that has
been disabled, or other weirdness (consider what happens if
real_num_tx_queues increases between those two uses).

virtnet_set_channels() really must disable TX temporarily:

	netif_tx_lock(dev);
	netif_device_detach(dev);
	netif_tx_unlock(dev);
	...
	netif_device_attach(dev);

Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
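
To make the "increases between those two uses" case concrete, here is a
small user-space sketch (not kernel code) of the wrap loop quoted in the
thread. It assumes the loop reads real_num_tx_queues once for the
comparison and once for the subtraction; the function and parameter
names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Model of ONE iteration of:
 *     while (txq >= dev->real_num_tx_queues)
 *             txq -= dev->real_num_tx_queues;
 * n_at_compare and n_at_subtract stand for the two separate reads of
 * real_num_tx_queues (hypothetical names, not kernel fields).
 */
static int wrap_step_two_reads(int txq, unsigned int n_at_compare,
                               unsigned int n_at_subtract)
{
	if ((unsigned int)txq >= n_at_compare)	/* first read */
		txq -= (int)n_at_subtract;	/* second read */
	return txq;
}

/*
 * The same loop with the count read once into a local snapshot.
 * Only meaningful for txq >= 0 and n > 0.
 */
static int wrap_with_snapshot(int txq, unsigned int n)
{
	while ((unsigned int)txq >= n)
		txq -= (int)n;
	return txq;
}

/* Prints the stable case, the raced case, and the snapshot variant. */
static void demo_race(void)
{
	int stable = wrap_step_two_reads(5, 4, 4); /* count stays at 4 */
	int raced = wrap_step_two_reads(5, 4, 8);  /* count grows 4 -> 8 */

	assert(stable == 1);
	assert(raced == -3);
	printf("stable=%d raced=%d raced as u16=%u snapshot=%d\n",
	       stable, raced, (unsigned int)(uint16_t)raced,
	       wrap_with_snapshot(5, 4));
}
```

In this model the raced step leaves txq at -3, which becomes 65533 once
it is truncated to the u16 that a select_queue callback returns. Reading
the count once keeps the result below the snapshot, but it does not help
if that queue has been disabled in the meantime, which is why the
detach/attach bracket above is suggested instead.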