netdev - Re: [net-next RFC PATCH 5/5] virtio-net: flow director support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4EDDC35C.2070100@redhat.com>
Date:	Tue, 06 Dec 2011 15:25:16 +0800
From:	Jason Wang <jasowang@...hat.com>
To:	Ben Hutchings <bhutchings@...arflare.com>
CC:	krkumar2@...ibm.com, kvm@...r.kernel.org, mst@...hat.com,
	netdev@...r.kernel.org, rusty@...tcorp.com.au,
	virtualization@...ts.linux-foundation.org, levinsasha928@...il.com
Subject: Re: [net-next RFC PATCH 5/5] virtio-net: flow director support

On 12/06/2011 04:42 AM, Ben Hutchings wrote:
> On Mon, 2011-12-05 at 16:59 +0800, Jason Wang wrote:
>> In order to let the packets of a flow to be passed to the desired
>> guest cpu, we can co-operate with devices through programming the flow
>> director which was just a hash to queue table.
>>
>> This kinds of co-operation is done through the accelerate RFS support,
>> a device specific flow sterring method virtnet_fd() is used to modify
>> the flow director based on rfs mapping. The desired queue were
>> calculated through reverse mapping of the irq affinity table. In order
>> to parallelize the ingress path, irq affinity of rx queue were also
>> provides by the driver.
>>
>> In addition to accelerate RFS, we can also use the guest scheduler to
>> balance the load of TX and reduce the lock contention on egress path,
>> so the processor_id() were used to tx queue selection.
> [...]
>> +#ifdef CONFIG_RFS_ACCEL
>> +
>> +int virtnet_fd(struct net_device *net_dev, const struct sk_buff *skb,
>> +	       u16 rxq_index, u32 flow_id)
>> +{
>> +	struct virtnet_info *vi = netdev_priv(net_dev);
>> +	u16 *table = NULL;
>> +
>> +	if (skb->protocol != htons(ETH_P_IP) || !skb->rxhash)
>> +		return -EPROTONOSUPPORT;
> Why only IPv4?

Oops, IPv6 should work also.
>> +	table = kmap_atomic(vi->fd_page);
>> +	table[skb->rxhash&  TAP_HASH_MASK] = rxq_index;
>> +	kunmap_atomic(table);
>> +
>> +	return 0;
>> +}
>> +#endif
> This is not a proper implementation of ndo_rx_flow_steer.  If you steer
> a flow by changing the RSS table this can easily cause packet reordering
> in other flows.  The filtering should be more precise, ideally matching
> exactly a single flow by e.g. VID and IP 5-tuple.
>
> I think you need to add a second hash table which records exactly which
> flow is supposed to be steered.  Also, you must call
> rps_may_expire_flow() to check whether an entry in this table may be
> replaced; otherwise you can cause packet reordering in the flow that was
> previously being steered.
>
> Finally, this function must return the table index it assigned, so that
> rps_may_expire_flow() works.

Thanks for the explanation, how about document this briefly in scaling.txt?
>> +static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>> +{
>> +	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> +					       smp_processor_id();
>> +
>> +	/* As we make use of the accelerate rfs which let the scheduler to
>> +	 * balance the load, it make sense to choose the tx queue also based on
>> +	 * theprocessor id?
>> +	 */
>> +	while (unlikely(txq>= dev->real_num_tx_queues))
>> +		txq -= dev->real_num_tx_queues;
>> +	return txq;
>> +}
> [...]
>
> Don't do this, let XPS handle it.
>
> Ben.
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html