[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 04 Apr 2011 08:54:40 +0200
From: Nicolas de Pesloüan
<nicolas.2p.debian@...il.com>
To: Jesse Gross <jesse@...nel.org>
CC: Jiri Pirko <jpirko@...hat.com>, netdev@...r.kernel.org,
davem@...emloft.net, shemminger@...ux-foundation.org,
kaber@...sh.net, fubar@...ibm.com, eric.dumazet@...il.com,
andy@...yhouse.net, xiaosuo@...il.com,
"Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [patch net-next-2.6] net: vlan: make non-hw-accel rx path similar
to hw-accel
Le 03/04/2011 22:38, Jesse Gross a écrit :
> On Sun, Apr 3, 2011 at 8:23 AM, Nicolas de Pesloüan
> <nicolas.2p.debian@...il.com> wrote:
>> Le 02/04/2011 12:26, Jiri Pirko a écrit :
>>>
>>> Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
>>> enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
>>> vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.
>>>
>>> For non-rx-vlan-hw-accel however, tagged skb goes thru whole
>>> __netif_receive_skb, it's untagged in ptype_base hander and reinjected
>>>
>>> This incosistency is fixed by this patch. Vlan untagging happens early in
>>> __netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
>>> see the skb like it was untagged by hw.
>>>
>>> Signed-off-by: Jiri Pirko<jpirko@...hat.com>
>
> You saw Eric B.'s recent patch trying to tackle the same issues, right?:
> http://permalink.gmane.org/gmane.linux.network/190229
Yes, of course I saw it.
>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>> index 3da9fb0..bfe9fce 100644
>>> --- a/net/core/dev.c
>>> +++ b/net/core/dev.c
>>> @@ -3130,6 +3130,12 @@ another_round:
>>>
>>> __this_cpu_inc(softnet_data.processed);
>>>
>>> + if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) {
>>> + skb = vlan_untag(skb);
>>> + if (unlikely(!skb))
>>> + goto out;
>>> + }
>>> +
>>
>> I like the general idea of this patch, but I don't like the idea of
>> re-inserting specific code inside __netif_receive_skb.
>>
>> You made a great work removing most - if not all - device specific parts
>> from __netif_receive_skb, by introducing rx_handler.
>>
>> I think the above part (and vlan_untag) should be moved to a vlan_rx_handler
>> that would be set on the net_devices that are the parent of a vlan
>> net_device and are NOT hwaccel.
>>
>> vlan_rx_handler would return RX_HANDLER_ANOTHER if skb holds a tagged frame
>> (skb->dev changed) and RX_HANDLER_PASS if skb holds an untagged frame
>> (skb->dev unchanged).
>
> It would be nice to merge all of this together. One complication is
> the interaction of bridging and vlan on the same device. Some people
> want to have a bridge for each vlan and a bridge for untagged packets.
> On older kernels with vlan accelerated hardware this was possible
> because vlan devices would get packets before bridging and on current
> kernels it is possible with ebtables rules. If we use rx_handler for
> both I believe we would need to extend it some to allow multiple
> handlers.
I totally agree.
Remember that Jiri's original proposal (last summer) was to have several rx_handlers per net_device.
I still think we need several of them, because the network stack need to be generic and allow for
any complex stacking setup. The rx_handler framework may need to be enhanced for that, but I think
it is the right tool to do all those per net_device specific features.
>> This would also cause protocol handlers to receive the untouched (tagged)
>> frame, if no setup required the frame to be untagged, which I think is the
>> right thing to do.
>
> At the very least we need to make sure that these packets are marked
> as PACKET_OTHERHOST because protocol handlers don't pay attention to
> the vlan field.
Agreed.
>>> @@ -3177,7 +3183,7 @@ ncls:
>>> ret = deliver_skb(skb, pt_prev, orig_dev);
>>> pt_prev = NULL;
>>> }
>>> - if (vlan_hwaccel_do_receive(&skb)) {
>>> + if (vlan_do_receive(&skb)) {
>>> ret = __netif_receive_skb(skb);
>>> goto out;
>>> } else if (unlikely(!skb))
>>
>> Why are you calling __netif_receive_skb here? Can't we simply goto
>> another_round?
>
> This code (other than the name change) predates the
> another_round/rx_handler changes.
Yes, you are right. Let's keep this for a possible follow-up patch, to avoid skb reinjection when it
is not strictly necessary.
Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists