Message-ID: <CAJ3xEMjMg9fZGk-=F_nvx2Ta5z8fj9nrEEAm+xWYSm-hyUDDYg@mail.gmail.com>
Date: Fri, 22 Jan 2016 00:45:13 +0200
From: Or Gerlitz <gerlitz.or@...il.com>
To: David Miller <davem@...emloft.net>
Cc: Jesper Dangaard Brouer <brouer@...hat.com>,
Tom Herbert <tom@...bertland.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
Linux Netdev List <netdev@...r.kernel.org>,
Alexander Duyck <alexander.duyck@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <borkmann@...earbox.net>,
Marek Majkowski <marek@...udflare.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Florian Westphal <fw@...len.de>,
Paolo Abeni <pabeni@...hat.com>,
John Fastabend <john.r.fastabend@...el.com>,
Amir Vadai <amirva@...il.com>,
Matan Barak <matanb@...lanox.com>
Subject: Re: Optimizing instruction-cache, more packets at each stage
On Thu, Jan 21, 2016 at 8:56 PM, David Miller <davem@...emloft.net> wrote:
> From: Or Gerlitz <gerlitz.or@...il.com>
> Date: Thu, 21 Jan 2016 14:49:25 +0200
>
>> On Thu, Jan 21, 2016 at 1:27 PM, Jesper Dangaard Brouer
>> <brouer@...hat.com> wrote:
>>> On Wed, 20 Jan 2016 15:27:38 -0800 Tom Herbert <tom@...bertland.com> wrote:
>>>
>>>> eth_type_trans touches headers
>>>
>>> True, the eth_type_trans() call in the driver is a major bottleneck,
>>> because it touches the packet header and happens very early in the driver.
>>>
>>> In my experiments, I extract several packets before calling
>>> napi_gro_receive(), and I also delay calling eth_type_trans(). Most of
>>> my speedup comes from this trick, as the prefetch() now has enough
>>> time to complete.
>>>
>>> while ((skb = __skb_dequeue(&rx_skb_list)) != NULL) {
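>>>         /* headers were prefetched while the list was being built,
>>>          * so eth_type_trans() no longer stalls on its first access */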
>>>         skb->protocol = eth_type_trans(skb, rq->netdev);
>>>         napi_gro_receive(cq->napi, skb);
>>> }
>>>
>>> What if the HW could provide the info we need in the descriptor?!?
>>>
>>>
>>> eth_type_trans() does two things:
>>>
>>> 1) determine skb->protocol
>>> 2) set skb->pkt_type = PACKET_{BROADCAST,MULTICAST,OTHERHOST}
>>>
>>> Could the HW descriptor deliver the "proto", or perhaps just some bits
>>> for the most common protos?
>>>
>>> The skb->pkt_type doesn't need many bits, and I bet the HW already has
>>> the information. The BROADCAST and MULTICAST indications are easy. The
>>> PACKET_OTHERHOST case can be turned around by instead setting a PACKET_HOST
>>> indication if the eth->h_dest matches the device's dev->dev_addr (else a
>>> SW compare is required).
>>>
>>> Is that doable in hardware?
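
As a strawman of what the driver side would look like if it were: a minimal
sketch, not taken from any existing driver. The HYP_DESC_* bits are made up
and only stand in for state the NIC already has from its own DMAC filtering;
the helper is the skb->pkt_type half of eth_type_trans() done without ever
reading eth->h_dest.

    #include <linux/bits.h>
    #include <linux/if_packet.h>
    #include <linux/skbuff.h>

    /* Made-up descriptor bits -- they exist in no driver today, and only
     * stand in for state the NIC already has from its DMAC filtering. */
    #define HYP_DESC_BCAST      BIT(0)  /* h_dest was ff:ff:ff:ff:ff:ff */
    #define HYP_DESC_MCAST      BIT(1)  /* multicast bit set in h_dest  */
    #define HYP_DESC_DMAC_MATCH BIT(2)  /* h_dest == dev->dev_addr      */

    /* The skb->pkt_type half of eth_type_trans(), driven by descriptor
     * bits instead of a read of eth->h_dest (no header cache miss). */
    static void rx_set_pkt_type(struct sk_buff *skb, u8 desc_flags)
    {
            if (desc_flags & HYP_DESC_BCAST)
                    skb->pkt_type = PACKET_BROADCAST;
            else if (desc_flags & HYP_DESC_MCAST)
                    skb->pkt_type = PACKET_MULTICAST;
            else if (desc_flags & HYP_DESC_DMAC_MATCH)
                    skb->pkt_type = PACKET_HOST;
            else
                    skb->pkt_type = PACKET_OTHERHOST; /* unicast to another host */
    }

If the HW could only flag broadcast/multicast, the PACKET_HOST vs
PACKET_OTHERHOST split would still need the ether_addr_equal(eth->h_dest,
dev->dev_addr) compare in software, which is exactly the header touch we
want to avoid.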
>>
>> As I wrote earlier, for determining the eth-type, HWs can do what you ask
>> here and more.
>>
>> Whether the protocol is IP or not (and only then do you look in the data)
>> you could, I guess, get from many NICs, e.g. if the NIC sets PKT_HASH_TYPE_L4
>> or PKT_HASH_TYPE_L3 then we know it's an IP packet, and only if
>> we don't see this indication do we look into the data.
>
> This doesn't differentiate ipv4 vs. ipv6, which is critical here, so this
> mechanism is not sufficient.
Dave, at least in the ConnectX-4 (mlx5e driver), as I commented earlier
on this thread, we can use programmed tags reported by the HW on packet
completion to tell whether the ethtype is ipv4, ipv6, or
something else, and let the kernel look into the packet memory only in
the last case.
> We must know the exact ETH_P_* value.
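
To make that branch concrete, a minimal sketch follows. It is not mlx5e
code: HYP_CQE_TAG_* and rx_set_protocol() are assumed names standing in for
a tag the HW would be programmed to report in the completion entry. The
point is that such a tag maps to the exact ETH_P_* value, so only the
"something else" case reads the Ethernet header.

    #include <linux/etherdevice.h>
    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    /* Assumed tag values -- a stand-in for a flow/steering tag the HW
     * would report in the completion entry, not mlx5e definitions. */
    #define HYP_CQE_TAG_IPV4    1
    #define HYP_CQE_TAG_IPV6    2

    /* Replaces the eth_type_trans() call in the RX loop: the exact
     * ETH_P_* value comes from the completion tag, and only the
     * "something else" case falls back to parsing the header. */
    static void rx_set_protocol(struct sk_buff *skb, u32 cqe_tag,
                                struct net_device *netdev)
    {
            __be16 proto;

            switch (cqe_tag) {
            case HYP_CQE_TAG_IPV4:
                    proto = htons(ETH_P_IP);
                    break;
            case HYP_CQE_TAG_IPV6:
                    proto = htons(ETH_P_IPV6);
                    break;
            default:
                    skb->protocol = eth_type_trans(skb, netdev);
                    return;
            }

            /* the non-header bookkeeping eth_type_trans() would have done
             * (skb->pkt_type would need a descriptor hint too, not shown) */
            skb->dev = netdev;
            skb_reset_mac_header(skb);
            __skb_pull(skb, ETH_HLEN);
            skb->protocol = proto;
    }

Dropped into the while() loop quoted earlier in the thread, the
skb->protocol = eth_type_trans(skb, rq->netdev) line would become
rx_set_protocol(skb, cqe_tag, rq->netdev), with cqe_tag carried over from
the completion that produced the skb.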