netdev - Re: [RFC] Kernel unaligned access at __skb_flow

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALx6S37+xhG-CYDUYqzXcS3FBW+Ta_kjQv6CgYKA2XdQLpHsfQ@mail.gmail.com>
Date:	Sun, 31 Jan 2016 14:13:49 -0800
From:	Tom Herbert <tom@...bertland.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Sowmini Varadhan <sowmini.varadhan@...cle.com>,
	Michael Dalton <mwdalton@...gle.com>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [RFC] Kernel unaligned access at __skb_flow_dissect

On Fri, Jan 29, 2016 at 6:15 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Fri, 2016-01-29 at 16:47 -0800, Tom Herbert wrote:
>
>> Hmm, thanks for pointing that out. It looks like this is called by a
>> couple drivers as part of pulling up the data to get alignment. I am
>> actually surprised they are doing this. Flow dissector is not the
>> cheapest function in the world and it seems like a shame to be calling
>> it this early and not even getting a hash for the skb. Also, this is
>> another instance of touching the data so if we do get to the point of
>> trying steering packets without cache miss (another thread); this
>> undermines that. Seems like it might be just as well to pull up a
>> fixed number of bytes (ixgbe want min of 60 bytes), or don't pull up
>> anything avoid the cache miss here and let the stack pull up as
>> needed.
>
> Except that when for example mlx4 is able to pull only the headers
> instead of a fixed amount of bytes, we increased the performance,
> because GRO could cook packets twice as big for tunneled traffic.
>
> ( commit cfecec56ae7c7c40f23fbdac04acee027ca3bd66 )
>
> 1) Increasing performance by prefetching headers is a separate matter,
> completely orthogonal to this.
>
> 2) Taking the opportunity to also give back the l4 hash as a bonus
> has been considered, but tests showed no clear benefit.
>
>   For typical cases where l4 hash is not used (RPS and RFS being not
> enabled by default on linux), computing it brings nothing but added
> complexity.
>
Neither is GRE enabled by default in Linux and it is not a typical
case. So that patch is an optimization for a very narrow use case that
impacts the core data path for everyone. Please at least consider
making it configurable.

>
>