Message-Id: <20190808.182259.1921801896274965443.davem@davemloft.net>
Date: Thu, 08 Aug 2019 18:22:59 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: ecree@...arflare.com
Cc: netdev@...r.kernel.org, eric.dumazet@...il.com,
linux-net-drivers@...arflare.com
Subject: Re: [PATCH v3 net-next 0/3] net: batched receive in GRO path
From: Edward Cree <ecree@...arflare.com>
Date: Tue, 6 Aug 2019 14:52:06 +0100
> This series listifies part of GRO processing, in a manner which allows those
> packets which are not GROed (i.e. for which dev_gro_receive returns
> GRO_NORMAL) to be passed on to the listified regular receive path.
> dev_gro_receive() itself is not listified, nor is the per-protocol GRO
> callback, since GRO's need to hold packets on lists under napi->gro_hash
> makes keeping the packets on other lists awkward, and since the GRO control
> block state of held skbs can refer only to one 'new' skb at a time.
> Instead, when napi_frags_finish() handles a GRO_NORMAL result, the skb is
> stashed onto a list in the napi struct; that list is passed up the regular
> receive path at the end of the napi poll, or as soon as its length exceeds
> the (new) sysctl net.core.gro_normal_batch.
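[ A minimal sketch of the batching described above, in net/core/dev.c
  context; the helper names and the napi fields (rx_list, rx_count) are
  illustrative assumptions rather than quoted from the patches: ]

/* Tunable via the (new) sysctl net.core.gro_normal_batch. */
int gro_normal_batch __read_mostly = 8;

/* Pass the currently batched GRO_NORMAL skbs up the listified receive path. */
static void gro_normal_list(struct napi_struct *napi)
{
        if (!napi->rx_count)
                return;
        netif_receive_skb_list_internal(&napi->rx_list);
        INIT_LIST_HEAD(&napi->rx_list);
        napi->rx_count = 0;
}

/* Queue one GRO_NORMAL skb for list processing; deliver the whole batch once
 * it reaches gro_normal_batch skbs.  Called for the GRO_NORMAL case in
 * napi_frags_finish(); gro_normal_list() is also called at the end of each
 * napi poll so nothing is left stranded on the list.
 */
static void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb)
{
        list_add_tail(&skb->list, &napi->rx_list);
        if (++napi->rx_count >= gro_normal_batch)
                gro_normal_list(napi);
}
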
>
> Performance figures for this series were collected on a back-to-back pair of
> Solarflare sfn8522-r2 NICs, using 120-second NetPerf tests. In the stats,
> sample size n for old and new code is 6 runs each; p is from a Welch t-test.
> Tests were run both with GRO enabled and disabled, the latter simulating
> uncoalesceable packets (e.g. due to IP or TCP options). The receive side
> (which was the device under test) had the NetPerf process pinned to one CPU,
> and the device interrupts pinned to a second CPU. CPU utilisation figures
> (used in cases of line-rate performance) are summed across all CPUs.
> net.core.gro_normal_batch was left at its default value of 8.
...
> The above results are fairly mixed, and in most cases not statistically
> significant. But I think we can roughly conclude that the series
> marginally improves non-GROable throughput, without hurting latency
> (except in the large-payload busy-polling case, which in any case yields
> horrid performance even on net-next, at almost triple the latency seen
> without busy-poll). Also, drivers which, unlike sfc, pass UDP traffic to
> GRO would be expected to see a benefit from gaining access to batching.
>
> Changed in v3:
> * gro_normal_batch sysctl now uses SYSCTL_ONE instead of &one (see the
> ctl_table sketch below)
> * removed RFC tags (no comments after a week means no-one objects, right?)
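[ For illustration, roughly what a ctl_table entry using SYSCTL_ONE as the
  lower bound could look like; the field values here are assumptions, not
  copied from the patch: ]

static struct ctl_table gro_batch_table[] = {
        {
                .procname       = "gro_normal_batch",
                .data           = &gro_normal_batch,
                .maxlen         = sizeof(int),
                .mode           = 0644,
                .proc_handler   = proc_dointvec_minmax,
                .extra1         = SYSCTL_ONE,  /* batch size of at least 1 */
        },
        { }
};
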
>
> Changed in v2:
> * During busy poll, call gro_normal_list() to receive batched packets
> after each cycle of the napi busy loop. See comments in Patch #3 for
> complications of doing the same in busy_poll_stop().
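
[ A rough sketch of the idea only: the real napi_busy_loop() also handles
  poll ownership, budget accounting and the loop-termination checks, all of
  which are elided here: ]

/* Flush the GRO_NORMAL batch once per busy-poll cycle, so skbs queued by
 * gro_normal_one() are not held back for the whole busy-poll period.
 */
static void busy_poll_cycle(struct napi_struct *napi)
{
        int work;

        work = napi->poll(napi, BUSY_POLL_BUDGET);  /* driver's poll callback */
        trace_napi_poll(napi, work, BUSY_POLL_BUDGET);
        gro_normal_list(napi);  /* deliver any batched GRO_NORMAL skbs now */
}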
>
> [1]: Cohen 1959, doi: 10.1080/00401706.1959.10489859
Series applied, thanks Edward.