Date:	Wed, 27 Aug 2014 21:26:27 -0700 (PDT)
From:	Tom Herbert <>
Subject: [PATCH v2 net-next 0/8] net: Checksum offload changes - Part VI

I am working on overhauling RX checksum offload. Goals of this effort:

- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code

What is in this sixth patch set:

- Clarify the specific requirements of devices returning
  CHECKSUM_UNNECESSARY (comments in skbuff.h).
- Add csum_level field to skbuff. This is used to express how
  many checksums are covered by CHECKSUM_UNNECESSARY (stores n - 1).
- Change __skb_checksum_validate_needed to "consume" each checksum
  as indicated by csum_level as layers of the packet are parsed.
- Remove skb_pop_rcv_encapsulation, no longer needed in the new
  csum_level model.
- Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
  and to report new verified checksums for use in normal path fallback.
- Add proper support to SCTP to accept CHECKSUM_UNNECESSARY to validate
  header CRC.
- Modify drivers to set skb->csum_level instead of setting
  skb->encapsulation to indicate validation of an encapsulated
  checksum on receive.
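
For illustration only, the csum_level accounting described in the list above
can be modelled in plain userspace C. This is a minimal sketch, not code from
the patches: struct toy_skb, toy_driver_report, toy_consume_checksum and the
TOY_MAX_CSUM_LEVEL cap are all hypothetical names and values chosen for the
example. It assumes only what the list states: the device reports n verified
checksums as csum_level = n - 1, and each protocol layer "consumes" one of
them as the packet is parsed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the two ip_summed states this sketch needs. */
enum toy_ip_summed { TOY_CHECKSUM_NONE, TOY_CHECKSUM_UNNECESSARY };

/* Assumed cap on the stored level; the real limit is not stated above. */
#define TOY_MAX_CSUM_LEVEL 3

/* Hypothetical, heavily reduced model of the relevant skbuff fields. */
struct toy_skb {
	enum toy_ip_summed ip_summed;
	unsigned int csum_level;	/* stores n - 1 */
};

/* Driver side: the device verified n_verified checksums in this packet. */
static void toy_driver_report(struct toy_skb *skb, unsigned int n_verified)
{
	assert(skb != NULL);

	if (n_verified == 0) {
		skb->ip_summed = TOY_CHECKSUM_NONE;
		skb->csum_level = 0;
		return;
	}

	/* Clamp so the stored value never exceeds the assumed cap. */
	if (n_verified > TOY_MAX_CSUM_LEVEL + 1)
		n_verified = TOY_MAX_CSUM_LEVEL + 1;

	skb->ip_summed = TOY_CHECKSUM_UNNECESSARY;
	skb->csum_level = n_verified - 1;
}

/*
 * Stack side: called once per checksum encountered while parsing.
 * Returns 1 if the device already verified this checksum (and consumes
 * it), or 0 if the checksum still has to be verified in software.
 */
static int toy_consume_checksum(struct toy_skb *skb)
{
	assert(skb != NULL);

	if (skb->ip_summed != TOY_CHECKSUM_UNNECESSARY)
		return 0;

	if (skb->csum_level == 0)
		skb->ip_summed = TOY_CHECKSUM_NONE;	/* last one used up */
	else
		skb->csum_level--;

	return 1;
}
```

In this model a device that verified both the outer and the inner checksum
of an encapsulated packet would call toy_driver_report(skb, 2); the outer
layer and then the inner layer each get 1 back from toy_consume_checksum(),
and any further checksum would have to be verified in software.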


Allocate a new 16 bits for flags in skbuff.

Please review carefully and test if possible; mucking with basic
checksum functions is always a little precarious :-)


Test results with this patch set are below. I did not see any
obvious performance regression.

Tests run:
   TCP_STREAM: super_netperf with 200 streams
   TCP_RR: super_netperf with 200 streams and -r 1,1

Device bnx2x (10Gbps):
   No GRE RSS hash (RX interrupts occur on one core)
   UDP RSS port hashing enabled.

* GRE with checksum with IPv4 encapsulated packets
  With fix:
        12.56% CPU utilization
        9341.21 Mbps
        90.96% CPU utilization
        155/230/367 90/95/99% latencies
        1.18032e+06 tps
  Without fix:
        12.09% CPU utilization
        9330.38 Mbps
        91.91% CPU utilization
        155/231/369 90/95/99% latencies
        1.17714e+06 tps

* GRE without checksum with IPv4 encapsulated packets
  With fix:
        18.53% CPU utilization
        9320.57 Mbps
        89.23% CPU utilization
        157/229/365 90/95/99% latencies
        1.17998e+06 tps
  Without fix:
        18.4% CPU utilization
        9240.72 Mbps
        91.61% CPU utilization
        158/235/370 90/95/99% latencies
        1.17375e+06 tps

* VXLAN with checksum
  With fix:
        19.90% CPU utilization
        9094.12 Mbps
        94.62% CPU utilization
        152/245/459 90/95/99% latencies
        1.18346e+06 tps
  Without fix:
        20.68% CPU utilization
        9175.63 Mbps
        95.15% CPU utilization
        151/243/459 90/95/99% latencies
        1.17244e+06 tps

* VXLAN without checksum
  With fix:
        23.97% CPU utilization
        9086.91 Mbps
        92.45% CPU utilization
        154/241/436 90/95/99% latencies
        1.17305e+06 tps
  Without fix:
        24.02% CPU utilization
        9084.82 Mbps
        94.1% CPU utilization
        154/244/449 90/95/99% latencies
        1.16107e+06 tps
