Date:   Fri, 7 Aug 2020 17:21:51 -0600
From:   Ryan Cox <ryan_cox@....edu>
To:     Scott Dial <scott@...ttdial.com>
Cc:     Antoine Tenart <antoine.tenart@...tlin.com>,
        netdev@...r.kernel.org, davem@...emloft.net, sd@...asysnail.net
Subject: Re: Severe performance regression in "net: macsec: preserve ingress
 frame ordering"

On 8/6/20 9:48 PM, Scott Dial wrote:
> The aes-aesni driver is smart enough to use the FPU if it's not busy and
> fallback to the CPU otherwise. Unfortunately, the ghash-clmulni driver
> does not have that kind of logic in it and only provides an async version,
> so we are forced to use the ghash-generic implementation, which is a pure
> CPU implementation. The ideal would be for aesni_intel to provide a
> synchronous version of gcm(aes) that fell back to the CPU if the FPU is
> busy.

I don't know how the AES-NI support works, but since you specifically 
mentioned aesni_intel, I should note that this also affects AMD.  I just 
got access to AMD nodes (2 x EPYC 7302) with a Mellanox 10 GbE NIC, ran 
the same test, and saw a similar performance pattern.  I doubt this 
means much, but it seemed worth reporting.

> I don't know if the crypto maintainers would be open to such a change, but
> if the choice was between reverting and patching the crypto code, then I
> would work on patching the crypto code.

I can't opine on anything crypto-related since it is well outside my 
area of expertise, though it is helpful to hear what is going on.

> In any case, you didn't report how many packets arrived out of order, which
> was the issue being addressed by my change. It would be helpful to get
> the output of "ip -s macsec show" and specifically the InPktsDelayed
> counter. Did iperf3 report out-of-order packets with the patch reverted?
> Otherwise, if this is the only process running on your test servers,
> then you may not be generating any contention for the FPU, which is the
> source of the out-of-order issue. Maybe you could run prime95 to busy
> the FPU to see the issue that I was seeing.

I ran the tests again on the same servers as before, with the Intel 
NICs.  prime95 was running on 27 of the 28 cores in *each* server 
simultaneously (leaving one core on each for iperf3) throughout the 
entire test.  This was on 5.7.11 with 
ab046a5d4be4c90a3952a0eae75617b49c0cb01b reverted, which should match 
pre-5.7 behavior.

MACsec interfaces are deleted and recreated before each test, so 
counters are always fresh.

== MACSEC WITHOUT ENCRYPTION ==

* Server1:
18: ms1: protect on validate strict sc off sa off encrypt off send_sci on end_station off scb off replay off
     cipher suite: GCM-AES-128, using ICV length 16
     TXSC: 0000000000001234 on SA 0
     stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                          0              0              0        1123            0                0           1             0
     stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                     3798421                0        30889802591                  0
         0: PN 3799655, state on, key 01000000000000000000000000000000
     stats: OutPktsProtected OutPktsEncrypted
                     3798421                0
     RXSC: 0000000000001234, state on
     stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                  30042694872                 0               0           218  3675170             0          0              0                0              0
         0: PN 3676633, state on, key 01000000000000000000000000000000
     stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
             3675170             0              0                0              0

* Server2:
18: ms1: protect on validate strict sc off sa off encrypt off send_sci on end_station off scb off replay off
     cipher suite: GCM-AES-128, using ICV length 16
     TXSC: 0000000000001234 on SA 0
     stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                          0              0              0        1227            0                0           1             0
     stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                     3675399                0        30042696158                  0
         0: PN 3676633, state on, key 01000000000000000000000000000000
     stats: OutPktsProtected OutPktsEncrypted
                     3675399                0
     RXSC: 0000000000001234, state on
     stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                  30889801305                 0               0             0  3798410             0          0              0                0              0
         0: PN 3799655, state on, key 01000000000000000000000000000000
     stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
             3798410             0              0                0              0


InPktsDelayed was 218 for Server1 and 0 for Server2.
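
Mail clients tend to wrap "ip -s macsec show" output, so a parser that ignores line breaks is handy for pulling counters out of a post like this one. Below is a minimal sketch in Python, not a tool from iproute2. The function name parse_macsec_stats is made up for illustration, and it assumes every counter name starts with "In" or "Out" and every value is a plain non-negative integer, which holds for the output quoted here.

```python
def parse_macsec_stats(text):
    """Return one {counter_name: value} dict per 'stats:' group in text.

    Works on whitespace-separated tokens, so it does not care where the
    header or value rows were wrapped.
    """
    groups = []
    names, values = None, None  # None means "not inside a stats group"
    for token in text.split():
        if token == "stats:":
            names, values = [], []
        elif names is None:
            continue  # ordinary line content outside a stats group
        elif token.isdigit() and names:
            values.append(int(token))
            if len(values) == len(names):
                groups.append(dict(zip(names, values)))
                names, values = None, None
        elif not values and token.startswith(("In", "Out")):
            names.append(token)
        else:
            names, values = None, None  # unexpected token: drop the group
    return groups


# RXSC stats for Server1 (no encryption), wrapped as a mail client might.
sample = """
     RXSC: 0000000000001234, state on
     stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked
InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid
InPktsNotUsingSA InPktsUnusedSA
                  30042694872                 0 0           218
3675170             0          0 0                0              0
         0: PN 3676633, state on, key 01000000000000000000000000000000
"""

groups = parse_macsec_stats(sample)
print(groups[0]["InPktsDelayed"], groups[0]["InPktsOK"])
```

Feeding it a full interface dump should yield one dict per "stats:" group, in the order the groups appear.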

== MACSEC WITH ENCRYPTION ==

I got the following *with* encryption (macsec interface deleted and 
recreated before the test, so counters are fresh):
* Server1:
19: ms1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
     cipher suite: GCM-AES-128, using ICV length 16
     TXSC: 0000000000001234 on SA 0
     stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                          0              0              0        1397            0                0           0             0
     stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                           0          5560714                  0        46931594623
         0: PN 5561948, state on, key 01000000000000000000000000000000
     stats: OutPktsProtected OutPktsEncrypted
                           0          5560714
     RXSC: 0000000000001234, state on
     stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                            0       45977049585               0          3771  5417843             0          0              0                0              0
         0: PN 5422860, state on, key 01000000000000000000000000000000
     stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
             5417843             0              0                0              0

* Server2:
19: ms1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
     cipher suite: GCM-AES-128, using ICV length 16
     TXSC: 0000000000001234 on SA 0
     stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                          0              0              0        1490            0                0           0             0
     stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                           0          5421626                  0        45977059885
         0: PN 5422860, state on, key 01000000000000000000000000000000
     stats: OutPktsProtected OutPktsEncrypted
                           0          5421626
     RXSC: 0000000000001234, state on
     stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                            0       46931106683               0           109  5560541             0          0              0                0              0
         0: PN 5561948, state on, key 01000000000000000000000000000000
     stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
             5560541             0              0                0              0


InPktsDelayed was 3771 for Server1 and 109 for Server2.
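
To put the InPktsDelayed counts in proportion, they can be compared against InPktsOK from the same RXSC. The sketch below does that for the four captures in this mail. Using InPktsOK as the denominator is my own choice for a rough comparison, not something the counters define, and delayed_ratio is a made-up helper name.

```python
# (InPktsDelayed, InPktsOK) copied from the RXSC stats quoted in this mail.
counters = {
    ("no encryption", "Server1"): (218, 3675170),
    ("no encryption", "Server2"): (0, 3798410),
    ("encryption", "Server1"): (3771, 5417843),
    ("encryption", "Server2"): (109, 5560541),
}


def delayed_ratio(delayed, ok):
    """Delayed packets per accepted packet; 0.0 if nothing was accepted."""
    return delayed / ok if ok else 0.0


ratios = {key: delayed_ratio(*pair) for key, pair in counters.items()}

for (mode, host), ratio in ratios.items():
    print(f"{mode:14s} {host}: {ratio:.4%}")
```

All four ratios come out below 0.1%, with the encrypted Server1 capture the largest.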


The performance numbers were:
* 9.87 Gb/s without macsec
* 6.00 Gb/s with macsec WITHOUT encryption
* 9.19 Gb/s with macsec WITH encryption

iperf3 retransmits were:
* 27 without macsec
* 1211 with macsec WITHOUT encryption
* 721 with macsec WITH encryption
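
For a quick relative view of the figures above, this short Python sketch expresses each MACsec run as a fraction of the non-MACsec baseline. It uses only the numbers listed in this mail, and the variable names are illustrative.

```python
# Baseline run without macsec: throughput in Gb/s and iperf3 retransmits.
baseline_gbps, baseline_retx = 9.87, 27

# (throughput in Gb/s, iperf3 retransmits) for the two macsec runs.
runs = {
    "macsec, no encryption": (6.00, 1211),
    "macsec, encryption": (9.19, 721),
}

summary = {
    name: (gbps / baseline_gbps, retx / baseline_retx)
    for name, (gbps, retx) in runs.items()
}

for name, (tput, retx) in summary.items():
    print(f"{name}: {tput:.1%} of baseline throughput, "
          f"{retx:.1f}x the retransmits")
```

That works out to roughly 61% of baseline throughput without encryption and roughly 93% with it.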


Thanks for the reply and for the background on this.

Ryan
