lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 5 May 2022 18:04:44 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Rafał Miłecki <zajec5@...il.com>
Cc:     Arnd Bergmann <arnd@...db.de>,
        Alexander Lobakin <alexandr.lobakin@...el.com>,
        Network Development <netdev@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Russell King <linux@...linux.org.uk>,
        Felix Fietkau <nbd@....name>,
        "openwrt-devel@...ts.openwrt.org" <openwrt-devel@...ts.openwrt.org>,
        Florian Fainelli <f.fainelli@...il.com>
Subject: Re: Optimizing kernel compilation / alignments for network
 performance

> you'll see that most used functions are:
> v7_dma_inv_range
> __irqentry_text_end
> l2c210_inv_range
> v7_dma_clean_range
> bcma_host_soc_read32
> __netif_receive_skb_core
> arch_cpu_idle
> l2c210_clean_range
> fib_table_lookup

There is a lot of cache management functions here. Might sound odd,
but have you tried disabling SMP? These cache functions need to
operate across all CPUs, and the communication between CPUs can slow
them down. If there is only one CPU, these cache functions get simpler
and faster.

It just depends on your workload. If you have 1 CPU loaded to 100% and
the other 3 idle, you might see an improvement. If you actually need
more than one CPU, it will probably be worse.

I've also found that some Ethernet drivers invalidate or flush too
much. If you are sending a 64 byte TCP ACK, all you need to flush is
64 bytes, not the full 1500 MTU. If you receive a TCP ACK, and then
recycle the buffer, all you need to invalidate is the size of the ACK,
so long as you can guarantee nothing has touched the memory above it.
But you need to be careful when implementing tricks like this, or you
can get subtle corruption bugs when you get it wrong.

    Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ