Date:   Fri, 13 Nov 2020 16:04:59 +0100
From:   Kristian Evensen <kristian.evensen@...il.com>
To:     Carl Yin(殷张成) <carl.yin@...ctel.com>
Cc:     Daniele Palmas <dnlplm@...il.com>,
        Bjørn Mork <bjorn@...k.no>,
        Paul Gildea <paul.gildea@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        linux-usb <linux-usb@...r.kernel.org>
Subject: Re: [PATCH net-next 1/1] net: usb: qmi_wwan: add default rx_urb_size

Hi,

On Fri, Nov 13, 2020 at 9:50 AM Kristian Evensen
<kristian.evensen@...il.com> wrote:
> Yes, you are right in that NAT can have a large effect on performance,
> especially when you start being CPU-limited. However, when using perf
> to profile the kernel during my tests, no function related to
> netfilter/conntrack appeared very high on the list. I would also
> expect the modem to at least reach the performance of the dongle, with
> offloading being switched off. However, there could be some detail I
> missed.

I continued working on this issue today and I believe I have found at
least one reason for my performance problems. My initial attempts at
profiling resulted in quite noisy perf files, which caused me to look
in the wrong places. Today I figured out how to get a cleaner file,
and I noticed that a lot of CPU time was spent in pskb_expand_head()
and its support functions.

My MT7621 devices are used as routers, so additional headers have to
be added before the packets are sent out on the LAN. The current code
in qmimux_rx_fixup() allocates an SKB for each aggregated packet and
copies the data from the URB. The newly allocated SKB has too little
headroom, so when we get to ip_forward() the headroom check in
skb_cow() fails and the SKB is reallocated. After increasing the
allocation size to also include the required headroom, and reserving
that many bytes of headroom, I see a huge performance increase. I go
from around 230 Mbit/s to 280 Mbit/s, with significantly less CPU
usage. 280 Mbit/s is the same speed as I get from my phone connected
to the same network, so it seems to be the max of the network right
now.
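
To illustrate the headroom effect described above, here is a minimal
userspace sketch. It is not the qmi_wwan code and not the actual
patch: struct toy_skb and all toy_* functions are hypothetical
stand-ins that only model the idea of "reallocate when the headroom is
too small". The real skb_cow() also handles cloned SKBs, which this
model ignores, and the 16 bytes used in the usage note is an assumed
value matching ETH_HLEN + 2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct sk_buff; NOT the kernel structure. */
struct toy_skb {
	unsigned char *head; /* start of the allocation */
	unsigned char *data; /* start of the packet payload */
	size_t len;          /* payload length in bytes */
};

/* Bytes available in front of the payload for prepending headers. */
static size_t toy_headroom(const struct toy_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

/* Allocate headroom + len bytes, reserve the headroom, copy the payload. */
static int toy_alloc(struct toy_skb *skb, const unsigned char *payload,
		     size_t len, size_t headroom)
{
	skb->head = malloc(headroom + len);
	if (!skb->head)
		return -1;
	skb->data = skb->head + headroom;
	skb->len = len;
	memcpy(skb->data, payload, len);
	return 0;
}

/*
 * Simplified model of the headroom check: if there is less headroom
 * than needed, reallocate and copy the payload into the new buffer.
 * Returns 1 if a reallocation happened, 0 if not, -1 on failure.
 */
static int toy_cow(struct toy_skb *skb, size_t needed)
{
	unsigned char *nhead;

	if (toy_headroom(skb) >= needed)
		return 0;

	nhead = malloc(needed + skb->len);
	if (!nhead)
		return -1;
	memcpy(nhead + needed, skb->data, skb->len);
	free(skb->head);
	skb->head = nhead;
	skb->data = nhead + needed;
	return 1;
}

/*
 * Receive n_packets with rx_headroom bytes reserved at allocation time,
 * then "forward" each one, which needs fwd_headroom bytes. Returns the
 * number of reallocations, or -1 on allocation failure.
 */
static int toy_count_reallocs(int n_packets, size_t rx_headroom,
			      size_t fwd_headroom)
{
	unsigned char payload[64] = {0};
	int reallocs = 0;

	for (int i = 0; i < n_packets; i++) {
		struct toy_skb skb;
		int ret;

		if (toy_alloc(&skb, payload, sizeof(payload), rx_headroom) < 0)
			return -1;
		ret = toy_cow(&skb, fwd_headroom);
		if (ret < 0) {
			free(skb.head);
			return -1;
		}
		assert(toy_headroom(&skb) >= fwd_headroom);
		reallocs += ret;
		free(skb.head);
	}
	return reallocs;
}
```

In this model, toy_count_reallocs(1000, 0, 16) returns 1000 (every
packet is allocated and copied a second time), while
toy_count_reallocs(1000, 16, 16) returns 0, because the headroom
reserved at receive time already covers what forwarding needs.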

I do not know what would be an acceptable way (if any) to get this fix
upstreamed. I currently add an additional "safe" amount of data, but I
am pretty sure ETH_HLEN + 2 is not an acceptable solution :)

Kristian
