Message-ID: <e28faa37-549d-4c49-824f-1d0dfbfb9538@yandex.pl>
Date: Mon, 23 Oct 2023 15:59:09 +0200
From: Michal Soltys <msoltyspl@...dex.pl>
To: netdev@...r.kernel.org
Cc: Rafał Golcz <rgl@...k.pl>, Piotr Przybylski <ppr@...k.pl>
Subject: [QUESTION] potential issue - unusual drops on XL710 (40gbit) cards
with ksoftirqd hogging one of cpus near 100%
Hi,
A while ago we noticed some unusual RX drops during the busier periods
of the day (while nowhere near any hardware limits) on our production
edge servers. More details on their usage below.
First the hardware in question:
"older" servers:
Huawei FusionServer RH1288 V3 / 40x Intel(R) Xeon(R) CPU E5-2640 v4
"newer" servers:
Huawei FusionServer Pro 1288H V5 / 40x Intel(R) Xeon(R) Gold 5115
In both cases the servers have 512 GB of RAM and use two XL710 40GbE
cards in an 802.3ad bond (the traffic is spread very evenly across them).
Network card details:
Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
Driver info as reported by ethtool (the same for both server types):
driver: i40e
firmware-version: 8.60 0x8000bd5f 1.3140.0
or
firmware-version: 8.60 0x8000bd85 1.3140.0
These are running Ubuntu 20.04.6 LTS server with 5.15 kernels (they
differ by minor version, but by now the issue has happened on most of
them).
The servers do content delivery work, mostly sending data, primarily
from the page cache. At the busiest periods the outbound traffic
approaches roughly ~50 Gbit/s per server across those 2 bonded network
cards. Inbound traffic is a fraction of that in comparison, reaching
maybe 1 Gbit/s on average.
The traffic is handled via OpenResty (nginx) with additional tr/edge
logic coded in Lua. When everything is fine, we have (a sketch of how
the softirq rates can be sampled follows this list):
- outbound 30-50 Gbit/s spread across both NICs
- inbound 500 Mbit/s - 1 Gbit/s
- NET_RX softirqs averaging ~20k/s per CPU
- NET_TX softirqs averaging 5-10/s per CPU
- no packet drops
- CPU usage around 10%-20% per core
- RAM used by nginx processes and the rest of the system up to around 15 GB
- the rest of the RAM in practice used as page cache
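For reference, a rough sketch of how such per-CPU NET_RX/NET_TX rates
can be obtained by diffing /proc/softirqs once per second (not our exact
tooling, just an illustration; it assumes the usual layout with one
"CPUn" column per online CPU):

#!/usr/bin/env python3
# Rough sketch: print per-CPU NET_RX/NET_TX softirq rates by diffing
# /proc/softirqs once per second. Assumes the usual layout: a header
# line with one "CPUn" column per online CPU, then one row per softirq.
import time

def snapshot():
    with open("/proc/softirqs") as f:
        lines = f.read().splitlines()
    cpus = lines[0].split()                      # e.g. ['CPU0', 'CPU1', ...]
    rows = {}
    for line in lines[1:]:
        name, *counts = line.split()
        rows[name.rstrip(":")] = [int(c) for c in counts[:len(cpus)]]
    return cpus, rows

_, prev = snapshot()
while True:
    time.sleep(1)
    cpus, cur = snapshot()
    for irq in ("NET_RX", "NET_TX"):
        rates = [c - p for c, p in zip(cur[irq], prev[irq])]
        # print only the busiest few CPUs to keep the output readable
        top = sorted(enumerate(rates), key=lambda x: -x[1])[:4]
        print(irq, " ".join(f"{cpus[i]}={r}/s" for i, r in top))
    prev = cur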
Sometimes (once every few days, on a random one of those servers) a
weird anomaly happens during the busy hours (a sketch for watching the
per-CPU softnet counters follows this list):
- it lasts around 10-15 minutes, starting suddenly and ending just as
suddenly
- on one of the CPUs we see the following:
  - NET_RX softirqs drop to ~1k/s
  - NET_TX softirqs rise to ~500-1k/s
  - ksoftirqd hogs that particular CPU at >90% usage
- significant packet drops on the inbound side, roughly 10-20% of
incoming packets
- lots of nginx context switches
- the page cache is aggressively reclaimed: up to ~200 GB of memory is
reclaimed and immediately starts filling up again with the data normally
served by those servers
- the actual memory used by nginx/userland rises only slightly, by
~1 GB, while this happens
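To tell whether the drops on the hogged CPU show up as backlog drops or
as NAPI running out of budget/time, something along these lines can be
used to watch /proc/net/softnet_stat (again just a sketch; it assumes
the long-standing layout with one row per CPU and hex fields, field 1 =
dropped, field 2 = time_squeeze):

#!/usr/bin/env python3
# Rough sketch: diff per-CPU "dropped" and "time_squeeze" counters from
# /proc/net/softnet_stat once per second. Assumes the long-standing
# layout: one row per CPU, hex fields, field 0 = processed,
# field 1 = dropped (backlog full), field 2 = time_squeeze (NAPI
# budget/time exhausted).
import time

def snapshot():
    with open("/proc/net/softnet_stat") as f:
        return [[int(x, 16) for x in line.split()] for line in f]

prev = snapshot()
while True:
    time.sleep(1)
    cur = snapshot()
    for cpu, (p, c) in enumerate(zip(prev, cur)):
        dropped = c[1] - p[1]
        squeezed = c[2] - p[2]
        if dropped or squeezed:
            print(f"cpu{cpu}: dropped={dropped}/s time_squeeze={squeezed}/s")
    prev = cur

If time_squeeze climbs on the affected CPU while dropped stays flat,
that would point at NAPI not keeping up rather than the backlog
overflowing.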
Things we know:
- none of the network cards ever reaches its theoretical capacity, as
the traffic is well spread across them; when the issue happens it's
around 20-25 Gbit/s per card (a per-NIC counter sketch follows this list)
- we are not saturating the inter-socket QPI links
- it starts and stops happening pretty much suddenly
- the TX side remains without drop issues
- this has been happening since December 2022, but it's hard to
pinpoint the reason at this moment
- we have system-wide perf dumps from the period when it happens (see
the link at the end)
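For the per-card numbers, and for checking which of the bonded cards the
inbound drops land on, a quick sketch that diffs the standard kernel
interface counters (interface names below are placeholders for the two
XL710 bond slaves; rx_missed_errors should roughly separate "NIC ran out
of RX buffers" from drops accounted after receive in rx_dropped):

#!/usr/bin/env python3
# Rough sketch: diff per-NIC RX counters from
# /sys/class/net/<dev>/statistics once per second, to see which of the
# bonded cards the inbound drops land on.
# Interface names below are placeholders for the two XL710 slaves.
import time

IFACES = ["enp24s0f0", "enp24s0f1"]          # placeholder names
COUNTERS = ["rx_packets", "rx_dropped", "rx_missed_errors", "rx_fifo_errors"]

def snapshot():
    stats = {}
    for dev in IFACES:
        stats[dev] = {}
        for c in COUNTERS:
            with open(f"/sys/class/net/{dev}/statistics/{c}") as f:
                stats[dev][c] = int(f.read())
    return stats

prev = snapshot()
while True:
    time.sleep(1)
    cur = snapshot()
    for dev in IFACES:
        deltas = {c: cur[dev][c] - prev[dev][c] for c in COUNTERS}
        print(dev, " ".join(f"{c}={v}/s" for c, v in deltas.items()))
    prev = cur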
Sorry for the somewhat chaotic writeup. At this point we are a bit out
of ideas on how to debug this further (and what data to provide to
pinpoint the issue).
- is it perhaps a known issue with kernels around 5.15 and/or these
network cards and/or their drivers?
- any pointers to what else (besides the kernel/XL710/driver) could be
an issue?
- any ideas on how to debug it further?
- we have system-wide perf dumps from the period when it happens, if
those would be useful for further analysis; any assistance would be
greatly appreciated
Link to aforementioned perf dump:
https://drive.google.com/file/d/11qFgRP-r03Oj42V_fAgQBp2ebJ1d4YBW/view
From a quick check it looks like we spend a lot of time in the RX path
in __tcp_push_pending_frames().
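In case it is useful for anyone looking at the dump, a rough sketch of
how the samples can be aggregated per CPU for a given symbol out of
"perf script" output (it assumes the dump was recorded system-wide with
call graphs, so each sample header carries the CPU as "[NNN]" and each
stack frame sits on its own indented "addr symbol+off (dso)" line):

#!/usr/bin/env python3
# Rough sketch: count, per CPU, how many samples in a system-wide
# perf.data have a given symbol anywhere in their call stack. Assumes
# the dump was recorded with call graphs and that "perf script" prints
# the CPU as "[NNN]" on each sample header line, with one indented
# "addr symbol+off (dso)" line per stack frame (the default layout for
# perf record -a -g).
import collections
import re
import subprocess
import sys

SYMBOL = sys.argv[1] if len(sys.argv) > 1 else "__tcp_push_pending_frames"

cpu_re = re.compile(r"\[(\d+)\]")
hits = collections.Counter()
totals = collections.Counter()

proc = subprocess.Popen(["perf", "script", "-i", "perf.data"],
                        stdout=subprocess.PIPE, text=True)
cpu = None
seen = False
for line in proc.stdout:
    if not line.startswith((" ", "\t")):          # sample header (or blank) line
        if cpu is not None:
            totals[cpu] += 1
            if seen:
                hits[cpu] += 1
        m = cpu_re.search(line)
        cpu = int(m.group(1)) if m else None
        seen = False
    elif cpu is not None and SYMBOL in line:      # call-stack frame line
        seen = True
# flush the last sample, if any
if cpu is not None:
    totals[cpu] += 1
    if seen:
        hits[cpu] += 1

for cpu in sorted(totals):
    print(f"cpu{cpu}: {hits[cpu]}/{totals[cpu]} samples contain {SYMBOL}")

Run from the directory with perf.data, e.g.
"python3 count_symbol.py __tcp_push_pending_frames" (script name is
just an example).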