Message-Id: <20230508020801.10702-2-cathy.zhang@intel.com>
Date: Sun, 7 May 2023 19:08:00 -0700
From: Cathy Zhang <cathy.zhang@...el.com>
To: edumazet@...gle.com,
davem@...emloft.net,
kuba@...nel.org,
pabeni@...hat.com
Cc: jesse.brandeburg@...el.com,
suresh.srinivas@...el.com,
tim.c.chen@...el.com,
lizhen.you@...el.com,
cathy.zhang@...el.com,
eric.dumazet@...il.com,
netdev@...r.kernel.org
Subject: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper size
Before commit 4890b686f408 ("net: keep sk->sk_forward_alloc as small as
possible"), each TCP socket could forward allocate up to 2 MB of memory,
so tcp_memory_allocated could hit the TCP memory limit quite quickly. To
reduce the memory pressure, that commit keeps sk->sk_forward_alloc as
small as possible, which is less than one page if SO_RESERVE_MEM is not
specified.
However, with commit 4890b686f408 ("net: keep sk->sk_forward_alloc as
small as possible"), memcg charge hot paths are observed when the system
is stressed with a large number of connections. This is because
sk->sk_forward_alloc is now so small that it is always less than
skb->truesize, so network handlers like tcp_rcv_established() have to
take the slow path more frequently to increase sk->sk_forward_alloc.
Each such memory allocation triggers a memcg charge, and perf top shows
the following contention paths on the busy system.
    16.77%  [kernel]  [k] page_counter_try_charge
    16.56%  [kernel]  [k] page_counter_cancel
    15.65%  [kernel]  [k] try_charge_memcg
In order to avoid the memcg overhead and the performance penalty,
sk->sk_forward_alloc should be kept at a proper size instead of as
small as possible. Keep up to 64KB of memory from being reclaimed when
uncharging sk_buff memory, which is close to the maximum size of an
sk_buff. This helps reduce the frequency of memory allocation during a
TCP connection. The original per-socket reclaim threshold for reserved
memory was 2MB, so the extra memory reserved now is about 32 times
smaller than before commit 4890b686f408 ("net: keep
sk->sk_forward_alloc as small as possible").
Run memcached with memtier_benchmark to verify the optimization. 8
server-client pairs are created with a bridge network on localhost; the
server and client of each pair share 28 logical CPUs.
Results (average of 5 runs)
RPS (with/without patch) +2.07x
Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible")
Signed-off-by: Cathy Zhang <cathy.zhang@...el.com>
Signed-off-by: Lizhen You <lizhen.you@...el.com>
Tested-by: Long Tao <tao.long@...el.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@...el.com>
Reviewed-by: Tim Chen <tim.c.chen@...ux.intel.com>
Reviewed-by: Suresh Srinivas <suresh.srinivas@...el.com>
---
include/net/sock.h | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 8b7ed7167243..6d2960479a80 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1657,12 +1657,33 @@ static inline void sk_mem_charge(struct sock *sk, int size)
 	sk->sk_forward_alloc -= size;
 }
 
+/* The following macro controls memory reclaiming in sk_mem_uncharge().
+ */
+#define SK_RECLAIM_THRESHOLD	(1 << 16)
 static inline void sk_mem_uncharge(struct sock *sk, int size)
 {
+	int reclaimable;
+
 	if (!sk_has_account(sk))
 		return;
 	sk->sk_forward_alloc += size;
-	sk_mem_reclaim(sk);
+
+	reclaimable = sk->sk_forward_alloc - sk_unused_reserved_mem(sk);
+
+	/* Reclaim memory to reduce memory pressure when multiple sockets
+	 * run in parallel. However, if we reclaim all pages and keep
+	 * sk->sk_forward_alloc as small as possible, it will cause
+	 * paths like tcp_rcv_established() going to the slow path with
+	 * much higher rate for forwarded memory expansion, which leads
+	 * to contention hot points and performance drop.
+	 *
+	 * In order to avoid the above issue, it's necessary to keep
+	 * sk->sk_forward_alloc with a proper size while doing reclaim.
+	 */
+	if (reclaimable > SK_RECLAIM_THRESHOLD) {
+		reclaimable -= SK_RECLAIM_THRESHOLD;
+		__sk_mem_reclaim(sk, reclaimable);
+	}
 }
 
 /*
--
2.34.1
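The threshold logic in the patch above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct
model_sock, model_reclaim() and model_uncharge() are hypothetical
stand-ins for struct sock, __sk_mem_reclaim() and sk_mem_uncharge(), the
4096-byte page size is an assumption, and the rounding down to whole
pages is a simplification made for this model.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_PAGE_SIZE         4096       /* assumed page size */
#define MODEL_RECLAIM_THRESHOLD (1 << 16)  /* 64 KB, same value as the patch */

struct model_sock {
	int forward_alloc;    /* bytes charged to the socket but not in use */
	int unused_reserved;  /* stand-in for sk_unused_reserved_mem(sk) */
	long reclaimed;       /* total bytes handed back to the global pool */
	int reclaim_calls;    /* how often memory was actually handed back */
};

/* Stand-in for __sk_mem_reclaim(): this model returns whole pages only. */
static void model_reclaim(struct model_sock *sk, int amount)
{
	int bytes = amount - (amount % MODEL_PAGE_SIZE);

	if (bytes <= 0)
		return;
	sk->forward_alloc -= bytes;
	sk->reclaimed += bytes;
	sk->reclaim_calls++;
}

/* Mirrors the patched sk_mem_uncharge(): keep up to 64 KB cached. */
static void model_uncharge(struct model_sock *sk, int size)
{
	int reclaimable;

	assert(size >= 0);
	sk->forward_alloc += size;

	reclaimable = sk->forward_alloc - sk->unused_reserved;

	/* Only the part above the threshold is given back. */
	if (reclaimable > MODEL_RECLAIM_THRESHOLD)
		model_reclaim(sk, reclaimable - MODEL_RECLAIM_THRESHOLD);
}
```

In this model, uncharging sixteen 4096-byte buffers on an empty socket
leaves 64 KB cached with no reclaim at all; only the seventeenth
uncharge hands one page back. With the pre-patch behaviour every one of
those uncharges would have tried to reclaim, which is the path that
ends up in the memcg charge/uncharge functions shown in the perf output.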