Message-ID: <1477927006.7065.304.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Mon, 31 Oct 2016 08:16:46 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
James Morris <jmorris@...ei.org>,
Trond Myklebust <trond.myklebust@...marydata.com>,
Alexander Duyck <alexander.duyck@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Eric Dumazet <edumazet@...gle.com>,
Tom Herbert <tom@...bertland.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
linux-nfs@...r.kernel.org
Subject: Re: [PATCH net-next] udp: do fwd memory scheduling on dequeue
On Mon, 2016-10-31 at 16:02 +0100, Paolo Abeni wrote:
>
> No problem at all with incremental patches ;-)
>
> In our experiment, touching udp_memory_allocated is only part of the
> source of contention; the biggest contributor is the sk_rmem_alloc
> update, which happens on every dequeue.
>
> We experimented with forward-allocating the whole sk_rcvbuf; even in
> that scenario we hit significant contention if the sk_rmem_alloc
> update was done under the lock, while full sk_rcvbuf forward
> allocation plus the sk_rmem_alloc update outside the spinlock gave
> very similar performance to our posted patch.
> I think that the next step (after the removal of the double lock on
> dequeue) should be moving the sk_rmem_alloc update outside the lock:
> the changes needed to do that on top of the double-lock removal are
> very small (roughly 10 additional lines of code).
>
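For illustration only, a userspace sketch of that idea (all names here
are invented: "fake_sock", "queue_lock" and "rmem_alloc" merely stand in
for the socket, the receive-queue spinlock and sk_rmem_alloc; this is
not the kernel change itself). The accounting update is a plain atomic
op, so it does not have to sit inside the locked section:

/*
 * Illustrative model of moving the receive-memory accounting update
 * outside the queue lock.  Build: cc -O2 -pthread rmem_sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>

struct fake_sock {
	pthread_mutex_t queue_lock;	/* models the receive queue lock */
	unsigned int	queue_len;	/* packets sitting in the queue */
	atomic_long	rmem_alloc;	/* bytes charged to the socket */
};

/* Contended shape: accounting updated while the queue lock is held. */
static void dequeue_accounting_under_lock(struct fake_sock *sk, long size)
{
	pthread_mutex_lock(&sk->queue_lock);
	if (sk->queue_len) {
		sk->queue_len--;
		atomic_fetch_sub(&sk->rmem_alloc, size);
	}
	pthread_mutex_unlock(&sk->queue_lock);
}

/* Proposed shape: drop the lock first, then touch the shared counter. */
static void dequeue_accounting_outside_lock(struct fake_sock *sk, long size)
{
	int dequeued = 0;

	pthread_mutex_lock(&sk->queue_lock);
	if (sk->queue_len) {
		sk->queue_len--;
		dequeued = 1;
	}
	pthread_mutex_unlock(&sk->queue_lock);

	if (dequeued)		/* no longer serialized by the queue lock */
		atomic_fetch_sub(&sk->rmem_alloc, size);
}

int main(void)
{
	struct fake_sock sk = {
		.queue_lock = PTHREAD_MUTEX_INITIALIZER,
		.queue_len = 2,
	};

	atomic_init(&sk.rmem_alloc, 2 * 1500);
	dequeue_accounting_under_lock(&sk, 1500);
	dequeue_accounting_outside_lock(&sk, 1500);
	return atomic_load(&sk.rmem_alloc) != 0;
}
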
During my load tests, one of the major factors was sk_drops being
incremented like crazy, dirtying a critical cache line and hurting, for
example, sk_filter reads.
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct {
                atomic_t           rmem_alloc;           /* 0x180 0x4 */
                int                len;                  /* 0x184 0x4 */
                struct sk_buff *   head;                 /* 0x188 0x8 */
                struct sk_buff *   tail;                 /* 0x190 0x8 */
        } sk_backlog;                                    /* 0x180 0x18 */
        int                        sk_forward_alloc;     /* 0x198 0x4 */
        __u32                      sk_txhash;            /* 0x19c 0x4 */
        unsigned int               sk_napi_id;           /* 0x1a0 0x4 */
        unsigned int               sk_ll_usec;           /* 0x1a4 0x4 */
        atomic_t                   sk_drops;             /* 0x1a8 0x4 */
        int                        sk_rcvbuf;            /* 0x1ac 0x4 */
        struct sk_filter *         sk_filter;            /* 0x1b0 0x8 */
        union {
                struct socket_wq * sk_wq;                /* 0x8 */
                struct socket_wq * sk_wq_raw;            /* 0x8 */
        };                                               /* 0x1b8 0x8 */
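To make the effect concrete, here is a purely illustrative, standalone
userspace demo (nothing kernel specific; "drops" plays the role of
sk_drops and "filter" the role of sk_filter): with both fields in one
64-byte line, every increment bounces the line away from the reader,
and the padded variant shows the shape of the fix, i.e. keeping the hot
counter on its own line:

/*
 * Illustration only.  Build: cc -O2 -pthread false_sharing.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define LOOPS	(100 * 1000 * 1000L)

struct shared {
	atomic_long drops;		/* written on every "drop" */
	_Atomic(void *) filter;		/* read-mostly, same cache line */
} __attribute__((aligned(64)));

struct shared_padded {			/* shape of the fix: separate lines */
	atomic_long drops;
	char pad[64 - sizeof(atomic_long)];
	_Atomic(void *) filter;
} __attribute__((aligned(64)));

static struct shared s;

static void *writer(void *arg)
{
	for (long i = 0; i < LOOPS; i++)
		atomic_fetch_add_explicit(&s.drops, 1, memory_order_relaxed);
	return NULL;
}

static void *reader(void *arg)
{
	void *f = NULL;

	for (long i = 0; i < LOOPS; i++)	/* misses while the writer runs */
		f = atomic_load_explicit(&s.filter, memory_order_relaxed);
	return f;
}

int main(void)
{
	pthread_t w, r;

	pthread_create(&w, NULL, writer, NULL);
	pthread_create(&r, NULL, reader, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	printf("drops = %ld\n", atomic_load(&s.drops));
	return 0;
}
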
I was playing with moving the sk_drops field elsewhere, and with not
reading it unless necessary:
diff --git a/include/net/sock.h b/include/net/sock.h
index f13ac87a8015cb18c5d3fe5fdcf2d6a0592428f4..a901df591eb45e153517cdb8b409b61563d1a4e3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2112,7 +2112,8 @@ struct sock_skb_cb {
 static inline void
 sock_skb_set_dropcount(const struct sock *sk, struct sk_buff *skb)
 {
-	SOCK_SKB_CB(skb)->dropcount = atomic_read(&sk->sk_drops);
+	SOCK_SKB_CB(skb)->dropcount = sock_flag(sk, SOCK_RXQ_OVFL) ?
+				      atomic_read(&sk->sk_drops) : 0;
 }
 
 static inline void sk_drops_add(struct sock *sk, const struct sk_buff *skb)
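
The reasoning behind the conditional: the dropcount stored in the skb cb
is only ever reported to applications that asked for it via SO_RXQ_OVFL
(as SOL_SOCKET/SO_RXQ_OVFL ancillary data on recvmsg()), so when
SOCK_RXQ_OVFL is not set nobody can observe the value and the sk_drops
read, with its likely cache miss, can be skipped. For reference, a
purely illustrative userspace receiver (my own sketch, not part of the
patch; error handling trimmed, port number arbitrary):

/* Enable SO_RXQ_OVFL and read the drop count from the cmsg. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40		/* in case libc headers predate it */
#endif

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	int on = 1;
	struct sockaddr_in addr = {
		.sin_family	 = AF_INET,
		.sin_port	 = htons(9999),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	char buf[2048], cbuf[CMSG_SPACE(sizeof(uint32_t))];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg = {
		.msg_iov	= &iov,
		.msg_iovlen	= 1,
		.msg_control	= cbuf,
		.msg_controllen	= sizeof(cbuf),
	};

	setsockopt(fd, SOL_SOCKET, SO_RXQ_OVFL, &on, sizeof(on));
	bind(fd, (struct sockaddr *)&addr, sizeof(addr));

	if (recvmsg(fd, &msg, 0) < 0)
		return 1;

	for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
		if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SO_RXQ_OVFL) {
			uint32_t drops;

			memcpy(&drops, CMSG_DATA(c), sizeof(drops));
			printf("datagrams dropped so far: %u\n", drops);
		}
	}
	return 0;
}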