Message-ID: <1285060993.2617.163.camel@edumazet-laptop>
Date: Tue, 21 Sep 2010 11:23:13 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Amit Salecha <amit.salecha@...gic.com>
Cc: David Miller <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Ameen Rahman <ameen.rahman@...gic.com>,
Anirban Chakraborty <anirban.chakraborty@...gic.com>
Subject: RE: [PATCH] qlcnic: dont assume NET_IP_ALIGN is 2
On Tuesday, 21 September 2010 at 03:41 -0500, Amit Salecha wrote:
> > Amit, if you believe this is a problem, you should address it for all
> > NICs, not only qlcnic.
> >
> > qlcnic was lying to the stack, because it consumed 2 Kbyte blocks and
> > pretended they were consuming skb->len bytes.
> > (Assuming MTU=1500; the problem is worse if the MTU is bigger.)
> >
> > So in order to improve "throughput", you were allowing for memory
> > exhaustion and a freeze of the _machine_?
> >
> This won't lead to such a problem. truesize is used for accounting only.
You must have big machines in your lab and have never hit OOM?

You really should take a look at various files in the net/core and net/ipv4
trees, and at files like "/proc/sys/net/ipv4/tcp_mem" and
"/proc/sys/net/ipv4/udp_mem".

In fact, truesize is _underestimated_: we don't account for struct
skb_shared_info or for kmalloc() rounding.

We probably should use this patch (without having to check all possible
net drivers!).

The problem is that this would slow down alloc_skb(), so this patch is not
for inclusion.

A cheap alternative would be to use

size + sizeof(struct sk_buff) + SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
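
To make the accounting gap concrete, here is a small userspace sketch (not kernel code) that compares the three truesize formulas discussed in this mail. All constants are assumptions for illustration: sizeof(struct sk_buff) is taken as 256 (the mail only says "close to 256"), the 328-byte skb_shared_info size and the 64-byte cache line are hypothetical, and rounding up to a power of two is only a rough stand-in for kmalloc()/ksize() behaviour (real slab caches also have sizes such as 96 and 192). The function names are made up for this example.

```c
#include <stddef.h>

/* Assumed values, for illustration only; real values depend on arch/config. */
#define ASSUMED_CACHE_LINE   64u   /* stand-in for SMP_CACHE_BYTES */
#define ASSUMED_SKBUFF_SIZE  256u  /* "close to 256" per the mail */
#define ASSUMED_SHINFO_SIZE  328u  /* hypothetical sizeof(struct skb_shared_info) */

/* Round x up to a multiple of the assumed cache line (like SKB_DATA_ALIGN). */
static size_t data_align(size_t x)
{
    return (x + ASSUMED_CACHE_LINE - 1) & ~((size_t)ASSUMED_CACHE_LINE - 1);
}

/* Rough model of kmalloc() rounding: next power of two >= x. */
static size_t round_up_pow2(size_t x)
{
    size_t p = 1;

    while (p < x)
        p <<= 1;
    return p;
}

/* Accounting before the patch: size + sizeof(struct sk_buff). */
static size_t truesize_current(size_t size)
{
    return size + ASSUMED_SKBUFF_SIZE;
}

/* The "cheap alternative": also charge the aligned shared info. */
static size_t truesize_cheap(size_t size)
{
    return size + ASSUMED_SKBUFF_SIZE + data_align(ASSUMED_SHINFO_SIZE);
}

/* Model of the ksize(data) variant: charge what the allocator really hands out. */
static size_t truesize_ksize_model(size_t size)
{
    return round_up_pow2(size + data_align(ASSUMED_SHINFO_SIZE)) +
           ASSUMED_SKBUFF_SIZE;
}
```

With these assumed numbers, a 1536-byte data area is charged 1792 bytes by the current formula, 2176 bytes by the cheap alternative, and 2304 bytes under the ksize() model, which is the underestimation described above.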

If you think about it, when 128-bit arches come, truesize will grow anyway.
If some tuning is needed in our stack, we'll do it.

(The socket API SO_RCVBUF/SO_SNDBUF is the problem, because
applications are not aware of packetization or kernel internals.)

SOCK_MIN_RCVBUF is way too small, since sizeof(struct sk_buff)
is already close to 256. I guess we cannot even receive a single frame.
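
From the application side, the only visible part of this is the value passed to and read back from SO_RCVBUF. The following userspace sketch shows one way to observe the scaling the kernel applies. The helper name effective_rcvbuf is hypothetical, and the choice of a UDP socket is arbitrary.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Request `requested` bytes of receive buffer on a fresh UDP socket and
 * return the value the kernel reports back through getsockopt(), or -1
 * on error. The helper name is hypothetical.
 */
static int effective_rcvbuf(int requested)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int val = -1;
    socklen_t len = sizeof(val);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len) < 0)
        val = -1;

    close(fd);
    return val;
}
```

On a Linux kernel using the val * 2 logic that this patch touches, effective_rcvbuf(4096) should read back 8192, provided the request is below the rmem_max limit; with the val * 4 change it would presumably read back 16384. Either way, the application has no direct view of how many frames that budget actually holds once truesize is charged.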
include/net/sock.h | 2 +-
net/core/skbuff.c | 2 +-
net/core/sock.c | 8 ++++----
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 8ae97c4..348fc9e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1558,7 +1558,7 @@ static inline void sk_wake_async(struct sock *sk, int how, int band)
}
#define SOCK_MIN_SNDBUF 2048
-#define SOCK_MIN_RCVBUF 256
+#define SOCK_MIN_RCVBUF 1024
static inline void sk_stream_moderate_sndbuf(struct sock *sk)
{
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 752c197..5ab2e8e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -196,7 +196,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
* the tail pointer in struct sk_buff!
*/
memset(skb, 0, offsetof(struct sk_buff, tail));
- skb->truesize = size + sizeof(struct sk_buff);
+ skb->truesize = ksize(data) + sizeof(struct sk_buff);
atomic_set(&skb->users, 1);
skb->head = data;
skb->data = data;
diff --git a/net/core/sock.c b/net/core/sock.c
index f3a06c4..803e041 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -535,10 +535,10 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
val = sysctl_wmem_max;
set_sndbuf:
sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
- if ((val * 2) < SOCK_MIN_SNDBUF)
+ if ((val * 4) < SOCK_MIN_SNDBUF)
sk->sk_sndbuf = SOCK_MIN_SNDBUF;
else
- sk->sk_sndbuf = val * 2;
+ sk->sk_sndbuf = val * 4;
/*
* Wake up sending tasks if we
@@ -579,10 +579,10 @@ set_rcvbuf:
* returning the value we actually used in getsockopt
* is the most desirable behavior.
*/
- if ((val * 2) < SOCK_MIN_RCVBUF)
+ if ((val * 4) < SOCK_MIN_RCVBUF)
sk->sk_rcvbuf = SOCK_MIN_RCVBUF;
else
- sk->sk_rcvbuf = val * 2;
+ sk->sk_rcvbuf = val * 4;
break;
case SO_RCVBUFFORCE: