Message-ID: <1318579509.2533.110.camel@edumazet-laptop>
Date: Fri, 14 Oct 2011 10:05:09 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH net-next] tcp: reduce memory needs of out of order queue
On Friday, 14 October 2011 at 03:42 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Fri, 14 Oct 2011 09:19:51 +0200
>
> > Many drivers allocate a big skb to store a single TCP frame.
> > (WIFI drivers, or NIC using PAGE_SIZE fragments)
> >
> > It's now common to get skb->truesize bigger than 4096 to store a ~1500
> > byte TCP frame.
> >
> > TCP sessions with large RTT and packet losses can fill their Out Of
> > Order queue with such oversized skbs and hit their sk_rcvbuf limit,
> > triggering a pruning of the complete OFO queue, without giving a chance
> > to receive the missing packet(s) and move skbs from the OFO queue to the
> > receive queue.
> >
> > This patch adds a skb_reduce_truesize() helper, and uses it for all skbs
> > queued into the OFO queue.
> >
> > Spending some time to perform a copy is worth the pain, since it permits
> > SACK processing to have a chance to complete over the RTT barrier.
> >
> > This greatly improves user experience, without added cost on fast path.
> >
> > Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
>
> No objection from me, although I wish wireless drivers were able to
> size their SKBs more appropriately. I wonder how many problems that
> look like "OMG we gotz da Buffer Bloat, arrr!" are actually due to
> this truesize issue.
>
> I think such large truesize SKBs will cause problems even in non-loss
> situations, in that the receive buffer will hit its limits more
> quickly. I'm not sure that the receive buffer autotuning is built to
> handle this sort of scenario as a common occurrence.
>
> You might want to check if this is the actual root cause of your
> problems. If the receive buffer autotuning doesn't expand the receive
> buffer enough to hold two windows worth of these large truesize SKBs,
> that's the real reason why we end up pruning.
>
> We have to decide if these kinds of SKBs are acceptable as a normal
> situation for MSS sized frames. And if they are then it's probably
> a good idea to adjust the receive buffer autotuning code too.
>
> Although I realize it might be difficult, getting rid of these weird
> SKBs in the first place would be ideal.
>
> It would also be a good idea to put the truesize inaccuracies into
> perspective when selecting how to fix this. It's trying to prevent
> 1 byte packets not accounting for the 256 byte SKB and metadata.
> That kind of case with such a high ratio of wastage is important.
>
> On the other hand, using 2048 bytes for a 1500 byte packet and claiming
> the truesize is 1500 + sizeof(metadata)... that might be an acceptable
> lie to tell :-) This is especially true if it allows an easy solution
> to this wireless problem.
>
> Just some thoughts... and I wonder if the wireless thing is due to
> some hardware limitation or similar.
>
This patch specifically addresses the OFO problem, trying to lower
memory usage for machines handling a lot of sockets (proxies, for example).

For the general case, I believe we have to tune/change
tcp_win_from_space() to take into account the general tendency to get fat
skbs.

sysctl_tcp_adv_win_scale is not fine-grained enough today, and the default
value (2) causes too many collapses. It's also a very complex setting; I am
pretty sure nobody knows how to use it.

With the default (2):
tcp_win_from_space(int space) -> 75% of space

The only other choices in current kernels are to set it to 1 or -1:
tcp_win_from_space(int space) -> 50% of space
or -2:
tcp_win_from_space(int space) -> 25% of space
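
To make the percentages above concrete, here is a minimal user-space sketch
of the arithmetic, assuming the shift-based formula that tcp_win_from_space()
uses in kernels of this era. The names win_from_space() and payload_percent()
are illustrative only, not kernel symbols, and the scale is passed as a
parameter instead of being read from sysctl_tcp_adv_win_scale.

```c
#include <stdio.h>

/*
 * Sketch of the window computation (assumed formula, not kernel code):
 *   scale <= 0: window = space >> -scale          (-1 -> 50%, -2 -> 25%)
 *   scale >  0: window = space - (space >> scale) ( 1 -> 50%,  2 -> 75%)
 * 'space' is expected to be non-negative.
 */
static int win_from_space(int space, int adv_win_scale)
{
	if (adv_win_scale <= 0)
		return space >> (-adv_win_scale);
	return space - (space >> adv_win_scale);
}

/*
 * Integer percentage of an skb's accounted memory (truesize) that is
 * actual payload. Hypothetical helper, for illustration only.
 */
static int payload_percent(int payload, int truesize)
{
	return (payload * 100) / truesize;
}

/* Print the window for each scale discussed in this thread. */
static void print_windows(int space)
{
	static const int scales[] = { 2, 1, -1, -2 };
	unsigned int i;

	for (i = 0; i < sizeof(scales) / sizeof(scales[0]); i++)
		printf("scale %2d -> window %d of %d bytes\n",
		       scales[i], win_from_space(space, scales[i]), space);
}
```

Calling print_windows(87380) shows the 75% / 50% / 50% / 25% split. For
comparison, payload_percent(1500, 4096) evaluates to 36: a ~1500 byte frame
in a 4096 byte skb uses well under half of its accounted memory as payload,
so the default 75% setting advertises more window than such skbs can fill
before sk_rcvbuf is reached.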