[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120714145333.2c441166@nehalam.linuxnetplumber.net>
Date: Sat, 14 Jul 2012 14:53:33 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Hagen Paul Pfeifer <hagen@...u.net>,
Mark Gordon <msg@...gle.com>,
Andreas Terzis <aterzis@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [PATCH net-next] netem: refine early skb orphaning
On Sat, 14 Jul 2012 15:16:27 +0200
Eric Dumazet <eric.dumazet@...il.com> wrote:
> From: Eric Dumazet <edumazet@...gle.com>
>
> netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
> or any mechanism relying on socket sk_wmem_alloc feedback.
>
> Ideally, we should perform this orphaning after the rate module and
> before the delay module, to mimic what happens on a real link :
>
> skb orphaning is indeed normally done at TX completion, before the
> transit on the link.
>
> +-------+ +--------+ +---------------+ +-----------------+
> + Qdisc +---> Device +--> TX completion +--> links / hops +->
> + + + xmit + + skb orphaning + + propagation +
> +-------+ +--------+ +---------------+ +-----------------+
> < rate limiting > < delay, drops, reorders >
>
> If netem is used without delay feature (drops, reorders, rate
> limiting), then we should avoid early skb orphaning, to keep pressure
> on sockets as long as packets are still in qdisc queue.
>
> Ideally, netem should be refactored to implement delay module
> as the last stage. Current algorithm merges the two phases
> (rate limiting + delay) so its not correct.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Hagen Paul Pfeifer <hagen@...u.net>
> Cc: Mark Gordon <msg@...gle.com>
> Cc: Andreas Terzis <aterzis@...gle.com>
> Cc: Yuchung Cheng <ycheng@...gle.com>
> ---
> net/sched/sch_netem.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index c412ad0..298c0dd 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
> }
>
> - skb_orphan(skb);
> + /* If a delay is expected, orphan the skb. (orphaning usually takes
> + * place at TX completion time, so _before_ the link transit delay)
> + * Ideally, this orphaning should be done after the rate limiting
> + * module, because this breaks TCP Small Queue, and other mechanisms
> + * based on socket sk_wmem_alloc.
> + */
> + if (q->latency || q->jitter)
> + skb_orphan(skb);
>
> /*
> * If we need to duplicate packet, then re-insert at top of the
>
Acked-by: Stephen Hemminger <shemminger@...tta.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists