[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1427800111.2057.18.camel@sipsolutions.net>
Date: Tue, 31 Mar 2015 13:08:31 +0200
From: Johannes Berg <johannes@...solutions.net>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Michal Kazior <michal.kazior@...to.com>,
Neal Cardwell <ncardwell@...gle.com>,
linux-wireless <linux-wireless@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
Eyal Perry <eyalpe@....mellanox.co.il>
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
To revive an old thread ...
On Mon, 2015-02-09 at 07:11 -0800, Eric Dumazet wrote:
> Ideally the formula would be in TCP something very fast to compute :
>
> amount = (sk->sk_pacing_rate >> 10) + sk->tx_completion_delay_cushion;
> limit = max(2 * skb->truesize, amount);
> limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
>
> So a 'problematic' driver would have to do the math (64 bit maths) like
> this :
>
>
> sk->tx_completion_delay_cushion = ewma_tx_delay * sk->sk_pacing_rate;
The whole notion with "ewma_tx_delay" seems very wrong to me. Measuring
something while also trying to control it (or something closely related)
seems a bit strange, but perhaps you meant to measure something
different than what Michal implemented.
What he implemented was measuring the time it takes from submission to
the hardware queues, but that seems to create a bad feedback cycle:
Allowing it as extra transmit "cushion" will, IIUC, cause TCP to submit
more data to the queues, which will in turn cause the next packets to be
potentially delayed more (since there are still waiting ones) thus
causing a longer delay to be measured, which in turn allows even more
data to be submitted etc.
IOW, while traffic is flowing this will likely cause feedback that
completely removes the point of this, no? Leaving only
sysctl_tcp_limit_output_bytes as the limit (*).
It seems it'd be better to provide a calculated estimate, perhaps based
on current transmit rate and (if available) CCA/TXOP acquisition time.
johannes
(*) Which, btw, isn't all that big given that a maximally sized A-MPDU
is like 1MB containing close to that much actual data! Don't think that
can actually be done at current transmit rates though.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists