Message-ID: <1427800111.2057.18.camel@sipsolutions.net>
Date:	Tue, 31 Mar 2015 13:08:31 +0200
From:	Johannes Berg <johannes@...solutions.net>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Michal Kazior <michal.kazior@...to.com>,
	Neal Cardwell <ncardwell@...gle.com>,
	linux-wireless <linux-wireless@...r.kernel.org>,
	Network Development <netdev@...r.kernel.org>,
	Eyal Perry <eyalpe@....mellanox.co.il>
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`

To revive an old thread ...

On Mon, 2015-02-09 at 07:11 -0800, Eric Dumazet wrote:


> Ideally the formula would be in TCP something very fast to compute :
> 
> amount = (sk->sk_pacing_rate >> 10) + sk->tx_completion_delay_cushion;
> limit = max(2 * skb->truesize, amount);
> limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
> 
> So a 'problematic' driver would have to do the math (64 bit maths) like
> this :
> 
> 
> sk->tx_completion_delay_cushion = ewma_tx_delay * sk->sk_pacing_rate;
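
For concreteness, this is roughly what a driver would end up doing under
that proposal - purely illustrative, obviously: the
tx_completion_delay_cushion field doesn't exist today, and the driver
struct, the skb cb layout and the EWMA weight below are all made up:

#include <linux/ktime.h>
#include <linux/math64.h>
#include <net/sock.h>

/* Illustrative only: keep an EWMA (weight 1/8) of the tx completion
 * delay and turn it into a byte cushion by scaling with the socket's
 * pacing rate, as in the quoted formula.  'struct my_drv',
 * my_skb_cb() and sk->tx_completion_delay_cushion are hypothetical.
 */
static void my_drv_tx_complete(struct my_drv *drv, struct sk_buff *skb)
{
	struct sock *sk = skb->sk;
	u64 delay_us = ktime_to_us(ktime_sub(ktime_get(),
					     my_skb_cb(skb)->enqueue_time));

	/* EWMA: new = old + (sample - old) / 8 */
	drv->ewma_tx_delay_us += ((s64)delay_us -
				  (s64)drv->ewma_tx_delay_us) >> 3;

	/* cushion [bytes] = delay [us] * pacing rate [bytes/s] / 1e6,
	 * done in 64-bit math as suggested above
	 */
	if (sk)
		sk->tx_completion_delay_cushion =
			div_u64((u64)drv->ewma_tx_delay_us *
				sk->sk_pacing_rate, USEC_PER_SEC);
}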

The whole notion of "ewma_tx_delay" seems very wrong to me. Measuring
something while also trying to control it (or something closely related
to it) seems a bit strange, but perhaps you meant to measure something
different from what Michal implemented.

What he implemented was measuring the time from submission to the
hardware queues until tx completion, but that seems to create a bad
feedback cycle: allowing that time as extra transmit "cushion" will,
IIUC, cause TCP to submit more data to the queues, which in turn causes
the next packets to potentially be delayed more (since there are still
packets waiting ahead of them), thus causing a longer delay to be
measured, which in turn allows even more data to be submitted, etc.
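
To put made-up numbers on that loop (assume a ~1 Gbit/s drain rate, so
a ~128 KB pacing-derived amount, and nothing else limiting), a toy
model of the iteration looks like this:

#include <stdio.h>

int main(void)
{
	/* all numbers invented: ~1 Gbit/s drain rate, 128 KB pacing amount */
	const double rate = 125.0;		/* bytes per microsecond */
	const double pacing_amount = 131072.0;	/* bytes */
	double delay_us = 0.0;

	for (int round = 0; round < 5; round++) {
		double cushion = delay_us * rate;	/* delay * pacing rate */
		double limit = pacing_amount + cushion;	/* bytes TCP may queue */

		delay_us = limit / rate;		/* next measured delay */
		printf("round %d: limit %.0f bytes, next delay %.0f us\n",
		       round, limit, delay_us);
	}
	return 0;
}

The limit grows by another pacing-amount every round, so nothing
converges and only the sysctl cap would eventually stop it.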

IOW, while traffic is flowing this will likely create feedback that
completely defeats the point of this, no? That leaves only
sysctl_tcp_limit_output_bytes as the effective limit (*).

It seems it'd be better to provide a calculated estimate, perhaps based
on current transmit rate and (if available) CCA/TXOP acquisition time.
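
Something along these lines, say - again only a sketch with invented
helper names; the point is just that the delay is derived from queued
bytes, the current rate and the medium access time rather than
measured:

#include <linux/ktime.h>
#include <linux/math64.h>

/* Illustrative estimate: how long data handed to us now would take to
 * complete, given what is already queued for the station, the current
 * tx rate and a driver-estimated CCA/TXOP acquisition time.  None of
 * these helpers exist; the names are made up.
 */
static u32 my_estimate_tx_delay_us(u64 queued_bytes,
				   u32 tx_rate_bytes_per_sec,
				   u32 medium_access_us)
{
	u64 drain_us = div_u64(queued_bytes * USEC_PER_SEC,
			       tx_rate_bytes_per_sec);

	return medium_access_us + (u32)drain_us;
}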

johannes

(*) Which, btw, isn't all that big, given that a maximally sized A-MPDU
is about 1 MB and can contain close to that much actual data! I don't
think that can actually be achieved at current transmit rates, though.
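
(Back-of-the-envelope, with assumed numbers: even at a ~1.3 Gbit/s VHT
rate, draining ~1 MB takes on the order of 6-7 ms, more than the
roughly 5.5 ms maximum PPDU airtime, so a full-size A-MPDU indeed can't
be built at such rates.)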

