netdev - Re: [PATCH net v5 1/2] net: sched: tbf: fix calculation of max

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1386245729.30495.167.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Thu, 05 Dec 2013 04:15:29 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Yang Yingliang <yangyingliang@...wei.com>
Cc:	davem@...emloft.net, netdev@...r.kernel.org, brouer@...hat.com,
	jpirko@...hat.com, jbrouer@...hat.com
Subject: Re: [PATCH net v5 1/2] net: sched: tbf: fix calculation of max_size

On Thu, 2013-12-05 at 15:10 +0800, Yang Yingliang wrote:
> Current max_size is caluated from rate table. Now, the rate table
> has been replaced and it's wrong to caculate max_size based on this
> rate table. It can lead wrong calculation of max_size.
> 
> The burst in kernel may be lower than user asked, because burst may gets
> some loss when transform it to buffer(E.g. "burst 40kb rate 30mbit/s")
> and it seems we cannot avoid this loss. Burst's value(max_size) based on
> rate table may be equal user asked. If a packet's length is max_size, this
> packet will be stalled in tbf_dequeue() because its length is above the
> burst in kernel so that it cannot get enough tokens. The max_size guards
> against enqueuing packet sizes above q->buffer "time" in tbf_enqueue().
> 
> To make consistent with the calculation of tokens, this patch add a helper
> psched_ns_t2l() to calculate burst(max_size) directly to fix this problem.
> 
> After this fix, we can support to using 64bit rates to calculate burst as well.
> 
> Suggested-by: Jesper Dangaard Brouer <jbrouer@...hat.com>
> Suggested-by: Eric Dumazet <edumazet@...gle.com>

No, I did not suggest this patch.

> Signed-off-by: Yang Yingliang <yangyingliang@...wei.com>
> ---
>  include/net/sch_generic.h | 46 ++++++++++++++++++++++++++++
>  net/sched/sch_tbf.c       | 76 ++++++++++++++++++++---------------------------
>  2 files changed, 79 insertions(+), 43 deletions(-)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index d0a6321..8da64f3 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -701,6 +701,52 @@ static inline u64 psched_l2t_ns(const struct psched_ratecfg *r,
>  	return ((u64)len * r->mult) >> r->shift;
>  }
>  
> +/* Time to Length, convert time in ns to length in bytes
> + * to determinate how many bytes can be sent in given time.
> + */
> +static inline u64 psched_ns_t2l(const struct psched_ratecfg *r,
> +				u64 time_in_ns)

inline ?

Really ? 

This is management path, there is no point inlining this.

> +{
> +	u64 len = time_in_ns;
> +	u8 shift = r->shift;
> +	bool is_div = false;
> +
> +	/* The formula is :
> +	 * len = (time_in_ns << shift) / mult
> +	 * when time_in_ns does shift, it would overflow.
> +	 * If overflow happens first time, do division.
> +	 * Then do shift. If it happens again,
> +	 * set lenth to ~0ULL.
> +	 */
> +	while (shift) {
> +		if (len & (1ULL << 63)) {
> +			if (!is_div) {
> +				len = div64_u64(len, r->mult);
> +				is_div = true;
> +			} else {
> +				/* overflow happens */
> +				len = ~0ULL;
> +				is_div = true;
> +				break;
> +			}
> +		}
> +		len <<= 1;
> +		shift--;
> +	}
> +	if (!is_div)
> +		len = div64_u64(len, r->mult);

Thats terrible.

Given that the intended formula was :

time_in_ns = (NSEC_PER_SEC * len) / rate_bps;

This translates to following optimal C code

u64 len = time_in_ns * r->rate_bytes_ps;
do_div(len, NSEC_PER_SEC);

Why do you use r->shift and r->mult which are optimized for the reverse
operation in fast path (no divide), I do not know.

> +	max_size = min_t(u64, psched_ns_t2l(&q->rate, q->buffer), ~0U); 
> +
> +	if (qopt->peakrate.rate) {
>  		if (tb[TCA_TBF_PRATE64])
>  			prate64 = nla_get_u64(tb[TCA_TBF_PRATE64]);
> -		psched_ratecfg_precompute(&q->peak, &ptab->rate, prate64);
> +		psched_ratecfg_precompute(&q->peak, &qopt->peakrate, prate64);
> +		if (q->peak.rate_bytes_ps <= q->rate.rate_bytes_ps) {
> +			pr_err("Peakrate must be higher than rate!\n");

Do not add messages like that without rate limiting them.

Plus there is no context, we know nothing.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html