Message-ID: <CANn89iKCzrZgW1jKLDmkRKpMnK3upw0whRAcqdtF5f07D2i7HQ@mail.gmail.com>
Date: Sun, 24 Nov 2024 18:48:18 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, pabeni@...hat.com, 
	stable@...r.kernel.org, jhs@...atatu.com, xiyou.wangcong@...il.com, 
	jiri@...nulli.us
Subject: Re: [PATCH net v2] net_sched: sch_fq: don't follow the fast path if
 Tx is behind now

On Sun, Nov 24, 2024 at 3:21 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Recent kernels cause a lot of TCP retransmissions:
>
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  2.24 GBytes  19.2 Gbits/sec  2767    442 KBytes
> [  5]   1.00-2.00   sec  2.23 GBytes  19.1 Gbits/sec  2312    350 KBytes
>                                                       ^^^^
>
> Replacing the qdisc with pfifo makes retransmissions go away.
>
> It appears that a flow may have a delayed packet with a very near
> Tx time. Later, we may get busy processing Rx and the target Tx time
> will pass, but we won't service Tx since the CPU is busy with Rx.
> If Rx sees an ACK and we try to push more data for the delayed flow
> we may fastpath the skb, not realizing that there are already "ready
> to send" packets for this flow sitting in the qdisc.
>
> Don't trust the fastpath if we are "behind" according to the projected
> Tx time for next flow waiting in the Qdisc. Because we consider anything
> within the offload window to be okay for fastpath we must consider
> the entire offload window as "now".
>
> Qdisc config:
>
> qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
>   buckets 32768 orphan_mask 1023 bands 3 \
>   priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
>   weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
>   low_rate_threshold 550Kbit \
>   refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
>
> For iperf this change seems to do fine, the reordering is gone.
> The fastpath still gets used most of the time:
>
>   gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
>    xx_behind 2731
>
> where "xx_behind" counts how many times we hit the new "return false".
>
> CC: stable@...r.kernel.org
> Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>

Reviewed-by: Eric Dumazet <edumazet@...gle.com>

Thanks !
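
For readers following the thread, the check described in the quoted commit message can be sketched in user space roughly as below. This is a simplified model under stated assumptions, not the patch itself: `struct fq_model` and `fq_model_fastpath_ok()` are hypothetical names, the fields only loosely mirror what the message describes (the projected Tx time of the next waiting flow and the offload window), and all times are plain nanosecond counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the qdisc state (not the kernel struct). */
struct fq_model {
	unsigned int qlen;               /* packets currently held in the qdisc */
	unsigned int flows;              /* all known flows */
	unsigned int inactive_flows;     /* flows with nothing queued */
	unsigned int throttled_flows;    /* flows waiting for their Tx time */
	uint64_t time_next_delayed_flow; /* earliest projected Tx time, in ns */
	uint64_t offload_horizon;        /* offload window, in ns */
};

/* Return true if a new packet may bypass the queue (fast path). */
static bool fq_model_fastpath_ok(const struct fq_model *q, uint64_t now)
{
	/* Nothing queued: the fast path cannot reorder anything. */
	if (q->qlen == 0)
		return true;

	/* Some flow is neither idle nor throttled, so it is ready to send. */
	if (q->flows != q->inactive_flows + q->throttled_flows)
		return false;

	/*
	 * We are "behind": a delayed flow's Tx time falls at or before
	 * now plus the offload window, so its packets are effectively
	 * ready and a fast-pathed skb could overtake them.
	 */
	if (q->time_next_delayed_flow <= now + q->offload_horizon)
		return false;

	return true;
}
```

In this model, treating the whole offload window as "now" means the comparison uses `now + offload_horizon` rather than `now` alone, matching the reasoning in the commit message that anything inside the window is already considered acceptable to send.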
