Message-ID: <CANn89iKBUJ6p56+3TRNB5JAn0bmuRPDWLeOwGmvLh5yjwnDasA@mail.gmail.com>
Date: Fri, 22 Nov 2024 17:44:33 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, pabeni@...hat.com,
jhs@...atatu.com, xiyou.wangcong@...il.com, jiri@...nulli.us
Subject: Re: [PATCH net] net_sched: sch_fq: don't follow the fast path if Tx
is behind now
On Fri, Nov 22, 2024 at 5:21 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Recent kernels cause a lot of TCP retransmissions
>
> [ ID] Interval Transfer Bitrate Retr Cwnd
> [ 5] 0.00-1.00 sec 2.24 GBytes 19.2 Gbits/sec 2767 442 KBytes
> [ 5] 1.00-2.00 sec 2.23 GBytes 19.1 Gbits/sec 2312 350 KBytes
> ^^^^
>
> Replacing the qdisc with pfifo makes them go away. It appears that
> a flow may get throttled with a very near unthrottle time.
> Later we may get busy processing Rx and the unthrottling time will
> pass, but we won't service Tx since the core is busy with Rx.
> If Rx sees an ACK and we try to push more data for the throttled flow
> we may fastpath the skb, not realizing that there are already "ready
> to send" packets for this flow sitting in the qdisc.
> At least this is my theory on what happens.
>
> Don't trust the fastpath if we are "behind" according to the projected
> unthrottle time for some flow waiting in the Qdisc.
>
> Qdisc config:
>
> qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
> buckets 32768 orphan_mask 1023 bands 3 \
> priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
> weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
> low_rate_threshold 550Kbit \
> refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
>
> For iperf this change seems to do fine, the reordering is gone.
> The fastpath still gets used most of the time:
>
> gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
> xx_behind 2731
>
> where "xx_behind" counts how many times we hit the new return false.
>
> Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> CC: jhs@...atatu.com
> CC: xiyou.wangcong@...il.com
> CC: jiri@...nulli.us
> ---
> net/sched/sch_fq.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
> index 19a49af5a9e5..3d932b262159 100644
> --- a/net/sched/sch_fq.c
> +++ b/net/sched/sch_fq.c
> @@ -331,6 +331,12 @@ static bool fq_fastpath_check(const struct Qdisc *sch, struct sk_buff *skb,
> */
> if (q->internal.qlen >= 8)
> return false;
> +
> + /* Ordering invariants fall apart if some throttled flows
> + * are ready but we haven't serviced them, yet.
> + */
> + if (q->throttled_flows && q->time_next_delayed_flow <= now)
> + return false;
> }
Interesting... I guess we could also call fq_check_throttled() to
refresh the qdisc state and get a more accurate view?
But perhaps your patch is simpler. I guess it could be reduced to:

    if (q->time_next_delayed_flow <= now + q->offload_horizon)
        return false;

(Note the + q->offload_horizon.)
I do not think testing q->throttled_flows is strictly needed:
if it is 0, then q->time_next_delayed_flow is set to ~0ULL.