Message-ID: <fb7a1324-41c6-4e10-a6a3-f16d96f44f65@redhat.com>
Date: Thu, 9 Jan 2025 13:11:34 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
Toke Høiland-Jørgensen <toke@...e.dk>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>
Cc: syzbot+f63600d288bfb7057424@...kaller.appspotmail.com,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>,
cake@...ts.bufferbloat.net, netdev@...r.kernel.org
Subject: Re: [PATCH net v2] sched: sch_cake: add bounds checks to host bulk
flow fairness counts
On 1/7/25 1:01 PM, Toke Høiland-Jørgensen wrote:
> Even though we fixed a logic error in the commit cited below, syzbot
> still managed to trigger an underflow of the per-host bulk flow
> counters, leading to an out of bounds memory access.
>
> To avoid any such logic errors causing out of bounds memory accesses,
> this commit factors out all accesses to the per-host bulk flow counters
> to a series of helpers that perform bounds-checking before any
> increments and decrements. This also has the benefit of improving
> readability by moving the conditional checks for the flow mode into
> these helpers, instead of having them spread out throughout the
> code (which was the cause of the original logic error).
>
> v2:
> - Remove now-unused srchost and dsthost local variables in cake_dequeue()
Small nit: the changelog should come after the '---' separator. No need
to repost just for this.
> Fixes: 546ea84d07e3 ("sched: sch_cake: fix bulk flow accounting logic for host fairness")
> Reported-by: syzbot+f63600d288bfb7057424@...kaller.appspotmail.com
> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
> ---
> net/sched/sch_cake.c | 140 +++++++++++++++++++++++--------------------
> 1 file changed, 75 insertions(+), 65 deletions(-)
>
> diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
> index 8d8b2db4653c..2c2e2a67f3b2 100644
> --- a/net/sched/sch_cake.c
> +++ b/net/sched/sch_cake.c
> @@ -627,6 +627,63 @@ static bool cake_ddst(int flow_mode)
> return (flow_mode & CAKE_FLOW_DUAL_DST) == CAKE_FLOW_DUAL_DST;
> }
>
> +static void cake_dec_srchost_bulk_flow_count(struct cake_tin_data *q,
> + struct cake_flow *flow,
> + int flow_mode)
> +{
> + if (likely(cake_dsrc(flow_mode) &&
> + q->hosts[flow->srchost].srchost_bulk_flow_count))
> + q->hosts[flow->srchost].srchost_bulk_flow_count--;
> +}
> +
> +static void cake_inc_srchost_bulk_flow_count(struct cake_tin_data *q,
> + struct cake_flow *flow,
> + int flow_mode)
> +{
> + if (likely(cake_dsrc(flow_mode) &&
> + q->hosts[flow->srchost].srchost_bulk_flow_count < CAKE_QUEUES))
> + q->hosts[flow->srchost].srchost_bulk_flow_count++;
> +}
> +
> +static void cake_dec_dsthost_bulk_flow_count(struct cake_tin_data *q,
> + struct cake_flow *flow,
> + int flow_mode)
> +{
> + if (likely(cake_ddst(flow_mode) &&
> + q->hosts[flow->dsthost].dsthost_bulk_flow_count))
> + q->hosts[flow->dsthost].dsthost_bulk_flow_count--;
> +}
> +
> +static void cake_inc_dsthost_bulk_flow_count(struct cake_tin_data *q,
> + struct cake_flow *flow,
> + int flow_mode)
> +{
> + if (likely(cake_ddst(flow_mode) &&
> + q->hosts[flow->dsthost].dsthost_bulk_flow_count < CAKE_QUEUES))
> + q->hosts[flow->dsthost].dsthost_bulk_flow_count++;
> +}
> +
> +static u16 cake_get_flow_quantum(struct cake_tin_data *q,
> + struct cake_flow *flow,
> + int flow_mode)
> +{
> + u16 host_load = 1;
> +
> + if (cake_dsrc(flow_mode))
> + host_load = max(host_load,
> + q->hosts[flow->srchost].srchost_bulk_flow_count);
> +
> + if (cake_ddst(flow_mode))
> + host_load = max(host_load,
> + q->hosts[flow->dsthost].dsthost_bulk_flow_count);
> +
> + /* The get_random_u16() is a way to apply dithering to avoid
> + * accumulating roundoff errors
> + */
> + return (q->flow_quantum * quantum_div[host_load] +
> + get_random_u16()) >> 16;
Dithering is now applied on both enqueue and dequeue, while prior to
this patch it only happened on dequeue. Is that intentional? Couldn't it
lead to a (small) flow_deficit increase?
Thanks!
Paolo