[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YS38YB9JTSHeYgJG@dcaratti.users.ipa.redhat.com>
Date: Tue, 31 Aug 2021 11:54:40 +0200
From: Davide Caratti <dcaratti@...hat.com>
To: Linux Kernel Network Developers <netdev@...r.kernel.org>
Cc: xiyou.wangcong@...il.com, davem@...emloft.net, jhs@...atatu.com,
jiri@...nulli.us, kuba@...nel.org, liuhangbin@...il.com,
petrm@...lanox.com
Subject: Re: [PATH net] net/sched: ets: fix crash when flipping from 'strict'
to 'quantum'
hello Cong, thanks a lot for looking at this!
On Mon, Aug 30, 2021 at 05:43:09PM -0700, Cong Wang wrote:
> On Tue, Aug 24, 2021 at 3:34 PM Davide Caratti <dcaratti@...hat.com> wrote:
> > When the change() function decreases the value of 'nstrict', we must take
> > into account that packets might be already enqueued on a class that flips
> > from 'strict' to 'quantum': otherwise that class will not be added to the
> > bandwidth-sharing list. Then, a call to ets_qdisc_reset() will attempt to
> > do list_del(&alist) with 'alist' filled with zero, hence the NULL pointer
> > dereference.
>
> I am confused about how we end up having NULL in list head.
>
> From your changelog, you imply it happens when we change an existing
> Qdisc, but I don't see any place that could set this list head to NULL.
> list_del() clearly doesn't set NULL.
right, the problem happens when we try to *decrease* the value of 'nstrict'
while traffic is being loaded.
ETS classes from 0 to nstrict - 1 don't initialize this list at all, nor they
use it. The initialization happens when the first packet is enqueued to one of
the bandwidth-sharing classes (from nstrict to nbands - 1), see [1]).
However, if we modify the value of 'nstrict' while q->classes[i].qdisc->q.len is
greater than zero, the list initialization is probably not going to happen, so
the struct
q->classes[i].alist
remains filled with zero even since the first allocation, like you mentioned below:
> But if it is a new Qdisc, Qdisc is allocated with zero's hence having NULL
> as list head. However, in this case, q->nstrict should be 0 before the loop,
^^ this. But you are wrong, q->nstrict can be any value from 0 to 16 (inclusive)
before the loop. As the value of 'nstrict' is reduced (e.g. from 4 to 0), the code
can reach the loop in ets_qdisc_change() with the following 4 conditions
simultaneously true:
1) nstrict = 0
2) q->nstrict = 4
3) q->classes[2].qdisc->q.qlen greater than zero
4) q->classes[2].alist filled with 0
then, the value of q->nstrict is updated to 0. After that, ets_qdisc_reset() can be
called on unpatched Linux with the following 3 conditions simultaneously true:
a) q->nstrict = 0
b) q->classes[2].qdisc->q.qlen greater than zero
c) q->classes[2].alist filled with 0
that leads to the reported crash.
> so I don't think your code helps at all as the for loop can't even be entered?
The first code I tried was just doing INIT_LIST_HEAD(&q->classes[i].alist), with
i ranging from nstrict to q->nstrict - 1: it fixed the crash in ets_qdisc_reset().
But then I observed that classes changing from strict to quantum were not sharing
any bandwidth at all, and they had a non-empty backlog even after I stopped all
the traffic: so, I added the
if (q->classes[i].qdisc->q.qlen) {
list_add_tail(&q->classes[i].alist, &q->active);
q->classes[i].deficit = quanta[i];
}
part, that updates the DRR list with non-empty classes that are changing from
strict to DRR. After that, I verified that all classes were depleted correctly.
On a second thought, this INIT_LIST_HEAD(&q->classes[i].alist) line is no more
useful. If q->classes[i].qdisc->q.qlen is 0, either it remains 0 until the call
to ets_qdisc_reset(), or some packet is enqueued after q->nstrict is updated,
and the enqueueing of a 'first' packet [1] will fix the value of
q->classes[i].alist.
Finally, if q->classes[i].qdisc->q.qlen is bigger than zero, the list_add_tail()
part of my patch covers the missing update of the DRR list. In all these cases,
the NULL dereference in ets_qdisc_reset() is prevented.
So, I can probably send a patch (for net-next, when it reopens) that removes
this INIT_LIST_HEAD() line; anyway, its presence is harmless IMO. WDYT?
>
> Thanks.
BTW, please find below a reproducer that's more efficient than kselftests in
seeing the problem:
1 #!/bin/bash
2 ip link del dev ddd0 >/dev/null 2>&1
3 ip link add dev ddd0 type dummy
4 ip link set dev ddd0 up
5 tc qdisc add dev ddd0 handle 1: root tbf rate 100kbit latency 1000ms burst 1Mbit
6 tc qdisc add dev ddd0 handle 10: parent 1: ets bands 4 strict 4 priomap 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
7 tc qdisc show dev ddd0
8 mausezahn ddd0 -A 10.10.10.1 -B 10.10.10.2 -c 0 -a own -b 00:c1:a0:c1:a0:00 -t udp &
9 mpid=$!
10 sleep 5
11 tc qdisc change dev ddd0 handle 10: ets bands 8 quanta 2500 2500 2500 2500 2500 2500 2500 2500 priomap 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
12 tc qdisc del dev ddd0 root
13 sleep 1
14 tc qdisc show dev ddd0
15 kill $mpid
16 rmmod sch_ets
Line 11 changes the ets qdisc from 4 prio to 8 DRR, so the value of q->nstrict
changes from 4 to 0. The crash in ets_qdisc_reset() happens with band number #2
(that's the only class being loaded with traffic during the test).
thanks!
--
davide
[1] https://elixir.bootlin.com/linux/v5.14/source/net/sched/sch_ets.c#L445
Powered by blists - more mailing lists