[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a369ed0b-943e-8de5-10f3-3f820a3f8225@nvidia.com>
Date: Mon, 21 Nov 2022 10:08:34 +0200
From: Gal Pressman <gal@...dia.com>
To: Максим <maxtram95@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Maxim Mikityanskiy <maximmi@...dia.com>,
"David S . Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Cong Wang <xiyou.wangcong@...il.com>, eric.dumazet@...il.com,
syzbot <syzkaller@...glegroups.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net] net: sched: fix race condition in qdisc_graft()
On 20/11/2022 23:39, Максим wrote:
> On Sun, 20 Nov 2022 at 19:01, Eric Dumazet <edumazet@...gle.com> wrote:
>> On Sun, Nov 20, 2022 at 8:43 AM Gal Pressman <gal@...dia.com> wrote:
>>> On 20/11/2022 18:09, Eric Dumazet wrote:
>>>> On Sat, Nov 19, 2022 at 11:42 PM Gal Pressman <gal@...dia.com> wrote:
>>>>> On 10/11/2022 11:08, Gal Pressman wrote:
>>>>>> On 06/11/2022 10:07, Gal Pressman wrote:
>>>>>>> It reproduces consistently:
>>>>>>> ip link set dev eth2 up
>>>>>>> ip addr add 194.237.173.123/16 dev eth2
>>>>>>> tc qdisc add dev eth2 clsact
>>>>>>> tc qdisc add dev eth2 root handle 1: htb default 1 offload
>>>>>>> tc class add dev eth2 classid 1: parent root htb rate 18000mbit ceil
>>>>>>> 22500.0mbit burst 450000kbit cburst 450000kbit
>>>>>>> tc class add dev eth2 classid 1:3 parent 1: htb rate 3596mbit burst
>>>>>>> 89900kbit cburst 89900kbit
>>>>>>> tc qdisc delete dev eth2 clsact
>>>>>>> tc qdisc delete dev eth2 root handle 1: htb default 1
>>>>>>>
>>>>>>> Please let me know if there's anything else you want me to check.
>>>>>> Hi Eric, did you get a chance to take a look?
>>>>> No response for quite a long time, Jakub, should I submit a revert?
>>>> Sorry, I won't have time to look at this before maybe two weeks.
>>> Thanks for the response, Eric.
>>>
>>>> If you want to revert a patch which is correct, because some code
>>>> assumes something wrong,
>>> I am not convinced about the "code assumes something wrong" part, and
>>> not sure what are the consequences of this WARN being triggered, are you?
>>>
>>>> I will simply say this seems not good.
>>> Arguable, it is not that clear that a fix that introduces another issue
>>> is a good thing, particularly when we don't understand the severity of
>>> the thing that got broken.
>> The offload part has been put while assuming a certain (clearly wrong) behavior.
>>
>> RCU rules are quite the first thing we need to respect in the kernel.
>>
>> Simply put, when KASAN detects a bug, you can be pretty damn sure it
>> is a real one.
>>
>>> Two weeks gets us to the end of -rc7, a bit too dangerous to my personal
>>> taste, but I'm not the one making the calls.
>> Agreed, please try to find someone at nvidia able to understand what Maxim
>> was doing in commit ca49bfd90a9dde175d2929dc1544b54841e33804
> The check for TCQ_F_BUILTIN basically means checking for noop_qdisc.
> As the comment above the WARN_ON says, my code expects that before the
> root HTB qdisc is destroyed (the notify_and_destroy line from Eric's
> patch, in qdisc_graft), the kernel assigns &noop_qdisc to all of
> dev_queue->qdisc-s (i.e. qdiscs of HTB classes). IIRC, it should
> happen in dev_deactivate in qdisc_graft (and if IFF_UP is unset at
> this point, that means dev_deactivate has been called before, from
> __dev_close_many). That said, I don't see how Eric's patch could
> affect this logic. What has changed in Eric's patch is that the old
> root qdisc (HTB) is now destroyed after assigning the new qdisc to
> dev->qdisc, but it has nothing to do with dev_queue->qdisc-s.
>
> Gal, are you sure the WARN_ON only happens after Eric's patch? If yes,
> could you do some tracing and find out whether all the qdiscs of the
> netdev queues are noop_qdisc after the dev_deactivate line in
> qdisc_graft — before and after Eric's patch?
>
> for (i = 0; i < num_q; i++) {
> struct netdev_queue *dev_queue = netdev_get_tx_queue(dev, i);
> if (dev_queue->qdisc == &noop_qdisc)
> ...
> }
Hey Maxim, thanks for taking a look into this.
Yes, the WARN_ON is not triggered before Eric's patch.
The behavior of the code you mentioned is the same before and after his
patch though.
In qdisc_graft():
* Before dev_deactivate(), qdiscs 0-5 are != noop_qdisc, qdiscs 6-303
are == noop_qdisc
* After dev_deactivate(), all qdiscs are == noop_qdisc.
Powered by blists - more mailing lists