Message-ID: <CANn89iJSsFPBp5dYm3y6Jbbpuwbb9P+X3gmqk6zow0VWgx1Q-A@mail.gmail.com>
Date: Sun, 20 Nov 2022 09:00:54 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Gal Pressman <gal@...dia.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
Maxim Mikityanskiy <maximmi@...dia.com>,
"David S . Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Cong Wang <xiyou.wangcong@...il.com>, eric.dumazet@...il.com,
syzbot <syzkaller@...glegroups.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Tariq Toukan <tariqt@...dia.com>,
Maxim Mikityanskiy <maxtram95@...il.com>
Subject: Re: [PATCH net] net: sched: fix race condition in qdisc_graft()

On Sun, Nov 20, 2022 at 8:43 AM Gal Pressman <gal@...dia.com> wrote:
>
> On 20/11/2022 18:09, Eric Dumazet wrote:
> > On Sat, Nov 19, 2022 at 11:42 PM Gal Pressman <gal@...dia.com> wrote:
> >> On 10/11/2022 11:08, Gal Pressman wrote:
> >>> On 06/11/2022 10:07, Gal Pressman wrote:
> >>>> It reproduces consistently:
> >>>> ip link set dev eth2 up
> >>>> ip addr add 194.237.173.123/16 dev eth2
> >>>> tc qdisc add dev eth2 clsact
> >>>> tc qdisc add dev eth2 root handle 1: htb default 1 offload
> >>>> tc class add dev eth2 classid 1: parent root htb rate 18000mbit ceil 22500.0mbit burst 450000kbit cburst 450000kbit
> >>>> tc class add dev eth2 classid 1:3 parent 1: htb rate 3596mbit burst 89900kbit cburst 89900kbit
> >>>> tc qdisc delete dev eth2 clsact
> >>>> tc qdisc delete dev eth2 root handle 1: htb default 1
> >>>>
> >>>> Please let me know if there's anything else you want me to check.
> >>> Hi Eric, did you get a chance to take a look?
> >> No response for quite a long time, Jakub, should I submit a revert?
> > Sorry, I won't have time to look at this before maybe two weeks.
>
> Thanks for the response, Eric.
>
> > If you want to revert a patch which is correct, because some code
> > assumes something wrong,
>
> I am not convinced about the "code assumes something wrong" part, and
> not sure what are the consequences of this WARN being triggered, are you?
>
> > I will simply say this seems not good.
>
> Arguable, it is not that clear that a fix that introduces another issue
> is a good thing, particularly when we don't understand the severity of
> the thing that got broken.

The offload part was put in place while assuming a certain (clearly wrong) behavior.

RCU rules are pretty much the first thing we need to respect in the kernel.

Simply put, when KASAN detects a bug, you can be pretty damn sure it
is a real one.

>
> Two weeks gets us to the end of -rc7, a bit too dangerous to my personal
> taste, but I'm not the one making the calls.

Agreed, please try to find someone at NVIDIA able to understand what Maxim
was doing in commit ca49bfd90a9dde175d2929dc1544b54841e33804.

If something needs stronger rules than standard RCU ones, this should
be articulated.

As I said, I won't be able to work on this before ~2 weeks.