Date:   Fri, 8 May 2020 15:59:22 +0200
From:   Václav Zindulka <vaclav.zindulka@...pnet.cz>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: iproute2: tc deletion freezes whole server

On Thu, May 7, 2020 at 8:52 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
>
> On Tue, May 5, 2020 at 1:46 AM Václav Zindulka
> <vaclav.zindulka@...pnet.cz> wrote:
> >
> > On Mon, May 4, 2020 at 7:46 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> > >
> > > Sorry for the delay. I lost connection to my dev machine, I am trying
> > > to setup this on my own laptop.
> >
> > Sorry to hear that. I will gladly give you access to my testing
> > machine where all this nasty stuff happens every time so you can test
> > it in place. You can try everything there and have online results. I
> > can give you access even to the IPMI console so you can switch the
> > kernel during boot easily. I didn't notice this problem until the time
> > of deployment. My prior testing machines had copper Ethernet
> > ports only, so I didn't know about these problems earlier.
>
> Thanks for the offer! No worries, I set up a testing VM on my laptop.

OK

> > >
> > > I tried to emulate your test case in my VM, here is the script I use:
> > >
> > > ====
> > > ip li set dev dummy0 up
> > > tc qd add dev dummy0 root handle 1: htb default 1
> > > for i in `seq 1 1000`
> > > do
> > >   tc class add dev dummy0 parent 1:0 classid 1:$i htb rate 1mbit ceil 1.5mbit
> > >   tc qd add dev dummy0 parent 1:$i fq_codel
> > > done
> > >
> > > time tc qd del dev dummy0 root
> > > ====
> > >
> > > And this is the result:
> > >
> > >     Before my patch:
> > >      real   0m0.488s
> > >      user   0m0.000s
> > >      sys    0m0.325s
> > >
> > >     After my patch:
> > >      real   0m0.180s
> > >      user   0m0.000s
> > >      sys    0m0.132s
> >
> > My results with your test script.
> >
> > before patch:
> > /usr/bin/time -p tc qdisc del dev enp1s0f0 root
> > real 1.63
> > user 0.00
> > sys 1.63
> >
> > after patch:
> > /usr/bin/time -p tc qdisc del dev enp1s0f0 root
> > real 1.55
> > user 0.00
> > sys 1.54
> >
> > > This is an obvious improvement, so I have no idea why you didn't
> > > catch any difference.
> >
> > We use hfsc instead of htb. I don't know whether it may cause any
> > difference. I can provide you with my test scripts if necessary.
>
> Yeah, you can try to replace the htb with hfsc in my script,
> I didn't spend time to figure out hfsc parameters.

tc class add dev dummy0 parent 1:0 classid 1:$i hfsc ls m1 0 d 0 m2 13107200 ul m1 0 d 0 m2 13107200

but it behaves the same as htb...
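
In case it helps you reproduce it on your side, here is your script
with the qdisc and class lines switched to hfsc; the `default 1` and
the 13107200 curves are just values carried over from my setup,
nothing special about them:

====
ip li set dev dummy0 up
tc qd add dev dummy0 root handle 1: hfsc default 1
for i in `seq 1 1000`
do
  tc class add dev dummy0 parent 1:0 classid 1:$i hfsc ls m1 0 d 0 m2 13107200 ul m1 0 d 0 m2 13107200
  tc qd add dev dummy0 parent 1:$i fq_codel
done

time tc qd del dev dummy0 root
====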

> My point here is, if I can see the difference with merely 1000
> tc classes, you should see a bigger difference with hundreds
> of thousands classes in your setup. So, I don't know why you
> saw a relatively smaller difference.

I did see a relatively big difference. It was about 1.5s faster on my
huge setup, which is a lot. Yet maybe the problem is caused by
something else? I thought about tx/rx queues. RJ45 ports have up to 4
tx and 4 rx queues, while SFP+ interfaces have much higher limits: 8
or even 64 possible queues. I tried changing the number of queues
with ethtool (commands below), increasing from 4 to 8 and decreasing
to 2, but there was no difference: about 1.62 - 1.63s with an
unpatched kernel and about 1.55 - 1.58s with your patches applied. I
also tried it on ifb and RJ45 interfaces, where it took about 0.02 -
0.03s with an unpatched kernel and 0.05s with your patches applied,
which is strange, but may be caused by the fact that it was very fast
even before.
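
This is how I changed the queue counts (assuming the NIC driver
exposes them as combined channels; some drivers split rx/tx instead):

====
# show supported and current channel counts
ethtool -l enp1s0f0
# raise the combined queue count to 8, then drop it to 2
ethtool -L enp1s0f0 combined 8
ethtool -L enp1s0f0 combined 2
====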

I have commits c71c00df335f6aff00d3dc7f28e06dc8abc088a7,
13a5aec17cc65f6aa5c3bc470f504650bd465a69,
720cc6b0d12fb7c8a494e441ebd360c62023dad2 and
51287a4bc6f2addd4a8c1919829aab3bb7c706c9 from
https://github.com/congwang/linux/commits/qdisc_reset applied on a
5.4.6 kernel. I can apply them to the newest one if that could have
any impact. I hope I applied the right patches and haven't missed any
older commits.
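
In case I picked them up differently than you intended, this is
roughly how I applied them (the remote name is arbitrary, and the
cherry-pick order may need adjusting to match the branch order):

====
# inside a v5.4.6 kernel tree
git remote add congwang https://github.com/congwang/linux.git
git fetch congwang qdisc_reset
git cherry-pick c71c00df335f 13a5aec17cc6 720cc6b0d12f 51287a4bc6f2
====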

I even tried compiling the kernel from your repository (qdisc_reset
branch). Times are a little lower than with the patched 5.4.6: 1.52 -
1.53s. Yet I still can't get the big improvement you saw.

Thank you.
