[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <0BC2C22C-F9AA-4B13-905D-FE32F41BDA8A@flyingcircus.io>
Date: Sat, 7 Oct 2023 06:10:42 +0200
From: Christian Theune <ct@...ingcircus.io>
To: Pedro Tammela <pctammela@...atatu.com>
Cc: stable@...r.kernel.org,
netdev@...r.kernel.org,
Linux regressions mailing list <regressions@...ts.linux.dev>,
davem@...emloft.net,
edumazet@...gle.com,
kuba@...nel.org,
pabeni@...hat.com
Subject: Re: [REGRESSION] Userland interface breaks due to hard HFSC_FSC
requirement
Hi,
> On 7. Oct 2023, at 00:12, Pedro Tammela <pctammela@...atatu.com> wrote:
>
> On 06/10/2023 05:37, Christian Theune wrote:
>> Hi,
>> (prefix, I was not aware of the regression reporting process and incorrectly reported this informally with the developers mentioned in the change)
>> I upgraded from 6.1.38 to 6.1.55 this morning and it broke my traffic shaping script, leaving me with a non-functional uplink on a remote router.
>> The script errors out like this:
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + ext=ispA
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + ext_ingress=ifb0
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + modprobe ifb
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + modprobe act_mirred
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc del dev ispA root
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2061]: Error: Cannot delete qdisc with handle of zero.
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + true
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc del dev ispA ingress
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2064]: Error: Cannot find specified qdisc on specified device.
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + true
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc del dev ifb0 root
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2066]: Error: Cannot delete qdisc with handle of zero.
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + true
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc del dev ifb0 ingress
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2067]: Error: Cannot find specified qdisc on specified device.
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + true
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc add dev ispA handle ffff: ingress
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + ifconfig ifb0 up
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc filter add dev ispA parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev ifb0
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc qdisc add dev ifb0 root handle 1: hfsc default 1
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc class add dev ifb0 parent 1: classid 1:999 hfsc rt m2 2.5gbit
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2053]: + tc class add dev ifb0 parent 1:999 classid 1:1 hfsc sc rate 50mbit
>> Oct 06 05:49:22 wendy00 isp-setup-shaping-start[2077]: Error: Invalid parent - parent class must have FSC.
>> The error message is also a bit weird (but that’s likely due to iproute2 being weird) as the CLI interface for `tc` and the error message do not map well. (I think I would have to choose `hfsc sc` on the parent to enable the FSC option which isn’t mentioned anywhere in the hfsc manpage).
>> The breaking change was introduced in 6.1.53[1] and a multitude of other currently supported kernels:
>
> Hi,
>
> Your script is actually incorrect.
> `man 7 tc-hfsc` goes in depth into why, but I just wanna highlight this section:
> SEPARATE LS / RT SCs
> Another difference from the original HFSC paper is that RT and LS SCs can be specified separately. Moreover, leaf classes are
> allowed to have only either RT SC or LS SC. For interior classes, only LS SCs make sense: any RT SC will be ignored.
>
> The last part ("For interior classes...") was what the referenced patch fixed. We were mistakenly allowing RTs into "interior classes" which the implementation never accounted for and this was a source of crashes. I'm surprised you were lucky enough to never crash the kernel ;)
> -=
> I believe the script could be updated to the following and still achieve the same results:
> tc class add dev ifb0 parent 1: classid 1:999 hfsc ls m2 2.5gbit
> tc class add dev ifb0 parent 1:999 classid 1:1 hfsc rt rate 50mbit
I’m absolutely with you on this point and I’m very sure that I may have gotten the configuration wrong. I was already diving into fixing the script. Nevertheless, this end of the system is definitely not my strongest discipline and I tried get it right when setting this up initially scrambling through the man pages and the complexities of this topic. Unfortunately trying to learn this stuff is really hard and if the system finally accepts your config and does what you want and doesn’t complain in any way, then it’s hard to try and track down whether anything else could have been missed - somewhat a case of the halting problem. ;)
Point in case: I didn’t even notice tc-hfsc(7) was there and that I was only reading tc-hfsc(8).
Nevertheless, I’m curious and registered this as a regression, because it seems to fit the description and broke a running system. Looking at the change I’m wondering whether the kernel stability rule implies that the change has to be more forgiving towards user land, even if the user(land) might be wrong?
The idea of not bricking your system by upgrading the Linux kernel seems to apply here. IMHO the change could maybe done in a way that keeps the system running but informs the user that something isn’t working as intended?
Liebe Grüße,
Christian Theune
--
Christian Theune · ct@...ingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
Powered by blists - more mailing lists