[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <66d03368-9b8e-b953-a3a5-1f61b71e6307@mojatatu.com>
Date: Sun, 3 May 2020 08:02:45 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Jiri Pirko <jiri@...nulli.us>
Subject: Re: [Patch net v2] net_sched: fix tcm_parent in tc filter dump
On 2020-05-02 10:28 p.m., Cong Wang wrote:
> On Sat, May 2, 2020 at 2:19 AM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>>
>> On 2020-05-02 4:48 a.m., Jamal Hadi Salim wrote:
[..]
>>> Note:
>>> tc filter show dev dummy0 root
>>> should not show that filter. OTOH,
>>> tc filter show dev dummy0 parent ffff:
>>> should.
>
> Hmm, but we use TC_H_MAJ(tcm->tcm_parent) to do the
> lookup, 'root' (ffff:ffff) has the same MAJ with ingress
> (ffff:0000).
>
I have some long analysis and theory below.
> And qdisc_lookup() started to search for ingress since 2008,
> commit 8123b421e8ed944671d7241323ed3198cccb4041.
>
> So it is likely too late to change this behavior even if it is not
> what we prefer.
>
My gut feeling is that whatever broke (likely during block addition
maybe even earlier during clsact addition) is in the code
path for adding filter. Dumping may have bugs but i would
point a finger to filter addition first.
More below.... (sorry long email).
Here's what i tried after applying your patch:
----
# $TC qd add dev $DEV ingress
# $TC qd add dev $DEV root prio
# $TC qd ls dev $DEV
qdisc noqueue 0: dev lo root refcnt 2
qdisc prio 8008: dev enp0s1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0
0 1 1 1 1 1 1 1 1
qdisc ingress ffff: dev enp0s1 parent ffff:fff1 ----------------
-----
egress i.e root is at 8008:
ingress is at ffff:fff1
If say:
---
# $TC filter add dev $DEV root protocol arp prio 10 basic action pass
----
i am instructing the kernel to "go and find root (which is 8008:)
and install the filter there".
IOW, I could install that filter alternatively as:
----
# $TC filter add dev $DEV parent 8008: protocol arp prio 11 basic action
pass
---
Basically these two filters are equivalent and should end in the
same qdisc.
To test, I added those two filters (the prio is useful to visualize
in the dump).
Lets see the dump:
-------
# $TC filter show dev $DEV root
filter protocol arp pref 10 basic chain 0
filter protocol arp pref 10 basic chain 0 handle 0x1
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
---------
I was hoping i'd see both filters, but alas there's only
one.
Lets try a different dump explicitly specifying root qdisc id:
---------
# $TC filter show dev $DEV parent 8008:
filter protocol arp pref 11 basic chain 0
filter protocol arp pref 11 basic chain 0 handle 0x1
action order 1: gact action pass
random type none pass val 0
index 2 ref 1 bind 1
---------
Again i was hoping to see both filters.
This sounds like the two filters are anchored
at two different qdiscs instead of the same one
(i.e root). Hence my suspicion...
Lets add a filter at ingress:
-----
$TC filter add dev $DEV parent ffff:fff1 protocol arp basic action pass
-------
Ok lets dump this from ingress:
-----
# $TC filter show dev $DEV parent ffff:fff1
filter protocol arp pref 10 basic chain 0
filter protocol arp pref 10 basic chain 0 handle 0x1
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
filter protocol arp pref 49152 basic chain 0
filter protocol arp pref 49152 basic chain 0 handle 0x1
action order 1: gact action pass
random type none pass val 0
index 3 ref 1 bind 1
-------------
Same result if i said "root".
I was only expecting to see the one with pref 49152 in
the above output.
It _feels_ like those two filters(ingress and egress) are
installed in the same struct.
Ok, last dump without specifying a parent, which should
pick root qdisc per code:
------
# $TC filter show dev $DEV
filter parent 8008: protocol arp pref 11 basic chain 0
filter parent 8008: protocol arp pref 11 basic chain 0 handle 0x1
action order 1: gact action pass
random type none pass val 0
index 2 ref 1 bind 1
----
Semantically this should have dumped all 3 filters and
not just pick root. But that issue has been there
for a decade like you said. So it is reasonable to specify
parent ffff:fff1 for dumping just ingress side.
Also reasonable not to see ingress when dumping root.
> If parent is not specified, only egress will be shown, as
> we just assign q = dev->qdisc.
>
Which is the root egress qdisc.
That is the "$TC filter show dev $DEV" scenario.
See my comment above.
> I agree, 'root' should mean the root qdisc on egress, matching
> ingress with 'root' doesn't make much sense to me either.
>
> But I am afraid it is too late to change ,if this behavior has been
> there for 12+ years.
>
Although i cant pinpoint exactly when - this used to work (I dont think
its 12+ years but I could be wrong). These semantics are really broken.
Do you have time to look at the theory that things break at install?
If you dont have time i could try to debug it by Tuesday.
cheers,
jamal
Powered by blists - more mailing lists