[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <828b8af7-8b4e-4820-bf78-41a8b7af16ce@mojatatu.com>
Date: Fri, 5 Jan 2024 08:51:04 -0300
From: Pedro Tammela <pctammela@...atatu.com>
To: Jiri Pirko <jiri@...nulli.us>, Jamal Hadi Salim <jhs@...atatu.com>
Cc: netdev@...r.kernel.org, kuba@...nel.org, pabeni@...hat.com,
davem@...emloft.net, edumazet@...gle.com, xiyou.wangcong@...il.com,
victor@...atatu.com, idosch@...sch.org, mleitner@...hat.com,
vladbu@...dia.com, paulb@...dia.com
Subject: Re: [patch net-next] net: sched: move block device tracking into
tcf_block_get/put_ext()
On 05/01/2024 08:24, Jiri Pirko wrote:
> Thu, Jan 04, 2024 at 07:22:48PM CET, jhs@...atatu.com wrote:
>> On Thu, Jan 4, 2024 at 1:03 PM Jiri Pirko <jiri@...nulli.us> wrote:
>>>
>>> Thu, Jan 04, 2024 at 05:10:58PM CET, jhs@...atatu.com wrote:
>>>> On Thu, Jan 4, 2024 at 7:58 AM Jiri Pirko <jiri@...nulli.us> wrote:
>>>>>
>>>>> From: Jiri Pirko <jiri@...dia.com>
>>>>>
>>>>> Inserting the device to block xarray in qdisc_create() is not suitable
>>>>> place to do this. As it requires use of tcf_block() callback, it causes
>>>>> multiple issues. It is called for all qdisc types, which is incorrect.
>>>>>
>>>>> So, instead, move it to more suitable place, which is tcf_block_get_ext()
>>>>> and make sure it is only done for qdiscs that use block infrastructure
>>>>> and also only for blocks which are shared.
>>>>>
>>>>> Symmetrically, alter the cleanup path, move the xarray entry removal
>>>>> into tcf_block_put_ext().
>>>>>
>>>>> Fixes: 913b47d3424e ("net/sched: Introduce tc block netdev tracking infra")
>>>>> Reported-by: Ido Schimmel <idosch@...dia.com>
>>>>> Closes: https://lore.kernel.org/all/ZY1hBb8GFwycfgvd@shredder/
>>>>> Reported-by: Kui-Feng Lee <sinquersw@...il.com>
>>>>> Closes: https://lore.kernel.org/all/ce8d3e55-b8bc-409c-ace9-5cf1c4f7c88e@gmail.com/
>>>>> Reported-and-tested-by: syzbot+84339b9e7330daae4d66@...kaller.appspotmail.com
>>>>> Closes: https://lore.kernel.org/all/0000000000007c85f5060dcc3a28@google.com/
>>>>> Reported-and-tested-by: syzbot+806b0572c8d06b66b234@...kaller.appspotmail.com
>>>>> Closes: https://lore.kernel.org/all/00000000000082f2f2060dcc3a92@google.com/
>>>>> Reported-and-tested-by: syzbot+0039110f932d438130f9@...kaller.appspotmail.com
>>>>> Closes: https://lore.kernel.org/all/0000000000007fbc8c060dcc3a5c@google.com/
>>>>> Signed-off-by: Jiri Pirko <jiri@...dia.com>
>>>>
>>>> Did you get a chance to run the tdc tests?
>>>
>>> I ran the TC ones we have in the net/forwarding directory.
>>> I didn't manage to run the tdc. Readme didn't help me much.
>>> How do you run the suite?
>>
>> For next time:
>> make -C tools/testing/selftests TARGETS=tc-testing run_tests
>
> Unrelated to this patch.
>
> Running this, I'm getting lots of errors, some seem might be bugs in
> tests. Here's the output:
>
> make: Entering directory '/mnt/share156/jiri/net-next/tools/testing/selftests'
> make[1]: Entering directory '/mnt/share156/jiri/net-next/tools/testing/selftests/tc-testing'
> make[1]: Nothing to be done for 'all'.
> make[1]: Leaving directory '/mnt/share156/jiri/net-next/tools/testing/selftests/tc-testing'
> make[1]: Entering directory '/mnt/share156/jiri/net-next/tools/testing/selftests/tc-testing'
> TAP version 13
> 1..1
> # timeout set to 900
> # selftests: tc-testing: tdc.sh
> # netdevsim
> # act_bpf
> # act_connmark
> # act_csum
> # act_ct
> # act_ctinfo
> # act_gact
> # act_gate
> # act_mirred
> # act_mpls
> # act_nat
> # act_pedit
> # act_police
> # act_sample
> # act_simple
> # act_skbedit
> # act_skbmod
> # act_tunnel_key
> # act_vlan
> # cls_basic
> # cls_bpf
> # cls_cgroup
> # cls_flow
> # cls_flower
> # cls_fw
> # cls_matchall
> # cls_route
> # cls_u32
> # Module em_canid not found... skipping.
> # em_cmp
> # em_ipset
> # em_ipt
> # em_meta
> # em_nbyte
> # em_text
> # em_u32
> # sch_cake
> # sch_cbs
> # sch_choke
> # sch_codel
> # sch_drr
> # sch_etf
> # sch_ets
> # sch_fq
> # sch_fq_codel
> # sch_fq_pie
> # sch_gred
> # sch_hfsc
> # sch_hhf
> # sch_htb
> # sch_teql
> # considering category actions
> # !!! Consider installing pyroute2 !!!
> # -- ns/SubPlugin.__init__
> # -- scapy/SubPlugin.__init__
> # Executing 528 tests in parallel and 15 in serial
> # Using 18 batches and 4 workers
> #
Hi Jiri,
Can you also try running after installing pyroute2?
`pip3 install pyroute2` or via your distro's package manager.
Seems like in your kernel config + (virtual?) machine you are running
into issues like:
- create netns
- try to use netns (still "creating", not visible yet)
- fail badly
Since pyroute2 is netlink based, these issues are more likely to not happen.
Yes I agree the error messages are cryptic, we have been wondering how
to improve them.
If you are still running into issues after trying with pyroute2, can you
also try running tdc with the following diff? It disables the parallel
testing for kselftests.
Perhaps when pyroute2 is not detected on the system we should disable
the parallel testing completely. tc/ip commands are too susceptible to
these issues and parallel testing just amplifies them.
diff --git a/tools/testing/selftests/tc-testing/tdc.sh
b/tools/testing/selftests/tc-testing/tdc.sh
index c53ede8b730d..89bb123db86b 100755
--- a/tools/testing/selftests/tc-testing/tdc.sh
+++ b/tools/testing/selftests/tc-testing/tdc.sh
@@ -63,5 +63,5 @@ try_modprobe sch_hfsc
try_modprobe sch_hhf
try_modprobe sch_htb
try_modprobe sch_teql
-./tdc.py -J`nproc` -c actions
-./tdc.py -J`nproc` -c qdisc
+./tdc.py -J1 -c actions
+./tdc.py -J1 -c qdisc
Powered by blists - more mailing lists