[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170711123412.GB1874@nanopsycho>
Date: Tue, 11 Jul 2017 14:34:12 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net,
xiyou.wangcong@...il.com, edumazet@...gle.com,
stephen@...workplumber.org, jbenc@...hat.com, mlxsw@...lanox.com,
andrew@...n.ch, vivien.didelot@...oirfairelinux.com,
f.fainelli@...il.com, john.fastabend@...il.com,
alexander.h.duyck@...el.com, daniel@...earbox.net,
ogerlitz@...lanox.com, mrv@...atatu.com
Subject: Re: [patch net-next RFC 0/4] net: sched: allow qdiscs to share
filter block instances
Tue, Jul 11, 2017 at 02:15:27PM CEST, jhs@...atatu.com wrote:
>Hi Jiri,
>
>Commenting on generalities - will comment on code later:
>
>On 17-07-10 02:51 PM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@...lanox.com>
>>
>> Currently the filters added to qdiscs are independent. So for example if you
>> have 2 netdevices and you create ingress qdisc on both and you want to add
>> identical filter rules both, you need to add them twice. This patchset
>> makes this easier and mainly saves resources allowing to share all filters
>> within a qdisc - I call it a "filter block". Also this helps to save
>> resources when we do offload to hw for example to expensive TCAM.
>>
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification. If we don't pass any
>> block number, a new one will be generated by kernel:
>>
>> $ tc qdisc add dev ens7 ingress block 22
>> ^^^^^^^^
>> $ tc qdisc add dev ens8 ingress block 22
>>
>
>Above makes intuitive sense.
>
>
> ^^^^^^^^
>>
>> Now if we list the qdiscs, we will see the block index in the output:
>> qdisc fq_codel 0: dev ens7 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>> Sent 9014 bytes 99 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>> new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>> ^^^^^^^^
>> Sent 4592 bytes 58 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> qdisc fq_codel 0: dev ens8 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>> Sent 17022 bytes 307 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>> new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>> ^^^^^^^^
>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>
>So does this.
>
>>
>> Now we can add filter to any of qdiscs sharing the same block:
>>
>> $ tc filter add dev ens7 parent ffff: protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>>
>
>So for backward compat - this also makes sense. But:
>it does make sense to create new syntax for adding
>filters and actions:
>
>tc filter add block 22 protocol ip pref 25 flower \
> dst_ip 192.168.0.0/16 action drop
Was thinking about that. Decided to pass on this now. This should be
addressed by follow-up anyway.
>
>Coordinates of the filter block before were:
>
><ifindex>, <parent>, [handle]
>
>You should be able to abuse struct tcmsg ifindex to represent block #
>as long as you set parent to be something meaningful that is
>identified "block coordinate" via TC_H_XXX (pick something safe not
>in use by ingress or egress; look at: uapi/linux/pkt_sched.h)
Not sure about this. I have take closer look. In general, I don't like
to abuse anything :)
>
>>
>> We will see the same output if we list filters for ens7 and ens8, including stats:
>>
>> $ tc -s filter show dev ens7 root
>> filter parent ffff: protocol ip pref 25 flower
>> filter parent ffff: protocol ip pref 25 flower handle 0x1
>> eth_type ipv4
>> dst_ip 192.168.1.0/24
>> action order 1: gact action drop
>> random type none pass val 0
>> index 3 ref 1 bind 1 installed 10201 sec used 10150 sec
>> Action statistics:
>> Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>> $ tc -s filter show dev ens8 root
>> filter dev ens7 parent ffff: protocol ip pref 25 flower
>> filter dev ens7 parent ffff: protocol ip pref 25 flower handle 0x1
>> eth_type ipv4
>> dst_ip 192.168.1.0/24
>> action order 1: gact action drop
>> random type none pass val 0
>> index 3 ref 1 bind 1 installed 10202 sec used 10152 sec
>> Action statistics:
>> Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>>
>> Issues:
>> - tp->q is set by the device used to add the filter. That has to be resolved.
>> Impacts the dump (as you can see above)
>>
>
>I think you have more problems if the dump above is reality;->
>You added to ingress and this is showing egress.
Howcome? I only don't see "dev x" on ens7. That is the only difference,
>
>To complete the thought, dump is:
>
> tc -s filter show block 22
Understood. Again, this should be addressed in follow-up.
>
>cheers,
>jamal
>
Powered by blists - more mailing lists