[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180104130044.GC2213@nanopsycho>
Date: Thu, 4 Jan 2018 14:00:44 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Jakub Kicinski <kubakici@...pl>, David Ahern <dsahern@...il.com>,
netdev@...r.kernel.org, davem@...emloft.net,
xiyou.wangcong@...il.com, mlxsw@...lanox.com, andrew@...n.ch,
vivien.didelot@...oirfairelinux.com, f.fainelli@...il.com,
michael.chan@...adcom.com, ganeshgr@...lsio.com,
saeedm@...lanox.com, matanb@...lanox.com, leonro@...lanox.com,
idosch@...lanox.com, simon.horman@...ronome.com,
pieter.jansenvanvuuren@...ronome.com, john.hurley@...ronome.com,
alexander.h.duyck@...el.com, ogerlitz@...lanox.com,
john.fastabend@...il.com, daniel@...earbox.net
Subject: Re: [patch net-next v4 00/10] net: sched: allow qdiscs to share
filter block instances
Thu, Jan 04, 2018 at 01:41:54PM CET, jhs@...atatu.com wrote:
>
>
>My comments below - I do have a more _basic_ issue on the egress
>side that IMO noone has brought up that needs to be addressed.
>All examples so far have been focusing on ingress.
Which one? I guess you will send it as a separate email.
>
>Since I am late in this discussion, let me state my understanding
>of the thread and assumptions so there is no cross-talk.
>
>we are going to need to be able to say:
>tc qdisc add block 1
>tc filter add block 1 protocol ip priority 10 flower ...
>
>>From my scanning of the thread, I dont think I see disagreement.
>Jiri, did i misread what you are saying?
Agreed. The plan is to do this in the follow-up, but if you guys need it
now, I can do that now.
>
>New approach syntax should still be legit for backward
>compat, so this should work:
>
>"tc filter add dev ens7 parent ffff: protocol ip priority 10 flower .."
Agreed. Works like that with the current patchset.
>
>Blocks or not we still need to have old scripts work.
>Which also means one should also be able to say:
>"tc filter ls dev ens7 ..." with no suprises.
Agreed. Works like that with the current patchset.
>
>>From my reading of the thread - there is no disagreement that
>"tc qdisc add dev ens7 ingress block 1"
>should be able to "bind" to block1 if it exists
>or create it if it doesnt and then "bind" to it ("autobind").
Agreed. Works exactly as you described in the current patchset.
>
>And that a simple "tc qdisc add dev ens7 ingress" should
>either not be able to create a block (or we can say creates
>block id 0 - owned by the netdev - for consistency).
This allocates a new block - as it needs it internally anyway - and tc
core will assign unused block id. You can see this id in the list then.
Later you can use this block id for other created qdiscs.
Block ids are per-ns.
>
>To David A: If someone created the qdisc using new approach
>and then later tried to add/list rules using the old syntax, surely
>that should work, no? I have seen setups which separate
>creation of qdiscs using one script and addition of filters via
>another script (very common actually).
Agreed, that is exactly what I want from day 1.
>
>Having said that:
>I see the value of less is more and being able to use one(block)
>interface to add/del/list filters if we add the blocks feature;
>but backward compat(meaning crazy scripting) needs to be respected.
>I dont see how we get away from that.
>
>Maybe what we need is another knob to control this new functionality?
>Either a kernel config option or an extra parameter when creating
>the qdisc or a system wide boolean per netdev (which of course
>contradicts the "less is more" principle). If this knob was set
>then you reject addition of filters via a port/qdisc which is sharing.
I don't like this knob idea. It just adds confusion for no good reason
what so ever.
>
>I agree with Jiri - if you consciously choose to share there should
>be no suprises with what filters get applied.
>
>I think this email is too long and my egress concern will be lost
>if i typed it here; so i will send another email.
>
>cheers,
>jamal
>
>On 18-01-04 05:12 AM, Jiri Pirko wrote:
>> Thu, Jan 04, 2018 at 08:06:58AM CET, kubakici@...pl wrote:
>> > On Thu, 4 Jan 2018 07:57:02 +0100, Jiri Pirko wrote:
>> > > Thu, Jan 04, 2018 at 12:51:52AM CET, kubakici@...pl wrote:
>> > > > On Wed, 3 Jan 2018 18:22:09 +0100, Jiri Pirko wrote:
>> > > > > However I don't agree about breaking the existing filter add and show
>> > > > > and also imposibility to make not-shared block shared in the runtime
>> > > > > before defining it first.
>> > > >
>> > > > FWIW I would agree with David that allowing add on a shared block
>> > > > modify filters on another interface can break existing users. (No
>> > > > opinion on dump and lifetime).
>> > >
>> > > I don't think that David is saying that, but why do you think it would
>> > > break existing users?
>> >
>> > Perhaps I worded is too strongly as "breaking existing users", but it
>> > certainly introduces surprising side effects. David put it into words
>> > very well:
>> >
>> > On Tue, 2 Jan 2018 19:07:36 -0700, David Ahern wrote:
>> > > The disagreement is in how they should be managed. I think my last
>> > > response concisely captures my concerns -- the principle of least surprise.
>> > >
>> > > So with the initial commands above, all is fine. Then someone is
>> > > debugging a problem or wants to add another filter to ens8, so they run:
>> > >
>> > > $ tc filter add dev ens8 ingress protocol ip pref 25 flower dst_ip
>> > > 192.168.1.0/16 action drop
>> > >
>> > > Then traffic flows through ens7 break and some other user is struggling
>> > > to understand what just happened. That the new filter magically appears
>> > > on ens7 when the user operated on ens8 is a surprise. Nothing about that
>> > > last command acknowledges that it is changing a shared resource.
>>
>> It is not that the new filter magically appears on ens7. No magic. ens8
>> and ens7 share the same block. User set the sharing up originally,
>> he can see it in "tc qdisc show". So if he does this, he cannot be
>> surprised. Also, he can list the filters if he does "tc filter show"
>> to see what is there per-qdisc.
>>
>> If the user expects something else, he should go back and read the docs
>> and learn to understand how tc works.
>>
>
>
>
>
Powered by blists - more mailing lists