[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141126114035.GM1875@nanopsycho.orion>
Date: Wed, 26 Nov 2014 12:40:35 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Scott Feldman <sfeldma@...il.com>,
John Fastabend <john.r.fastabend@...el.com>,
Netdev <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
"nhorman@...driver.com" <nhorman@...driver.com>,
Andy Gospodarek <andy@...yhouse.net>,
Thomas Graf <tgraf@...g.ch>,
"dborkman@...hat.com" <dborkman@...hat.com>,
"ogerlitz@...lanox.com" <ogerlitz@...lanox.com>,
"jesse@...ira.com" <jesse@...ira.com>,
"pshelar@...ira.com" <pshelar@...ira.com>,
"azhou@...ira.com" <azhou@...ira.com>,
"ben@...adent.org.uk" <ben@...adent.org.uk>,
"stephen@...workplumber.org" <stephen@...workplumber.org>,
"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
"vyasevic@...hat.com" <vyasevic@...hat.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
Florian Fainelli <f.fainelli@...il.com>,
Roopa Prabhu <roopa@...ulusnetworks.com>,
John Linville <linville@...driver.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
"ryazanov.s.a@...il.com" <ryazanov.s.a@...il.com>,
"buytenh@...tstofly.org" <buytenh@...tstofly.org>,
Aviad Raveh <aviadr@...lanox.com>,
"nbd@...nwrt.org" <nbd@...nwrt.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Neil Jerram <Neil.Jerram@...aswitch.com>,
"ronye@...lanox.com" <ronye@...lanox.com>,
"simon.horman@...ronome.com" <simon.horman@...ronome.com>,
"alexander.h.duyck@...hat.com" <alexander.h.duyck@...hat.com>,
"Ronciak, John" <john.ronciak@...el.com>,
"mleitner@...hat.com" <mleitner@...hat.com>,
Shrijeet Mukherjee <shrijeet@...il.com>,
Andy Gospodarek <gospo@...ulusnetworks.com>,
Benjamin LaHaise <bcrl@...ck.org>
Subject: Re: [patch net-next v3 02/17] net: make vid as a parameter for
ndo_fdb_add/ndo_fdb_del
Wed, Nov 26, 2014 at 12:28:18PM CET, jhs@...atatu.com wrote:
>On 11/25/14 22:59, Scott Feldman wrote:
>>On Tue, Nov 25, 2014 at 5:19 PM, Jamal Hadi Salim <jhs@...atatu.com> wrote:
>>>On 11/25/14 21:36, Scott Feldman wrote:
>
>>>
>>>>>Ok, guess i am gonna have to go stare at the code some more.
>>>>>I thought we returned one of the error codes?
>>>>>A bitmask would work for a single entry - because you have two
>>>>>options add to h/ware and/or s/ware. So response is easy to encode.
>>>>>But if i have 1000 and they are sparsely populated (think an indexed
>>>>>table and i have indices 1, 23, 45, etc), then a bitmask would be
>>>>>hard to use.
>>>>
>>>>
>>>>I'm confused by this discussion.
>>>
>>>
>>>This is about the policy which states "install as many as you can, dont
>>>worry about failures". In such a case, how do you tell user space back
>>>"oh, btw you know your request #1, #23, and 45 went ok, but nothing else
>>>worked". A simple return code wont work. You could return a code to
>>>say "some worked". At which case user space could dump and find out only
>>>#1, #23 and #45 worked.
>>
>>You request for what? That's my confusion.
>
>Scott, you are gonna make do this all over again?;->
>The summary is there are three possible policies that could be
>identified by the user asking for a kernel operation.
>One use case example was to send a bunch of (for example)
>create/updates and request that the kernel should not abort on a
>failure of a single one but to keep going and create/update as many
>as possible. Is that part clear? I know it is not what you do,
>but there are use cases for that (Read John's response).
>Now assuming someone wants this and some entries failed;
>how do you tell user space back what was actually updated vs not?
>You could return a code which says "partial success".
>Forget whether the table is keyed or indexed but if you wanted
>to return more detailed info you would return an array/vector of some
>sort with status code per entry. Something netlink cant do.
>Is that a better description?
Sure this is something that is reasonable to request. But that would
require a major changes to userspace api. At this moment, when we are
using the existing api, I would leave this out for phase 1. Let this be
resolved later as a separate work. Does that make sense?
>
>>Are you trying to install
>>FDB entry into both SW and HW at same time?
>
>
>What is wrong with installing on both hardware and software? The
>point was to identify what kind of policies could be requested by
>the user; but even for the bridge why is it bad that i ask for
>both master&self?
>It is something I can do today with none of these patches.
>
>>And then do a bunch in a
>>batch? I'm saying use MASTER for SW and SELF for HW in two steps,
>
>But that would be enforcing your policy on me.
>
>>if
>>you want FDB entry installed in both Sw and HW. Check your return
>>code each step. Batch all to HW first, then batch all that PASSED to
>>SW. I don't even know really why you're trying to install to both HW
>>and SW. Install it to HW and be done. fdb_dump will set HW entries
>>via SELF.
>>
>
>First off: bad performance, but your call to do it that way
>(just please please dont enforce it on me;->)
>
>Lets take the hardware batching you mentioned above and see if
>i can help to clarify in the third policy choice (continue-on-failure).
>Lets say you have a keyed table such as the fdb table is.
>You send 10 entries to be created/added in hardware. #3 and #5 failed
>because you made a mistake and sent them with the same key. #9 and #10
>failed because the hardware doesnt have any more space.
>we didnt stop and go back for #3 and #5 because the user told
>us to continue and do the rest when we fail. And s/he did that because
>she wanted to put as many entries in hardware as possible without
>necessarily needing to know how much space exists.
>
>
>>Ah, Jamal, look again at patches 13-17/17 in last v3 set. That was a
>>big steaming snickerdoodle just for you! Now you can push policy
>>knobs down to port driver and or bridge to fine tune what ever you
>>want. You'll find knobs for learning, flooding, learning sync to hw,
>>etc. I thought you even ACKed some of these.
>
>I think it almost there.
>What you are missing is the policy decision to only sync when i
>say so. Having an ndo_ops is a necessity but i dont want the driver
>to decide for me just because it can ;->
>Telling hardware to learn is instructing it to self update its entries
>based on source lookup failure. That is distinctly different from
>telling to sync to the kernel. So if you add that knob we are in good
>shape.
>
>cheers,
>jamal
>
>>a) above knob is 14/17
>>patch, b) above is using existing learning knob on bridge, c) above I
>>don't get...no point in syncing that direction.
>>
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists