[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5475B952.2080500@mojatatu.com>
Date:	Wed, 26 Nov 2014 06:28:18 -0500
From:	Jamal Hadi Salim <jhs@...atatu.com>
To:	Scott Feldman <sfeldma@...il.com>
CC:	John Fastabend <john.r.fastabend@...el.com>,
	Jiri Pirko <jiri@...nulli.us>, Netdev <netdev@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	"nhorman@...driver.com" <nhorman@...driver.com>,
	Andy Gospodarek <andy@...yhouse.net>,
	Thomas Graf <tgraf@...g.ch>,
	"dborkman@...hat.com" <dborkman@...hat.com>,
	"ogerlitz@...lanox.com" <ogerlitz@...lanox.com>,
	"jesse@...ira.com" <jesse@...ira.com>,
	"pshelar@...ira.com" <pshelar@...ira.com>,
	"azhou@...ira.com" <azhou@...ira.com>,
	"ben@...adent.org.uk" <ben@...adent.org.uk>,
	"stephen@...workplumber.org" <stephen@...workplumber.org>,
	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
	"vyasevic@...hat.com" <vyasevic@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Florian Fainelli <f.fainelli@...il.com>,
	Roopa Prabhu <roopa@...ulusnetworks.com>,
	John Linville <linville@...driver.com>,
	"jasowang@...hat.com" <jasowang@...hat.com>,
	"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
	Nicolas Dichtel <nicolas.dichtel@...nd.com>,
	"ryazanov.s.a@...il.com" <ryazanov.s.a@...il.com>,
	"buytenh@...tstofly.org" <buytenh@...tstofly.org>,
	Aviad Raveh <aviadr@...lanox.com>,
	"nbd@...nwrt.org" <nbd@...nwrt.org>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Neil Jerram <Neil.Jerram@...aswitch.com>,
	"ronye@...lanox.com" <ronye@...lanox.com>,
	"simon.horman@...ronome.com" <simon.horman@...ronome.com>,
	"alexander.h.duyck@...hat.com" <alexander.h.duyck@...hat.com>,
	"Ronciak, John" <john.ronciak@...el.com>,
	"mleitner@...hat.com" <mleitner@...hat.com>,
	Shrijeet Mukherjee <shrijeet@...il.com>,
	Andy Gospodarek <gospo@...ulusnetworks.com>,
	Benjamin LaHaise <bcrl@...ck.org>
Subject: Re: [patch net-next v3 02/17] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
On 11/25/14 22:59, Scott Feldman wrote:
> On Tue, Nov 25, 2014 at 5:19 PM, Jamal Hadi Salim <jhs@...atatu.com> wrote:
>> On 11/25/14 21:36, Scott Feldman wrote:
>>
>>>> Ok, guess i am gonna have to go stare at the code some more.
>>>> I thought we returned one of the error codes?
>>>> A bitmask would work for a single entry - because you have two
>>>> options add to h/ware and/or s/ware. So response is easy to encode.
>>>> But if i have 1000 and they are sparsely populated (think an indexed
>>>> table and i have indices 1, 23, 45, etc), then a bitmask would be
>>>> hard to use.
>>>
>>>
>>> I'm confused by this discussion.
>>
>>
>> This is about the policy which states "install as many as you can, dont
>> worry about failures". In such a case, how do you tell user space back
>> "oh, btw you know your request #1, #23, and 45 went ok, but nothing else
>> worked". A simple return code wont work. You could return a code to
>> say "some worked". At which case user space could dump and find out only
>> #1, #23 and #45 worked.
>
> You request for what?  That's my confusion.
Scott, you are gonna make do this all over again?;->
The summary is there are three possible policies that could be
identified by the user asking for a kernel operation.
One use case example was to send a bunch of (for example)
create/updates and request that the kernel should not abort on a
failure of a single one but to keep going and create/update as many
as possible. Is that part clear? I know it is not what you do,
but there are use cases for that (Read John's response).
Now assuming someone wants this and some entries failed;
how do you tell user space back what was actually updated vs not?
You could return a code which says "partial success".
Forget whether the table is keyed or indexed but if you wanted
to return more detailed info you would return an array/vector of some
sort with status code per entry. Something netlink cant do.
Is that a better description?
> Are you trying to install
> FDB entry into both SW and HW at same time?
What is wrong with installing on both hardware and software? The
point was to identify what kind of policies could be requested by
the user; but even for the bridge why is it bad that i ask for
both master&self?
It is something I can do today with none of these patches.
> And then do a bunch in a
> batch?  I'm saying use MASTER for SW and SELF for HW in two steps,
But that would be enforcing your policy on me.
> if
> you want FDB entry installed in both Sw and HW.  Check your return
> code each step.  Batch all to HW first, then batch all that PASSED to
> SW.  I don't even know really why you're trying to install to both HW
> and SW.  Install it to HW and be done. fdb_dump will set HW entries
> via SELF.
>
First off: bad performance, but your call to do it that way
(just please please dont enforce it on me;->)
Lets take the hardware batching you mentioned above and see if
i can help to clarify in the third policy choice (continue-on-failure).
Lets say you have a keyed table such as the fdb table is.
You send 10 entries to be created/added in hardware. #3 and #5 failed
because you made a mistake and sent them with the same key. #9 and #10
failed because the hardware doesnt have any more space.
we didnt stop and go back for #3 and #5 because the user told
us to continue and do the rest when we fail. And s/he did that because
she wanted to put as many entries in hardware as possible without
necessarily needing to know how much space exists.
> Ah, Jamal, look again at patches 13-17/17 in last v3 set.  That was a
> big steaming snickerdoodle just for you!  Now you can push policy
> knobs down to port driver and or bridge to fine tune what ever you
> want.  You'll find knobs for learning, flooding, learning sync to hw,
> etc.  I thought you even ACKed some of these.
I think it almost there.
What you are missing is the policy decision to only sync when i
say so. Having an ndo_ops is a necessity but i dont want the driver
to decide for me just because it can ;->
Telling hardware to learn is instructing it to self update its entries
based on source lookup failure. That is distinctly different from
telling to sync to the kernel. So if you add that knob we are in good
shape.
cheers,
jamal
> a) above knob is 14/17
> patch, b) above is using existing learning knob on bridge, c) above I
> don't get...no point in syncing that direction.
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists
 
