Message-ID: <54200C8F.2040501@mojatatu.com>
Date:	Mon, 22 Sep 2014 07:48:31 -0400
From:	Jamal Hadi Salim <jhs@...atatu.com>
To:	Jiri Pirko <jiri@...nulli.us>
CC:	Thomas Graf <tgraf@...g.ch>,
	John Fastabend <john.r.fastabend@...el.com>,
	netdev@...r.kernel.org, davem@...emloft.net, nhorman@...driver.com,
	andy@...yhouse.net, dborkman@...hat.com, ogerlitz@...lanox.com,
	jesse@...ira.com, pshelar@...ira.com, azhou@...ira.com,
	ben@...adent.org.uk, stephen@...workplumber.org,
	jeffrey.t.kirsher@...el.com, vyasevic@...hat.com,
	xiyou.wangcong@...il.com, edumazet@...gle.com,
	sfeldma@...ulusnetworks.com, f.fainelli@...il.com,
	roopa@...ulusnetworks.com, linville@...driver.com,
	dev@...nvswitch.org, jasowang@...hat.com, ebiederm@...ssion.com,
	nicolas.dichtel@...nd.com, ryazanov.s.a@...il.com,
	buytenh@...tstofly.org, aviadr@...lanox.com, nbd@...nwrt.org,
	alexei.starovoitov@...il.com, Neil.Jerram@...aswitch.com,
	ronye@...lanox.com, simon.horman@...ronome.com,
	alexander.h.duyck@...el.com
Subject: Re: [patch net-next v2 8/9] switchdev: introduce Netlink API

On 09/22/14 03:53, Jiri Pirko wrote:

> Jamal, would you please give us some examples on how to use tc to work
> with flows? I have a feeling that you see something other people does not.

I will be a little verbose so as not to assume prior knowledge.

Let's talk about the tc classifier/action subsystem, because that is
what would take advantage of flows. We could also talk about qdiscs,
i.e. schedulers and queue objects, because the two are often related
(the default classification action is "classid", which typically
maps to a queue class).

The tc classifier/action subsystem allows you to specify arbitrary
classifiers and actions.
You can then specify (using a precise BNF grammar) how filters and
actions are to be related.
Look at tc/f_*.c in iproute2 to see the currently defined classifiers.

Each classifier has a name/id and attributes/options specific to
itself. Classifiers don't necessarily have to filter on packet
headers; they could filter on metadata, for example.
Each classifier running in software may be offloaded. I think that
simple model would allow usable tools.
The classifier you have currently defined in your patches could
be realized via the u32 classifier, but I think that would
require knowledge of u32. So for usability reasons I would
suggest writing a brand new classifier. For lack of a better
name, let's call it the "multi-tuple classifier".
I would expect this classifier to be usable in software tc as
well, without necessarily being offloaded.
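
To make the "multi-tuple classifier" idea concrete, here is a rough
userspace sketch in plain C of how such a classifier could match in
software. Every name in it (mt_key, mt_rule, mt_classify) and the
choice of fields are assumptions made up for illustration; nothing
here is taken from the patches, the kernel, or iproute2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: the tuple fields this sketch chooses to match on. */
struct mt_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  ip_proto;
};

/* A rule is a key plus a mask; an all-zero mask field is a wildcard. */
struct mt_rule {
    struct mt_key key;
    struct mt_key mask;
    uint32_t classid;   /* major:minor packed in 32 bits, 1:1 == 0x00010001 */
};

/* A packet matches when every masked field equals the rule's key. */
static int mt_match(const struct mt_rule *r, const struct mt_key *pkt)
{
    const struct mt_key *k = &r->key, *m = &r->mask;

    return ((pkt->src_ip   ^ k->src_ip)   & m->src_ip)   == 0 &&
           ((pkt->dst_ip   ^ k->dst_ip)   & m->dst_ip)   == 0 &&
           ((pkt->src_port ^ k->src_port) & m->src_port) == 0 &&
           ((pkt->dst_port ^ k->dst_port) & m->dst_port) == 0 &&
           ((pkt->ip_proto ^ k->ip_proto) & m->ip_proto) == 0;
}

/* Return the classid of the first matching rule, or 0 if none match. */
uint32_t mt_classify(const struct mt_rule *rules, size_t n,
                     const struct mt_key *pkt)
{
    assert(pkt != NULL && (rules != NULL || n == 0));

    for (size_t i = 0; i < n; i++)
        if (mt_match(&rules[i], pkt))
            return rules[i].classid;
    return 0;
}
```

The same key/mask rule is what a driver would presumably translate
into its own table format if the rule were offloaded; the linear scan
is only the simplest possible software fallback.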

There are two important details to note:
1) Many different types of classifiers exist. Which of them can be
offloaded will very likely depend on the hardware implementation.
It is academic bullshit (i.e. not pragmatic) to claim that all
hardware offload can use the same classification language. As I was
telling Thomas, I don't see why one wouldn't offload the already
defined bpf classifier.
From an API level, this means your ->flow_add/del/get would have
to support defining different classifiers.

2) Each classifier will have different semantics.
From a device API level, this means you have to allow the different
classifiers to pass attributes specific to them. This means
each classifier may override the ops(). I am indifferent as to how
that is achieved. So while you could pass one big structure
such as your flow struct, one should also be able to express
u32-style semantics.

We also need to discover which device supports which classifiers
and what constraints exist in the hardware implementation
(we can talk about that because it is important). For example,
if a device supports u32, how many u32 rules can be offloaded, etc.
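
A minimal sketch of what such discovery could look like, assuming a
per-device table indexed by classifier type. The names (cls_type,
cls_caps, offload_dev, dev_cls_caps, dev_cls_reserve) are invented
for this example and do not correspond to any existing kernel
interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical classifier types a device might be asked about. */
enum cls_type { CLS_U32, CLS_BPF, CLS_MULTI_TUPLE, CLS_TYPE_MAX };

/* Per-classifier constraints advertised by the hardware. */
struct cls_caps {
    int      supported;    /* non-zero if this classifier can be offloaded */
    unsigned max_rules;    /* how many rules of this type fit in hardware */
    unsigned used_rules;   /* how many are currently installed */
};

struct offload_dev {
    const char     *name;
    struct cls_caps caps[CLS_TYPE_MAX];
};

/* Discovery: return the constraints for one classifier, or NULL if the
 * device cannot offload that classifier at all. */
const struct cls_caps *dev_cls_caps(const struct offload_dev *dev,
                                    enum cls_type t)
{
    assert(dev != NULL);

    if ((unsigned)t >= CLS_TYPE_MAX || !dev->caps[t].supported)
        return NULL;
    return &dev->caps[t];
}

/* Account for one more offloaded rule, honouring the advertised limit. */
int dev_cls_reserve(struct offload_dev *dev, enum cls_type t)
{
    assert(dev != NULL);

    if ((unsigned)t >= CLS_TYPE_MAX || !dev->caps[t].supported)
        return -EOPNOTSUPP;
    if (dev->caps[t].used_rules >= dev->caps[t].max_rules)
        return -ENOSPC;
    dev->caps[t].used_rules++;
    return 0;
}
```

User space could read the same table (over netlink, say) to decide up
front whether asking for offload makes sense, instead of finding out
from a failed install.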

As to how it is to be implemented:
I like the semantics of the current bridge code. I have always
wondered why we didn't use that scheme for offloading qdiscs.
Each device supporting FDB offload has an ->fdb_add/del/get
(don't quote me on the naming). User space describes what
it wants. If something is to be offloaded, we already know the
netdev the user is pointing to. We invoke the appropriate
->flow_add/del/get calls with appropriately cooked structures.
I am not sure I like passing the netlink structure, as Scott
often seems to suggest; I think that passing the internal
structure we would install in software may be the better approach,
since:
a) we would need to parse the data anyway for validation etc.
b) each hardware offload will likely need to translate further
into its own internal format
c) we have a well-defined mapping between user and offload;
the generic structure will be very close to hardware.
Note: that is what the fdb offload does.
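
The scheme described above could be sketched roughly as follows in
plain C. This is an illustration under assumptions, not the proposed
kernel API: the struct and function names (flow_rule, flow_ops,
net_dev, flow_install, toy_flow_add) are hypothetical, and the
"driver" is a toy that only understands one classifier type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum flow_cls_type { FLOW_CLS_U32, FLOW_CLS_MULTI_TUPLE };

/* Classifier-specific attributes (hypothetical, heavily simplified). */
struct u32_sel { uint32_t off, val, mask; };
struct mt_sel  { uint32_t dst_ip; uint16_t dst_port; };

/* Internal, already-parsed rule: the structure software would install. */
struct flow_rule {
    enum flow_cls_type type;
    union {
        struct u32_sel u32;
        struct mt_sel  mt;
    } sel;
    uint32_t classid;
};

struct net_dev;

/* Per-device ops, in the spirit of ->fdb_add/del/get. */
struct flow_ops {
    int (*flow_add)(struct net_dev *dev, const struct flow_rule *rule);
};

struct net_dev {
    const struct flow_ops *ops;   /* NULL when the device has no offload */
    unsigned hw_rules;            /* toy counter standing in for hardware */
};

/* Toy driver: accepts multi-tuple rules only. A real driver would
 * translate the rule into its own hardware format at this point. */
static int toy_flow_add(struct net_dev *dev, const struct flow_rule *rule)
{
    if (rule->type != FLOW_CLS_MULTI_TUPLE)
        return -EOPNOTSUPP;
    dev->hw_rules++;
    return 0;
}

const struct flow_ops toy_ops = { .flow_add = toy_flow_add };

/* Core path: validate once, install in software, then try the device.
 * A failed or missing offload is not an error; the rule stays in software. */
int flow_install(struct net_dev *dev, const struct flow_rule *rule,
                 int *offloaded)
{
    assert(dev != NULL && rule != NULL && offloaded != NULL);

    *offloaded = 0;
    if (rule->classid == 0)       /* stand-in for real validation */
        return -EINVAL;

    /* ... software install of *rule would happen here ... */

    if (dev->ops && dev->ops->flow_add &&
        dev->ops->flow_add(dev, rule) == 0)
        *offloaded = 1;
    return 0;
}
```

Because the driver receives the parsed rule rather than the raw
netlink message, validation happens exactly once in the core, which
is point a) above.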

Note: I described this using tc, but I don't see why nftables
couldn't follow the same approach. My angle is that we don't
impede other users by over-focussing on ovs and whatever
other things surround it.
cheers,
jamal