Message-ID: <56B9CC6B.6050509@gmail.com>
Date:	Tue, 9 Feb 2016 03:24:27 -0800
From:	"Fastabend, John R" <john.fastabend@...il.com>
To:	Jamal Hadi Salim <jhs@...atatu.com>,
	Or Gerlitz <ogerlitz@...lanox.com>
Cc:	amir@...ai.me, jiri@...nulli.us, jeffrey.t.kirsher@...el.com,
	netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

On 2/4/2016 5:12 AM, Jamal Hadi Salim wrote:
> 
> On 16-02-03 01:48 PM, Fastabend, John R wrote:
> 
> BTW: For the record John, I empathize with you that we need to
> move. Please have patience - we are close; lets just get this resolved
> in Seville. I like your patches a lot and would love to just have
> your patches pushed in, but the challenges with community is being able
> to reach some middle ground. We are not as bad as some of the standards
> organizations. I am sure we'll get this resolved by end of next week;
> if not, I am 100% in agreement that some form of your patches (and Amir's)
> needs to go in and then we can refactor as needed.

Agreed, although I'm a bit worried we are starting to talk about a
single hardware IR. In my experience that discussion has always failed.

> 
>>> 1) "priorities" for filters and some form of "index" for actions
>>> is needed. I think index (which tends to be a 32 bit value is what
>>> Amir's patches referred to as "cookie" - or at least some hardware
>>> can be used to query the action with). Priorities may be implicit in
>>> the order in which they are added. And the idea of appending vs
>>> exclusivity vs replace (which netlink already supports)
>>> is important to worry about (TCAMs tend to assume an append mode
>>> for example).
>>
>> The code denotes add/del/replace already. I'm not sure why a TCAM
>> would assume an append mode but OK maybe that is some API you have
>> the APIs I use don't have these semantics.
>>
> 
> Basically most hardware (or i should say driver implementations of
> mostly TCAMS) allow you to add exactly the same filter as many times
> as you want. They dont really look at what you want to filter on
> and then scream "conflict". IOW, you (user) are responsible for
> conflict resolution at the filter level. The driver sees this blob
> and requests for some index/key from the hardware then just adds it.
> You can then use this key/index to delete/replace etc.
> This is what i meant by "append" mode.
> However if a classifier implementation cares about filter ambiguity
> resolution, then priorities are used. We need to worry about the
> bigger picture.
> 

Sure, in other classifiers it's used, but it's not needed in this set.
I planned to add it later.

> 
>> For this series using cls_u32 the handle gives you everything you need
>> to put entries in the right table and row. Namely the ht # and order #
>> from 'tc'.
> 
> True - but with a caveat. There are only 2^12 max tables you can
> have for example and up to 2^12 filters per bucket etc.
> 

This is a software limitation as well, right? If it hasn't shown up
as a limitation on the software side, why would it be an issue here?
Do you have more than 2^12 tables on your devices? If so, I guess we
can tack on another 32 bits somewhere.
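
For anyone following along, here is a rough sketch of where the 2^12
limits come from. It only mirrors the bit layout of the TC_U32_* macros
in include/uapi/linux/pkt_cls.h (12 bits of table id, 8 bits of bucket,
12 bits of node id); the struct and function names are made up for
illustration and are not part of the series.

```c
#include <stdint.h>

/* Illustrative only: names are hypothetical, the bit layout follows the
 * TC_U32_* macros (ht id in bits 31..20, bucket in 19..12, node in 11..0). */
struct u32_handle_fields {
	uint32_t htid;   /* hash table id, 0..0xFFF */
	uint32_t bucket; /* bucket within the table, 0..0xFF */
	uint32_t node;   /* filter (node) id within the bucket, 0..0xFFF */
};

/* Split a 32-bit cls_u32 handle into its three fields. */
static struct u32_handle_fields u32_handle_split(uint32_t handle)
{
	struct u32_handle_fields f;

	f.htid = (handle >> 20) & 0xFFF;
	f.bucket = (handle >> 12) & 0xFF;
	f.node = handle & 0xFFF;
	return f;
}
```

So a handle that 'tc' prints as 800:12:345 is the value 0x80012345, and
anything beyond 12 bits of table id or node id simply has nowhere to go
in the current handle.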

>> Take a look at u32_change and u32_classify its the handle
>> that places the filter into the list and the handle that is matched in
>> classify. We should place the filters in the hardware in the same order
>> that is used by u32_change.
>>
> 
> I can see some parallels, but:
> The nodeid in itself is insufficient for two reasons:
> You cant have more than 2^12 filters per bucket;
> and the nodeid then takes two meanings: a) it is an id
> b) it specifies the order in which things are looked up.
> 
> I think you need to take the u32 address and map it to something in your
> hardware. But at the same time it is important to have the abstraction
> closely emulate your hardware.
> 

IMO the hardware/interface must preserve the same ordering of
filters, hash tables, etc. that software uses. How it does that mapping
should be a driver concern, and it can always abort if it fails.
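
To make the "driver concern" part concrete, here is a minimal sketch of
what I mean, not the ixgbe code from the series. The device limits
(HYP_HW_TABLES, HYP_HW_ROWS), the slot struct and the function name are
all hypothetical. The assumption is a simple device that can only model
divisor-1 tables, so it places each filter at the row given by its node
id (keeping the software order) and aborts on anything it cannot
represent.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device limits, chosen only for illustration. */
#define HYP_HW_TABLES 16u
#define HYP_HW_ROWS   128u

/* Hypothetical hardware location for one filter. */
struct hyp_hw_slot {
	uint32_t table;
	uint32_t row;
};

/* Map a cls_u32 handle to a hardware slot, or fail with -EINVAL so the
 * caller can abort the offload instead of reordering anything. */
static int hyp_map_handle(uint32_t handle, struct hyp_hw_slot *slot)
{
	uint32_t htid = (handle >> 20) & 0xFFF;
	uint32_t bucket = (handle >> 12) & 0xFF;
	uint32_t node = handle & 0xFFF;

	/* Assumed device: no hashed tables, so only bucket 0 is valid. */
	if (bucket != 0)
		return -EINVAL;
	if (htid >= HYP_HW_TABLES || node >= HYP_HW_ROWS)
		return -EINVAL;

	slot->table = htid;
	slot->row = node; /* row order follows node id order */
	return 0;
}
```

A more capable device would accept more of the handle space; the point
is only that the mapping either preserves the order or refuses.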

>> Also ran a few tests and can't see how priority works in u32 maybe you
>> can shed some light but as best I can tell it doesn't have any effect
>> on rule execution.
>>
> 
> True.
> u32 doesnt care because it will give you a nodeid if you dont specify
> one. i.e conflict resolution is mapped to you not specifying exactly
> the same ht:bkt:nodeid more than once. And if you will let the
> kernel do it for you (as i am assuming you are saying your hardware
> will) then no need.

Yep. I'm faithfully offloading u32 here and not changing anything, except
that I do have to abort in some cases with the simpler devices. fm10k, for
example, can model hash nodes with divisors > 1.

> 
>>>
>>> 2) I like the u32 approach where it makes sense; but sometimes it
>>> doesnt make sense from a usability pov. I work with some ASICs
>>> that have 10 tuples that are fixed. Yes, a user can describe a policy
>>> with u32 but it would be more usable, say, with flower (both
>>> programmatic and cli)
>>
>> Sure so create a set of offload hooks for flower we don't need only
>> one hardware classifier any more than we would like a single software
>> classifiers.
> 
> 
> Glad to hear that.
> I was a little concerned that despite my love for u32 it was
> going to be _the_ classifier. It doesnt fit for all offload cases
> and sometimes it is because of human operators (the 10 tuple
> hardware classifier i mentioned earlier).
> BTW: Classifier in this case is very wide ranging (a regex hardware
> offload for example qualifies).

My issue is this: we can map flower onto u32, and u32 onto bpf, and
that is fine. But we lose a lot of the power of each classifier when we
do this. flower, for example, is nice because of its simplicity, which
presumably translates into faster updates. u32 is great because
we get full parse graph support and hash tables. ebpf is the biggest
beast of all and lets us load arbitrary functions into the device.
All are nice in their own right.

> 
>>
>> Again I'm trying to faithfully implement what we have in software
>> and load that into the hardware. The handle today gives ingress/egress
>> hook. If you want an all ports hook we should add it to 'tc' software
>> first and then push that to the hardware not create magic hardware
>> bits. See, I've drunk the Kool-Aid: software first, then hardware.
>>
> 
> ;-> No disagreement. It felt like a small sensible
> change - thats why i suggested it.
>

Yep, it's in the git log if we can get past this initial series.


>>> 4) Why are we forsaking switchdev John?
>>> This is certainly re-usable beyond NICs and SRIOV.
>>>
>>
>> Sure and switchdev can use it just like they use fdb_add and friends.
>> I just don't want to require switchdev infrastructure on things that
>> really are not switches. I think Amir indicated he would take a try
>> at the switchdev integration. If not I'm willing to do it but it
>> doesn't block this series in any way imo.
>>
> 
> Ok. Makes sense.
> 

Great!

>>> 5)What happened to being both able to hardware and/or software?
>>
>> Follow up patch once we get the basic infrastructure in place with
>> the big feature flag bit. I have a patch I'm testing for this now
>> but again I want to move in logical and somewhat minimal sets.
>>
> 
> Sounds sensible.
> 
>>>
>>> Anyways, I think Seville would be a blast! Come one, come all.
>>>
>>
>> I'll be there but lets be sure to follow up with this online I
>> know folks are following this who wont be at Seville and I don't
>> see any reason to block these patches and stop the thread for a
>> week or more.
>>
> 
> I really dont see much of a blocker.

Perfect, hopefully it didn't get thrashed on too much over the last
couple of days. I'll be in Seville in a couple of hours!

> 
> cheers,
> jamal
