Date:	Fri, 10 Feb 2012 10:18:31 -0500
From:	jamal <hadi@...erus.ca>
To:	John Fastabend <john.r.fastabend@...el.com>
Cc:	Stephen Hemminger <shemminger@...tta.com>,
	bhutchings@...arflare.com, roprabhu@...co.com,
	netdev@...r.kernel.org, mst@...hat.com, chrisw@...hat.com,
	davem@...emloft.net, gregory.v.rose@...el.com, kvm@...r.kernel.org,
	sri@...ibm.com
Subject: Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into
 hardware

Hi John,

I went backwards to summarize at the top after going through your email.

TL;DR version 0.1: 
You provide a good use case where it makes sense to do things in the
kernel. IMO, you could make the same argument if your embedded switch
could do ACLs, IPv4 forwarding etc., and then the kernel bloats.
I am always biased toward moving all policy control to user space
instead of bloating the kernel.

 
On Thu, 2012-02-09 at 20:14 -0800, John Fastabend wrote:

> > 
> > Hi Jamal,
> > 
> > The user space app in this case would listen for FDB updates to the SW
> > bridge and then mirror them at the embedded NIC. In this case it seems
> > easier to just add a notifier chain and let the kernel keep these in
> > sync. Otherwise we need a daemon in user space to replicate these.
> > 

You need a user space daemon if you must ensure synchronization. That's
what I meant when I said there was a "disadvantage" compared with the
simple case where the goal is always to synchronize.
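
A rough sketch of what such a user space daemon could look like, assuming
the bridge reports FDB changes as RTM_NEWNEIGH/RTM_DELNEIGH messages with
family AF_BRIDGE on the rtnetlink neighbour multicast group. The numeric
constants are copied from the Linux uapi headers; push_to_hw is a
hypothetical callback standing in for whatever programs the embedded
switch, and is not an existing API.

```python
import socket
import struct

# Values from <linux/rtnetlink.h>, <linux/neighbour.h>, <linux/socket.h>.
RTM_NEWNEIGH = 28
RTM_DELNEIGH = 29
AF_BRIDGE = 7
NDA_LLADDR = 2
RTMGRP_NEIGH = 0x4

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
NDMSG = struct.Struct("=BBHiHBB")    # family, pad1, pad2, ifindex, state, flags, type
RTATTR = struct.Struct("=HH")        # rta_len, rta_type


def align4(n):
    """Round n up to the 4-byte alignment netlink uses."""
    return (n + 3) & ~3


def parse_fdb_events(buf):
    """Return a list of (action, ifindex, mac) for bridge FDB messages in buf."""
    events = []
    off = 0
    while off + NLMSGHDR.size <= len(buf):
        length, mtype, _flags, _seq, _pid = NLMSGHDR.unpack_from(buf, off)
        if length < NLMSGHDR.size or off + length > len(buf):
            break  # truncated or malformed message
        end = off + length
        body = off + NLMSGHDR.size
        if mtype in (RTM_NEWNEIGH, RTM_DELNEIGH) and body + NDMSG.size <= end:
            family, _p1, _p2, ifindex, _state, _fl, _ty = NDMSG.unpack_from(buf, body)
            if family == AF_BRIDGE:
                mac = None
                a = body + NDMSG.size
                while a + RTATTR.size <= end:
                    rlen, rtype = RTATTR.unpack_from(buf, a)
                    if rlen < RTATTR.size or a + rlen > end:
                        break
                    if rtype == NDA_LLADDR:
                        mac = buf[a + RTATTR.size:a + rlen]
                    a += align4(rlen)
                if mac is not None:
                    action = "add" if mtype == RTM_NEWNEIGH else "del"
                    events.append((action, ifindex, ":".join("%02x" % b for b in mac)))
        off += align4(length)
    return events


def listen(push_to_hw):
    """Mirror bridge FDB updates forever; push_to_hw is hypothetical."""
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    s.bind((0, RTMGRP_NEIGH))
    while True:
        for action, ifindex, mac in parse_fdb_events(s.recv(65535)):
            push_to_hw(action, ifindex, mac)
```

The parsing is kept separate from the socket loop so it can be exercised
without a live bridge. The delay between the kernel sending the
notification and push_to_hw completing is the synchronization gap this
thread is about.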

> > On the other hand if you could make the same RTM_NEWNEIGH, RTM_DELNEIGH,
> > and RTM_GETNEIGH work for the bridge, embedded bridge, and macvlan you
> > would have one common interface to drive these. But the bridge already
> > has this protocol/msgtype so that would require either some demux or
> > new protocol/msgtype pairs to be created. 
> > 

The bridge is very netlink friendly these days. Given the rest of the
network stack (*NEIGH* you mention above) talks netlink to user space
it should be workable. 

> > Let me think on it. I'm tempted by the simplicity of adding notifier
> > hooks though.

If something is missing bridge-side it may need to be added (as per
Stephen's comment) - I just took it one step further, indicating that
those notifiers also need to speak netlink.


> Actually because the bridge is adding/removing fdb entries dynamically
> maybe its best this gets done in kernel. Here's the example case,

[..]

> 
> With the flow by letters above hope this is not too difficult to follow.

> (A) veth0 a virtual device transmits packet destined for ethx.y
> (B) SW bridge receives frames and updates FDB flooding to C
> (C) eth0 the PF in this case sends the frame to the HW backed by the
>     embedded bridge

Following so far.
Can you have more than one PF per embedded switch? Or is the intent here
purely to do VMs/VF separation?

> (D) The HW embedded switch has a static entry for ethx.y and forwards
>     the frame to the VF or if its a broadcast frame also floods it to
>     the wire and ethx.y

nod.

> (E) ethx.y receives the frame and generates a response to the dest mac of
>     veth0

nod.
Since you said in #D the entries in the switch are static, I am assuming
at this point neither ethx.y nor veth0 exist in the embedded FDB.

> Now here is the potential issue,
> 
> (G) The frame transmitted from ethx.y with the destination address of
>     veth0 but the embedded switch is not a learning switch. If the FDB
>     update is done in user space its possible (likely?) that the FDB
>     entry for veth0 has not been added to the embedded switch yet. 

Ok, got it - so the catch here is that the switch is not capable of
learning. I think this depends on where learning is done. Your intent is
to use the S/W bridge as the thing that does the learning for you, i.e. in
the kernel. This makes the S/W bridge a MUST-have-for-this-to-run.
And that may be the case for your use case.

What if I don't want to run the S/W bridge at all?
I've been making the point that with a simple knob (Stephen doesn't like
adding such a knob), the S/W bridge could defer learning to user space.
[This way you can add a lot of richness, e.g. ACLs restricting which MAC
addresses are allowed to talk to which.]
But if I bypass the S/W bridge altogether and learn in user space,
or have a static config with which I populate the embedded switch, I
don't see the issue.
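
As a toy illustration of learning deferred to user space with an ACL on
top, here is a minimal sketch. The class and method names are
hypothetical and do not correspond to any existing bridge interface; it
only shows the decision a user space learner could make before
installing an entry.

```python
class UserSpaceLearner:
    """Learn source MACs only for (src, dst) pairs the policy allows."""

    def __init__(self, allowed_pairs):
        # allowed_pairs: iterable of (src_mac, dst_mac) tuples (assumed policy format)
        self.allowed = set(allowed_pairs)
        self.fdb = {}  # mac -> port, what would be pushed to the switch

    def on_frame(self, src_mac, dst_mac, in_port):
        """Return True and learn src_mac if the pair is permitted, else False."""
        if (src_mac, dst_mac) not in self.allowed:
            return False
        self.fdb[src_mac] = in_port
        return True
```

The same fdb dictionary could equally be filled from a static config, in
which case no learning step is needed at all.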

> Now
>     we either have to flood the frame which is not horrible but not
>     ideal or worse if the embedded switch does not support flooding send
>     it to the wire and veth0 never receives it. 

If it is a switch it has to flood, no? Otherwise it sounds broken.

> If the SW bridge pushes
>     the FDB update down into the embedded switch the address is for
>     sure in the embedded switches forwarding tables and the switching
>     works as expected.

Yes, there is a small gap between the S/W bridge learning an address and
that entry being synchronized to the embedded NIC switch. That gap gets
larger if you defer learning to user space. But like you said earlier,
during that gap packets are flooded - and do you care if the
synchronization doesn't happen immediately?
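
The gap can be modelled with a small sketch of a non-learning switch.
This is a toy model, not the hardware's actual behaviour: the port names
("pf", "vf", "wire"), the MAC strings, and the flood_on_miss switch are
all assumptions made for illustration.

```python
class EmbeddedSwitch:
    """Non-learning switch: forwards by static FDB, floods or uplinks on a miss."""

    def __init__(self, ports, uplink, flood_on_miss=True):
        self.ports = set(ports)
        self.uplink = uplink
        self.flood_on_miss = flood_on_miss
        self.fdb = {}  # mac -> port, only ever set explicitly

    def add_fdb(self, mac, port):
        """Install an entry, e.g. pushed down by the bridge or a daemon."""
        self.fdb[mac] = port

    def forward(self, in_port, dst_mac):
        """Return the set of ports the frame is sent to."""
        port = self.fdb.get(dst_mac)
        if port is not None:
            return {port}
        if self.flood_on_miss:
            return self.ports - {in_port}
        return {self.uplink} - {in_port}
```

With a static entry for ethx.y only, a reply from the VF to veth0's MAC
is flooded to both "pf" and "wire" until add_fdb installs the veth0
entry, after which it goes to "pf" alone. With flood_on_miss=False the
reply goes only to "wire" during the gap, which is the case where veth0
never receives it.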

> So to handle this case correctly its probably best IMHO to use a notifier
> hook. Having a RTM_GETNEIGH for the embedded switch implemented though
> would be nice for dumping the FDB of the embedded switch and SET/DEL
> could be used to configure the FDB when its not being driven by the SW
> switch. Of course we should try to be minimalists here.

Do you really need a different *NEIGH* from what we already have?

The problem with putting policies in the kernel is that you are going to
keep adding more. Bloat user space instead.

cheers,
jamal

