Message-ID: <20111130230049.GE29071@x200.localdomain>
Date:	Wed, 30 Nov 2011 15:00:49 -0800
From:	Chris Wright <chrisw@...hat.com>
To:	Ben Hutchings <bhutchings@...arflare.com>
Cc:	Chris Wright <chrisw@...hat.com>,
	Greg Rose <gregory.v.rose@...el.com>,
	Roopa Prabhu <roprabhu@...co.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"sri@...ibm.com" <sri@...ibm.com>,
	"dragos.tatulea@...il.com" <dragos.tatulea@...il.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"arnd@...db.de" <arnd@...db.de>, "mst@...hat.com" <mst@...hat.com>,
	"mchan@...adcom.com" <mchan@...adcom.com>,
	"dwang2@...co.com" <dwang2@...co.com>,
	"shemminger@...tta.com" <shemminger@...tta.com>,
	"eric.dumazet@...il.com" <eric.dumazet@...il.com>,
	"kaber@...sh.net" <kaber@...sh.net>,
	"benve@...co.com" <benve@...co.com>
Subject: Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering
 support for passthru mode

* Ben Hutchings (bhutchings@...arflare.com) wrote:
> On Wed, 2011-11-30 at 13:04 -0800, Chris Wright wrote:
> > I agree that it's confusing.  Couldn't you simplify your ascii art
> > (hopefully removing hw assumptions about receive processing, and
> > completely ignoring vlans for the moment) to something like:
> >
> >              |RX
> >              v
> > +------------+-------------+
> > |     +------+--------+    |
> > |     | RX MAC filter |    |
> > |     |and port select|    |
> > |     +---------------+    |
> > |            /|\           |
> > |           / | \   match 2|
> > |          /  v  \         |
> > |         /match  \        |
> > |        /  1 |    \       |
> > |       /     |     \      |
> > |match /      |      \     |
> > |  0  /       |       \    |
> > |    v        |        v   |
> > |    |        |        |   |
> > +----+--------+--------+---+
> >      |        |        |
> >     PF       VF 1     VF 2
> > 
> > And there's an unclear number of ways to update "RX MAC filter and port
> > select" table.
> > 
> > 1) PF ndo_set_mac_addr
> > I expect that to be implicit to match 0.
> > 
> > 2) PF ndo_set_rx_mode
> > Less clear, but I'd still expect these to implicitly match 0
> > 
> > 3) PF ndo_set_vf_mac
> > I expect these to be an explicit match to VF N (given the interface
> > specifies which VF's MAC is being programmed).
> 
> I'm not sure whether this is supposed to implicitly add to the MAC
> filter or whether that has to be changed too.  That's the main
> difference between my models (a) and (b).

I see now.  I wasn't entirely clear on the difference before.  It's also
going to be hw specific.  I think (Intel folks can verify) that the
Intel SR-IOV devices have a single global unicast exact match table,
for example.
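
As a rough illustration of the single-table case, here is a toy Python
model (not driver code) of one global unicast exact-match table that maps
a destination MAC to a port.  The class and method names are hypothetical;
only the idea that a PF-side update implies match 0 while ndo_set_vf_mac
names the VF explicitly comes from the list above.  Treating a miss as a
drop is an assumption, and real behaviour is hw specific.

```python
PF = 0  # port 0 is the PF; VF N is port N (assumption for this sketch)


class RxMacFilter:
    """Toy model of one global unicast exact-match table: MAC -> port."""

    def __init__(self):
        self.table = {}

    def pf_set_mac(self, mac):
        # Models case 1): a PF-side update implicitly selects port 0.
        self.table[mac.lower()] = PF

    def pf_set_vf_mac(self, vf, mac):
        # Models case 3): the caller names the VF explicitly.
        self.table[mac.lower()] = vf

    def classify(self, dst_mac):
        # Exact match only; None means no entry (assumed here to be a drop).
        return self.table.get(dst_mac.lower())


flt = RxMacFilter()
flt.pf_set_mac("02:00:00:00:00:01")
flt.pf_set_vf_mac(1, "02:00:00:00:00:11")
flt.pf_set_vf_mac(2, "02:00:00:00:00:12")
```

With one shared table, every address has exactly one owning port, so a
second caller programming the same MAC silently replaces the first entry.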

> There's also PF ndo_set_vf_vlan.

Right, although as I mentioned, I was trying to limit this to just MAC
filtering to simplify.

> > 4) VF ndo_set_mac_addr
> > This one may or may not be allowed (setting MAC+port if the VF is owned
> > by a guest is likely not allowed), but would expect an implicit VF N.
> > 
> > 5) VF ndo_set_rx_mode
> > Same as 4) above.
> 
> So this is where we are today.

Cool, good that we agree there.

> > 6) PF or VF? ndo_set_rx_filter_addr
> > The new proposal, which has an explicit VF, although when it's VF_SELF
> > I'm not clear if this is just the same as 5) above?
> > 
> > Have I missed anything?
> 
> Any physical port can be bridged to a mixture of guests with and without
> their own VFs.  Packets sent from a guest with a VF to the address of a
> guest without a VF need to be forwarded to the PF rather than the
> physical port, but none of the drivers currently get to know about those
> addresses.

To clarify, do you mean something like this?

       physical port
             |
+------------+------------+
|         +-----+         |
|         | VEB |         |
|         +-----+         |
|        /   |   \        |
|       /    |    \       |
|      /     |     \      |
+-----+------+------+-----+
      |      |       |
     PF    VF 1    VF 2
     /       |       | 
 +---+---+  VM4  +---+---+
 |  sw   |       |macvtap|
 | switch|       +---+---+
 +-+-+-+-+           |
   / | \            VM5
  /  |  \
VM1 VM2 VM3

This has VMs 1-3 hanging off the PF via a linux bridge (traditional hv
switching), VM4 directly owning VF1 (pci device assignment), and VM5
indirectly owning VF2 (macvtap passthrough, that started this whole
thing).

So, as I understand it, you're saying that VM4 or VM5 sending a packet to
VM1 goes into the VEB, out the PF, and into the linux bridging code, right?
At which point the PF is in promiscuous mode (btw, the same does not work
if the bridge is attached to a VF, at least for some VFs, due to lack of
promiscuous mode).

> Packets sent from a guest with a VF to the address of another guest with
> a VF need to be forwarded similarly, but the driver should be able to
> infer that from (3).

Right, and that works currently for the case where both guests are like
VM4, they directly own the VF via PCI device assignment.  But for VM4
to talk to VM5, VF2 is not in promiscuous mode and has a different MAC
address than VM5's vNIC.  If the embedded bridge does not learn, and
nobody programmed it to fwd frames for VM5 via VF2...
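
The gap can be sketched with a toy model of a non-learning embedded
bridge, using the port names from the diagram above.  This is
illustrative Python, not driver code: the class, the MAC addresses, and
the choice to send unknown unicast out the physical port are all
assumptions made for the sketch.

```python
UPLINK = "physical port"


class Veb:
    """Toy non-learning embedded bridge with a static MAC -> port table."""

    def __init__(self):
        self.fdb = {}

    def program(self, mac, port):
        # Entries exist only when something programs them; nothing is learned.
        self.fdb[mac] = port

    def forward(self, dst_mac):
        # Unknown destinations leave via the uplink (assumption of this sketch).
        return self.fdb.get(dst_mac, UPLINK)


VM4_MAC = "02:00:00:00:00:04"  # VM4 owns VF1 directly
VF2_MAC = "02:00:00:00:00:12"  # VF2's own address
VM5_MAC = "02:00:00:00:00:05"  # VM5's vNIC behind macvtap on VF2

veb = Veb()
veb.program(VM4_MAC, "VF1")
veb.program(VF2_MAC, "VF2")

# VM4 -> VM5 before anyone tells the bridge about VM5's vNIC address.
before = veb.forward(VM5_MAC)

# An extra filter entry for VM5's address on VF2 closes the gap.
veb.program(VM5_MAC, "VF2")
after = veb.forward(VM5_MAC)
```

In this model the frame for VM5 misses the table and heads for the wire
until the extra entry is programmed, after which it is delivered to VF2.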

I believe this is what Roopa's patch will allow.  The question now is
whether there's a better way to handle this?

In my mind, we'd model the NIC's embedded bridge as, well, a bridge.
And set anti-spoofing, port mirroring, port mac/vlan filtering, etc via
that bridge.

thanks,
-chris
