linux-kernel - Re: [E1000-devel] [PATCH 1/2] if_link: Add VF multicast promiscuous mode control

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54C1F5BF.30105@gmail.com>
Date:	Thu, 22 Jan 2015 23:18:23 -0800
From:	Alexander Duyck <alexander.duyck@...il.com>
To:	Hiroshi Shimamoto <h-shimamoto@...jp.nec.com>,
	"Skidmore, Donald C" <donald.c.skidmore@...el.com>,
	Bjørn Mork <bjorn@...k.no>
CC:	"e1000-devel@...ts.sourceforge.net" 
	<e1000-devel@...ts.sourceforge.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Choi, Sy Jong" <sy.jong.choi@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	David Laight <David.Laight@...LAB.COM>,
	Hayato Momma <h-momma@...jp.nec.com>
Subject: Re: [E1000-devel] [PATCH 1/2] if_link: Add VF multicast promiscuous
 mode control

On 01/22/2015 04:32 PM, Hiroshi Shimamoto wrote:
>> Subject: RE: [E1000-devel] [PATCH 1/2] if_link: Add VF multicast promiscuous mode control
>>> "Skidmore, Donald C" <donald.c.skidmore@...el.com> writes:
>>>
>>>> My hang up is more related to: without the nob to enable it (off by
>>>> default) we are letting one VF dictate policy for all the other VFs
>>>> and the PF.  If one VF needs to be in promiscuous multicast so is
>>>> everyone else.  Their stacks now needs to deal with all the extra
>>>> multicast packets.  As you point out this might not be a direct
>>>> concern for isolation in that the VM could have 'chosen' to join any
>>>> Multicast group and seen this traffic.  My concern over isolation is
>>>> one VF has chosen that all the other VM now have to see this multicast
>>>> traffic.
>>> Apologies if this question is stupid, but I just have to ask about stuff I don't
>>> understand...
>>>
>>> Looking at the proposed implementation, the promiscous multicast flag
>>> seems to be a per-VF flag:
>>>
>>> +int ixgbe_ndo_set_vf_mc_promisc(struct net_device *netdev, int vf, bool
>>> +setting) {
>>> +	struct ixgbe_adapter *adapter = netdev_priv(netdev);
>>> +	struct ixgbe_hw *hw = &adapter->hw;
>>> +	u32 vmolr;
>>> +
>>> +	if (vf >= adapter->num_vfs)
>>> +		return -EINVAL;
>>> +
>>> +	adapter->vfinfo[vf].mc_promisc_enabled = setting;
>>> +
>>> +	vmolr = IXGBE_READ_REG(hw, IXGBE_VMOLR(vf));
>>> +	if (setting) {
>>> +		e_info(drv, "VF %u: enabling multicast promiscuous\n", vf);
>>> +		vmolr |= IXGBE_VMOLR_MPE;
>>> +	} else {
>>> +		e_info(drv, "VF %u: disabling multicast promiscuous\n", vf);
>>> +		vmolr &= ~IXGBE_VMOLR_MPE;
>>> +	}
>>> +
>>> +	IXGBE_WRITE_REG(hw, IXGBE_VMOLR(vf), vmolr);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>
>>> I haven't read the data sheet, but I took a quick look at the excellent high
>>> level driver docs:
>>> http://www.intel.com/content/dam/doc/design-guide/82599-sr-iov-driver-
>>> companion-guide.pdf
>>>
>>> It mentions "Multicast Promiscuous Enable" in its "Thoughts for
>>> Customization" section:
>>>
>>>  7.1 Multicast Promiscuous Enable
>>>
>>>  The controller has provisions to allow each VF to be put into Multicast
>>> Promiscuous mode.  The Intel reference driver does not configure this
>>> option .
>>>
>>>  The capability can be enabled/disabled by manipulating the MPE field  (bit
>>> 28) of the PF VF L2 Control Register (PFVML2FLT – 0x0F000)
>>>
>>> and showing a section from the data sheet describing the "PF VM L2 Control
>>> Register - PFVML2FLT[n]  (0x0F000 + 4 * n, n=0...63; RW)"
>>>
>>> To me it looks like enabling Promiscuos Multicast for a VF won't affect any
>>> other VF at all.  Is this really not the case?
>>>
>>>
>>>
>>> Bjørn
>> Clearly not a dumb question at all and I'm glad you mentioned that. :)  I was going off the assumption, been awhile since
>> I read the patch, that the patch was using FCTRL.MPE or MANC.MCST_PASS_L2 which would turn multicast promiscuous on for
>> everyone.  Since the patch is using PFVML2FLT.MPE this lessens my concern over effect on the entire system.
> I believe the patches for this VF multicast promiscuous mode is per VF.
>
>> That said I still would prefer having a way to override this behavior on the PF, although I admit my argument is weaker.
>> I'm still concerned about a VF changing the behavior of the PF without any way to prevent it.  This might be one part
>> philosophical (PF sets policy not the VF) but this still could have a noticeable effect on the overall system.  If any
>> other VFs (or the PF) are receiving MC packets these will have to be replicated which will be a performance hit.  When
>> we use the MC hash this is limited vs. when anyone is in MC promiscuous every MC packet used by another pool would be
>> replicated.  I could imagine in some environments (i.e. public clouds) where you don't trust what is running in your VM
>> you might what to block this from happening.
> I understand your request and I'm thinking to submit the patches
>   1) Add new mbox API between ixgbe/ixgbevf to turn MC promiscuous on,
>      and enables it when ixgbevf needs over 30 MC addresses.
>   2) Add a policy knob to prevent enabling it from the PF.
>
> Does it seem okay?

I would advise that if such a knob is added it should be set to disabled
by default.  The main problem with supporting that many multicast
addresses per VF is that you run the risk for flooding the PCIe
interface on the network adapter if too many adapters were to go into
such a mode.

> BTW, I'm bit worried about to use ndo interface for 2) because adding a
> new hook makes core code complicated.
> Is it really reasonable to do it with ndo?
> I haven't find any other suitable method to do it, right now. And using
> ndo VF hook looks standard way to control VF functionality.
> Then, I think it's the best way to implement this policy in ndo hook.

The ndo is probably the only way to go on this.  It is how past controls
for the VF network functionality have been implemented.

>> In some ways it is almost the mirror image of the issue you brought up:
>>
>> Adding a new hook for this seems over-complicated to me.  And it still
>> doesn't solve the real problems that
>>  a) the user has to know about this limit, and
>>  b) manually configure the feature
>>
>> My reverse argument might be that if this happens automatically.  It might take the VM provider a long time to realize
>> performance has taken a hit because some VM asked to join 31 multicast groups and entered MC promiscuous.  Then only to
>> find that they have no way to block such behavior.
>>
>> Maybe I wouldn't as concerned if the patch author could provide some performance results to show this won't have as a
>> negative effect as I'm afraid it might?
> Hm, what kind of test case would you like to have?
> The user A who has 30 MC addresses vs the user B who has 31 MC addresses, which
> means that MC promiscuous mode, and coming MC packets the user doesn't expect?
>
> thanks,
> Hiroshi

The question I would have is how many of these interfaces do you expect
to have supporting the expanded multicast mode?  As it currently stands
multicast traffic has the potential to flood the adapter, greatly reduce
the overall throughput, and add extra workload to the PF and all VFs. 
For example if several VFs enable this feature, and then someone on the
network sends a stream of multicast traffic what happens to the CPU load
for the host system?

Also how many addresses beyond 30 is it you require?  An alternative to
adding multicast promiscuous might be to consider extending the mailbox
API to support sending more than 30 addresses via something such as a
multi-part multicast configuration message.  The fm10k driver already
has logic similar to this as it adds addresses 1 at a time though a
separate Switch interface API between the PF and the Switch.  You might
be able to reuse some of that code to reach beyond the 30 address limit.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/