Message-ID: <55127F8D.3070309@cloudius-systems.com>
Date: Wed, 25 Mar 2015 11:27:41 +0200
From: Vlad Zolotarov <vladz@...udius-systems.com>
To: "Tantilov, Emil S" <emil.s.tantilov@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: "Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
"avi@...udius-systems.com" <avi@...udius-systems.com>,
"gleb@...udius-systems.com" <gleb@...udius-systems.com>,
"Skidmore, Donald C" <donald.c.skidmore@...el.com>
Subject: Re: [PATCH net-next v6 4/7] ixgbevf: Add a RETA query code
On 03/24/15 23:04, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: Vlad Zolotarov [mailto:vladz@...udius-systems.com]
>> Sent: Tuesday, March 24, 2015 12:06 PM
>> Subject: Re: [PATCH net-next v6 4/7] ixgbevf: Add a RETA query code
>>> I'm not sure where you see this, on my setup ixgbevf_get_queues() gets 4 in msg[IXGBE_VF_R/TX_QUEUES] which is used to set hw->mac.max_t/rx_queues.
>>
>> Right. I misread the __ALIGN_MASK() macro.
>> But then I can't see where max_rx_queues is used.
> It is used when stopping the Tx/Rx queues in ixgbevf_stop_hw_vf().
Of course; I meant it's not used during the calculation of the
adapter->num_rx_queues value. Let's move this discussion to the appropriate patch thread.
>
>> ixgbevf_set_num_queues() sets adapter->num_rx_queues ignoring the above
>> value if num_tcs is less than or equal to 1:
>>
>>         /* fetch queue configuration from the PF */
>>         err = ixgbevf_get_queues(hw, &num_tcs, &def_q);
>>
>>         spin_unlock_bh(&adapter->mbx_lock);
>>
>>         if (err)
>>                 return;
>>
>>         /* we need as many queues as traffic classes */
>>         if (num_tcs > 1) {
>>                 adapter->num_rx_queues = num_tcs;
>>         } else {
>>                 u16 rss = min_t(u16, num_online_cpus(), IXGBEVF_MAX_RSS_QUEUES);
>>
>>                 switch (hw->api_version) {
>>                 case ixgbe_mbox_api_11:
>>                 case ixgbe_mbox_api_12:
>>                         adapter->num_rx_queues = rss;
>>                         adapter->num_tx_queues = rss;
>>                 default:
>>                         break;
>>                 }
>>         }
>>
>> This means that if the PF returned 1 in IXGBE_VF_RX_QUEUES and you have
>> more than 1 CPU, you will still go and configure 2 Rx queues for the VF.
>> That is, unless I miss something again... ;)
> From what I can see vmdq->mask can only be set for 4 or 8 queues, so the PF will not return 1, unless you meant something else.
This is only true if the PF is driven by the Linux ixgbe PF driver as it is
in the upstream tree right now. AFAIK the VF driver should be completely
decoupled from the PF driver, and all configuration decisions should be made
based on the VF-PF communication via the VF-PF channel (mailbox).
Please see my comments on the thread of your patches that added
these lines.
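For illustration only, here is a minimal sketch (not part of my patches) of
how the else branch above could also honor the queue count the PF reported
over the mailbox, so the decision would not depend on what the Linux ixgbe
PF happens to do today:

        u16 rss = min_t(u16, num_online_cpus(), IXGBEVF_MAX_RSS_QUEUES);

        /* sketch: also clamp to the limit the PF reported in
         * IXGBE_VF_RX_QUEUES via ixgbevf_get_queues()
         */
        rss = min_t(u16, rss, hw->mac.max_rx_queues);

        adapter->num_rx_queues = rss;
        adapter->num_tx_queues = rss;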
>
> The comment about the 1 queue is outdated though - I think it's a leftover from the time the VF only allowed a single queue.
Looks like it. ;)
>
>>> BTW - there are other issues with your patches. The indirection table seems to come out as all 0s and the VF driver reports link down/up when querying it.
>> Worked just fine for me on x540.
>> What is your setup? How did you check it? Did you remember to patch the "ip"
>> tool and enable the querying?
> I have x540 as well and used the modified ip, otherwise the operation wouldn't be allowed. I will do some more debugging when I get a chance and will get back to you.
Please make sure you use the v7 patches.
One problem that may still remain is that I should probably have protected
the mbox access with the adapter->mbx_lock spinlock, similarly to
ixgbevf_set_num_queues(), since I don't think ethtool ensures the atomicity
of its requests in the context of a specific device, and thus the mbox could
be trashed by ethtool operations running in parallel on different CPUs.
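Just to illustrate what I mean (a sketch only; the query helper name below is
a placeholder, not necessarily what the v7 patches use), the ethtool query
path should probably look something like this:

        static int ixgbevf_query_reta(struct ixgbevf_adapter *adapter, u32 *reta)
        {
                struct ixgbe_hw *hw = &adapter->hw;
                int err;

                /* serialize the mbox transaction against other slow path
                 * users, e.g. ixgbevf_set_num_queues(), since ethtool does
                 * not guarantee atomicity of requests to the same device
                 */
                spin_lock_bh(&adapter->mbx_lock);
                err = ixgbevf_get_reta(hw, reta); /* VF-PF mbox transaction */
                spin_unlock_bh(&adapter->mbx_lock);

                return err;
        }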
>
>>> Where is this information useful anyway - what is the use case? There is no description in your patches for why all this is needed.
>> I'm not sure it's required to explain why I would want to add standard
>> ethtool functionality to a driver.
>> However, since you've asked - we are developing a software infrastructure
>> that needs the ability to query the VF's indirection table and RSS hash
>> key from inside the Guest OS (particularly on AWS) in order to be able
>> to open a socket in such a way that its traffic arrives on a specific CPU.
> Understanding the use case can help with the implementation. Because you need to get the RSS info from the PF for MACs < x550, this opens up the possibility of abusing (even if inadvertently) the mailbox if someone decides to query that info in a loop - like we have seen happen with net-snmp.
>
> Have you tested what happens if you run:
>
> while true
> do
>     ethtool --show-rxfh-indir ethX
> done
>
> in the background while passing traffic through the VF?
I understand your concerns, but let's start with clarifying a few things.
First, a VF driver is by definition not trusted. If it (or its user)
decides to do anything malicious (like you proposed above) that would
eventually hurt (only this) VF's performance - nobody should care.
The right question here would rather be: "How may the above use case
hurt the corresponding PF's or other VFs' performance?" And since a
mailbox operation involves quite a few MMIO writes and reads, this may
slow the PF quite a bit, and that may be a problem that should be taken
care of. However, it wasn't my patch series that introduced it. The
same problem would arise if a Guest changed the VF's MAC address in a
tight loop like the one above. In other words, any VF slow-path operation
that eventually causes a VF-PF channel transaction may be used to mount
an attack on the PF.
Naturally, this problem cannot be resolved at the VF level but only at the
PF level, since the VF is not a trusted component. One option could be to
allow only a specific number of mbox operations from a specific VF in a
specific time period: e.g. at most one operation every jiffy.
I don't see any code handling protection of this sort in the PF driver
at the moment. However, I may be missing some HW configuration that limits
the slow-path interrupt rate and thus limits the rate of mbox requests...
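Something along these lines on the PF side is what I have in mind (purely a
sketch; the structure and field names are hypothetical, not existing ixgbe
code):

        #include <linux/jiffies.h>
        #include <linux/types.h>

        /* hypothetical per-VF state the PF driver could keep */
        struct vf_mbx_limit {
                unsigned long last_mbx_jiffies;
        };

        /* allow at most one mbox request per VF per jiffy */
        static bool vf_mbx_allowed(struct vf_mbx_limit *vf)
        {
                if (time_before(jiffies, vf->last_mbx_jiffies + 1))
                        return false;   /* too soon - NACK or drop the request */

                vf->last_mbx_jiffies = jiffies;
                return true;
        }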
>
> Perhaps storing the RSS key and the table is a better option than having to invoke the mailbox on every read.
I don't think this could work, if I understand your proposal correctly.
The only way of caching the result that would decrease the number of mbox
transactions would be to cache it in the VF. But how could I invalidate
this cache if the table content has been changed by the PF? I think the
main source of confusion here is that you assume the PF driver is the
Linux ixgbe driver, which doesn't support changing the indirection table
at the moment. As I have explained above, this should not be assumed.
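To make the problem concrete, here is a hypothetical VF-side cache (none of
this is real ixgbevf code and the query helper name is a placeholder):

        static u32 reta_cache[64];      /* size for illustration only */
        static bool reta_cache_valid;

        static int vf_query_reta_cached(struct ixgbe_hw *hw, u32 *reta)
        {
                int err = 0;

                if (!reta_cache_valid) {
                        /* a single mbox transaction fills the cache */
                        err = ixgbevf_get_reta(hw, reta_cache);
                        if (!err)
                                reta_cache_valid = true;
                }
                if (!err)
                        memcpy(reta, reta_cache, sizeof(reta_cache));

                return err;
        }

There is simply no VF-PF message today that would let the PF tell the VF
"my RETA has changed", so nothing ever clears reta_cache_valid and the
cached table can silently go stale.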
thanks,
vlad
>
> Thanks,
> Emil
>
>